I discovered a very strange problem with my AMD Framework Laptop 13 after recently updating to kernel 6.10. If I visit the Framework website in Firefox (specifically this page; I can also reproduce it with a stored offline copy, in case the online version changes), the whole laptop freezes. The screen turns black and comes back a second later. The mouse still kinda moves, but everything else is frozen and the screen sometimes flickers. Most of the time I’m still able to reboot with a keyboard shortcut or over ssh.

The problem is reproducible: as soon as I load this page in Firefox, everything freezes basically instantly. Other websites work fine (or at least I haven’t found another one yet that causes problems). The problematic page works fine in Chrome, but crashes in both Firefox Nightly (version 131) and Firefox ESR (version 115). If I boot kernel 6.9 again, everything works fine, even in Firefox.
I’m using Gentoo Linux with Wayland and Sway.
Tested broken kernel versions: 6.10.2 and 6.10.3
Tested working kernel versions: 6.9.10 and 6.9.12
dmesg output of one of the crashes:
[ 78.882373] gmc_v11_0_process_interrupt: 35 callbacks suppressed
[ 78.882380] amdgpu 0000:c1:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32782)
[ 78.882388] amdgpu 0000:c1:00.0: amdgpu: in process RDD Process pid 3923 thread browser {b:cs0 pid 3954)
[ 78.882391] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x000080011ee4e000 from client 18
[ 78.882395] amdgpu 0000:c1:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
[ 78.882397] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x1d)
[ 78.882400] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x1
[ 78.882402] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 78.882404] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x1
[ 78.882406] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 78.882407] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 78.882410] amdgpu 0000:c1:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32782)
[ 78.882413] amdgpu 0000:c1:00.0: amdgpu: in process RDD Process pid 3923 thread browser {b:cs0 pid 3954)
[ 78.882415] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x000080011ed5a000 from client 18
[ 78.882417] amdgpu 0000:c1:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 78.882419] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: VMC (0x0)
[ 78.882421] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x0
[ 78.882422] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 78.882424] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 78.882425] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 78.882427] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 78.882684] amdgpu 0000:c1:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32782)
[ 78.882688] amdgpu 0000:c1:00.0: amdgpu: in process RDD Process pid 3923 thread browser {b:cs0 pid 3954)
[ 78.882690] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x000080011ed5a000 from client 18
[ 78.882693] amdgpu 0000:c1:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
[ 78.882694] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x1d)
[ 78.882696] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x1
[ 78.882698] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 78.882699] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x1
[ 78.882701] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 78.882703] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 78.882706] amdgpu 0000:c1:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32782)
[ 78.882708] amdgpu 0000:c1:00.0: amdgpu: in process RDD Process pid 3923 thread browser {b:cs0 pid 3954)
[ 78.882711] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x000080011ee4e000 from client 18
[ 78.882713] amdgpu 0000:c1:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 78.882715] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: VMC (0x0)
[ 78.882717] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x0
[ 78.882720] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 78.882722] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 78.882724] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 78.882725] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 88.947442] [drm:amdgpu_job_timedout] *ERROR* ring vcn_unified_0 timeout, signaled seq=549, emitted seq=552
[ 88.947467] [drm:amdgpu_job_timedout] *ERROR* Process information: process RDD Process pid 3923 thread browser {b:cs0 pid 3954
[ 88.947478] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 89.215806] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
[ 89.429945] [drm] Register(0) [regUVD_RB_RPTR] failed to reach value 0x000000c0 != 0x00000040n
[ 89.645651] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
[ 89.652502] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 89.690199] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 89.690938] [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
[ 89.691120] [drm] VRAM is lost due to GPU reset!
[ 89.691123] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 89.693212] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 89.694596] [drm] DMUB hardware initialized: version=0x08003D00
[ 90.053993] [drm] kiq ring mec 3 pipe 1 q 0
[ 90.310129] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
[ 90.310460] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init] JPEG decode initialized successfully.
[ 90.310758] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 90.310761] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 90.310764] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 90.310767] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 90.310769] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 90.310771] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 90.310773] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 90.310775] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 90.310778] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 90.310780] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 90.310782] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 90.310784] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 90.310786] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 90.313402] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 90.313406] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 90.313428] amdgpu 0000:c1:00.0: amdgpu: GPU reset(1) succeeded!
[ 90.919743] browser {b:cs0[3954]: segfault at 0 ip 000056141f3c730d sp 00007fefaa6d6910 error 6 in firefox-bin[df30d,56141f30a000+107000] likely on CPU 5 (core 2, socket 0)
[ 90.919765] Code: e5 41 56 53 48 89 fb 4c 8b 35 9f b5 04 00 49 8b 36 e8 77 8e 04 00 49 8b 36 bf 0a 00 00 00 e8 ba 8f 04 00 48 89 1d 9b eb 04 00 <c7> 04 25 00 00 00 00 23 00 00 00 e8 03 00 00 00 cc cc cc 55 48 89
[ 91.561359] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000000n
[ 91.799086] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000000n
[ 104.812154] [drm:amdgpu_job_timedout] *ERROR* ring vcn_unified_0 timeout, signaled seq=558, emitted seq=560
[ 104.812179] [drm:amdgpu_job_timedout] *ERROR* Process information: process RDD Process pid 4340 thread browser {2:cs0 pid 4354
[ 104.812192] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 105.071661] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
[ 105.287239] [drm] Register(0) [regUVD_RB_RPTR] failed to reach value 0x000000c0 != 0x00000040n
[ 105.502218] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
[ 105.508940] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 105.547009] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 105.547735] [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
[ 105.547825] [drm] VRAM is lost due to GPU reset!
[ 105.547830] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 105.550747] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 105.552113] [drm] DMUB hardware initialized: version=0x08003D00
[ 105.918756] [drm] kiq ring mec 3 pipe 1 q 0
[ 106.162726] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
[ 106.162819] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init] JPEG decode initialized successfully.
[ 106.163066] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 106.163069] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 106.163072] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 106.163074] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 106.163076] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 106.163078] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 106.163079] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 106.163081] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 106.163083] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 106.163085] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 106.163087] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 106.163091] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 106.163094] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 106.164705] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 106.164707] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 106.164723] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
[ 106.220971] [drm:amdgpu_cs_ioctl] *ERROR* Failed to initialize parser -125!
[ 106.602075] browser {2:cs0[4354]: segfault at 0 ip 000056141f3c730d sp 00007fefaa75c940 error 6 in firefox-bin[df30d,56141f30a000+107000] likely on CPU 6 (core 3, socket 0)
[ 106.602096] Code: e5 41 56 53 48 89 fb 4c 8b 35 9f b5 04 00 49 8b 36 e8 77 8e 04 00 49 8b 36 bf 0a 00 00 00 e8 ba 8f 04 00 48 89 1d 9b eb 04 00 <c7> 04 25 00 00 00 00 23 00 00 00 e8 03 00 00 00 cc cc cc 55 48 89
[ 107.430840] [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000000n
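To make sense of the MMVM_L2_PROTECTION_FAULT_STATUS values, here is a rough Python sketch that splits the raw register value into the fields the driver prints underneath it. The bit positions in `FIELDS` are my assumption, not something stated in the log; I only checked that they reproduce the decoded values shown above for 0x00103A11 (MORE_FAULTS 0x1, PERMISSION_FAULTS 0x1, client ID 0x1d, RW 0x0). The function name `decode_fault_status` is made up for this example.

```python
# Assumed bit layout of MMVM_L2_PROTECTION_FAULT_STATUS as (shift, width).
# This is an assumption; it is only checked against the values in the log above.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed by the driver as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status: int) -> dict[str, int]:
    """Extract each bit field from the raw status register value."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# Raw value taken from the first page fault in the dmesg output.
decoded = decode_fault_status(0x00103A11)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

If the assumed layout is right, the faults that report 0x00000000 carry no information of their own, and the interesting ones are the permission faults from client 0x1d.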
Once it also caused a kernel panic and I wasn’t able to shut down cleanly anymore (Framework kernel 6.10 firefox kernel panic · GitHub), but most of the crashes look similar to the one above.
Sometimes Firefox also creates a crash report: https://crash-stats.mozilla.org/report/index/e001cfa1-6cd1-464d-816f-607a00240808
But I’m not sure whether that shows the real cause, or whether Firefox is just crashing because the whole GPU crashed.
I have no idea where to start debugging this. Is it a kernel problem? It started when I upgraded the kernel to 6.10. Or is it a Firefox problem? It works in Chrome, but Firefox also works if I downgrade the kernel to 6.9 again, so it doesn’t really look like a Firefox problem. Since the problem is reproducible, I can test different things or provide more logs.
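In case it helps with comparing logs from different kernels, this is a minimal Python sketch that counts the interesting event types in a dmesg dump. The regex patterns, the `summarize` helper and the `SAMPLE` excerpt are my own choices for illustration; the sample lines are copied from the output above.

```python
import re
from collections import Counter

# Event patterns picked by hand from the dmesg output above (illustrative only).
PATTERNS = {
    "page_fault": re.compile(r"\[mmhub\] page fault"),
    "ring_timeout": re.compile(r"ring \S+ timeout"),
    "gpu_reset": re.compile(r"GPU reset begin!"),
    "segfault": re.compile(r"segfault at"),
}


def summarize(log_text: str) -> Counter:
    """Count how many lines match each event pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the log above.
SAMPLE = """\
[ 78.882380] amdgpu 0000:c1:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32782)
[ 88.947442] [drm:amdgpu_job_timedout] *ERROR* ring vcn_unified_0 timeout, signaled seq=549, emitted seq=552
[ 88.947478] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 90.313428] amdgpu 0000:c1:00.0: amdgpu: GPU reset(1) succeeded!
"""

summary = summarize(SAMPLE)
for name, count in sorted(summary.items()):
    print(f"{name}: {count}")
```

Running the same function over a full dump saved from 6.9 and one from 6.10 should show quickly whether the page faults and vcn_unified_0 timeouts only appear on the newer kernel.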
Also, I find it kinda ironic that it’s the Framework website itself that crashes my Framework laptop.