Hi,
just wanted to know if anyone else is seeing similar amdgpu crashes on Linux.
I’m using Arch, up to date as of 29.04.2025, kernel 6.14.4-arch1-1 on FW 13 with HX 370.
Usually, while using Firefox (especially when starting videos) or the Kitty terminal, the amdgpu module crashes at a random moment. Sometimes the affected process continues to work after amdgpu resumes; sometimes the process crashes as well.
Typical crash log:
amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 timeout, signaled seq=77533, emitted seq=77535
amdgpu 0000:c1:00.0: amdgpu: Process information: process RDD Process pid 61318 thread firefox:cs0 pid 62586
amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[drm] PCIE GART of 512M enabled (table at 0x0000008001700000).
amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[drm] DMUB hardware initialized: version=0x09001B00
amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec_0 uses VM inv eng 1 on hub 8
amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
amdgpu 0000:c1:00.0: amdgpu: ring vpe uses VM inv eng 4 on hub 8
amdgpu 0000:c1:00.0: amdgpu: GPU reset(3) succeeded!
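In case it helps anyone compare logs, here is a rough Python sketch that pulls the ring name, the two sequence numbers, and the reported process out of lines like the ones above. The function name `summarize_timeouts` and the regexes are my own guesses based only on the message format in this log, not anything official from the driver, so the wording may differ on other kernels.

```python
import re

# Patterns inferred from the log lines in this post (assumption: the
# message format is the same on your kernel).
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_INFO = re.compile(
    r"Process information: process (?P<process>.+?) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)"
)


def summarize_timeouts(log_text):
    """Return one dict per ring timeout, with the process if it was logged."""
    events = []
    for line in log_text.splitlines():
        m = RING_TIMEOUT.search(line)
        if m:
            events.append({
                "ring": m["ring"],
                "signaled": int(m["signaled"]),
                "emitted": int(m["emitted"]),
                "process": None,
                "thread": None,
            })
            continue
        # Attach the process line to the most recent timeout, if any.
        m = PROCESS_INFO.search(line)
        if m and events and events[-1]["process"] is None:
            events[-1]["process"] = m["process"]
            events[-1]["thread"] = m["thread"]
    return events
```

You could feed it the text of a saved kernel log and check which ring shows up most often; in my log it is vcn_unified_0 with Firefox's RDD Process.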
Also, during load, the amdgpu module says Optional firmware ... was not found:
amdgpu: ATOM BIOS: 113-STRIXEMU-001
amdgpu 0000:c1:00.0: amdgpu: VPE: collaborate mode false
amdgpu 0000:c1:00.0: amdgpu: [drm] Optional firmware "amdgpu/isp_4_1_0.bin" was not found
amdgpu 0000:c1:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
amdgpu 0000:c1:00.0: amdgpu: VRAM: 512M 0x0000008000000000 - 0x000000801FFFFFFF (512M used)
amdgpu 0000:c1:00.0: amdgpu: GART: 512M 0x00007FFF00000000 - 0x00007FFF1FFFFFFF
[drm] amdgpu: 512M of VRAM memory ready
[drm] amdgpu: 31787M of GTT memory ready.
Probably an unrelated problem, but I’m also getting USB-C errors from time to time:
I confirm I have both problems as stated in my posts. For the GPU issue we need to wait for a new kernel and/or firmware. For the PD issue we probably need a new BIOS/firmware update from Framework. The hardware is still new, but the software will catch up pretty soon.
I’m using an extra monitor connected to an HDMI expansion card.
amdgpu crashes, then resumes. This causes the extra monitor to go black and the system to become unresponsive, while the audio keeps playing.
My crash log:
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 timeout, signaled seq=206212>
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: Process information: process RDD Process pid 34>
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
may 06 17:00:48 cheshire kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008001700000).
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
may 06 17:00:48 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
may 06 17:00:48 cheshire kernel: [drm] DMUB hardware initialized: version=0x0000D500
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec_0 uses VM inv eng 1 on hub 8
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: ring vpe uses VM inv eng 4 on hub 8
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
may 06 17:00:49 cheshire kernel: amdgpu 0000:c1:00.0: amdgpu: GPU reset(1) succeeded!
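For comparing reports like the two in this thread, a small Python sketch along these lines might be useful. It collects the reset mode and the number printed in the "GPU reset(N) succeeded!" lines, and counts lines that end in ">", which usually means the pager cut them off (as with the first two lines of my log). The function name `reset_summary` is made up for illustration, and I am assuming the number in parentheses is a running reset counter.

```python
import re

# Patterns inferred from the logs in this thread (assumption: same wording
# on other kernel versions).
RESET_MODE = re.compile(r"amdgpu: (MODE\d+) reset")
RESET_DONE = re.compile(r"GPU reset\((\d+)\) succeeded!")


def reset_summary(log_text):
    """Summarize reset modes, reset numbers, and pager-truncated lines."""
    truncated = [
        line for line in log_text.splitlines()
        if line.rstrip().endswith(">")
    ]
    return {
        "modes": RESET_MODE.findall(log_text),
        "counts": [int(n) for n in RESET_DONE.findall(log_text)],
        "truncated_lines": len(truncated),
    }
```

If truncated_lines is not zero, the sequence numbers and process name are probably incomplete, so it is worth grabbing the log again with journalctl --no-pager before posting.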