Framework 13 is crashing or freezing up with newer amdgpu on Debian 12

Hello, I’m using the Framework 13 AMD with Debian 12 (Bookworm). I have followed instructions to update the AMD drivers manually.

I’ve recently started getting weird crashes, including today. Can anyone help me figure out what is causing this?

Dec 16 03:35:54 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=97165, emitted seq=97167
Dec 16 03:35:54 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: Process information: process firefox-esr pid 3408 thread firefox-es:cs0 pid 3412
Dec 16 03:35:54 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
Dec 16 03:35:57 lilfoot kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 03:35:57 lilfoot kernel: [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
Dec 16 03:35:57 lilfoot kernel: [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
Dec 16 03:35:57 lilfoot kernel: [drm] VRAM is lost due to GPU reset!
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
Dec 16 03:35:57 lilfoot kernel: [drm] DMUB hardware initialized: version=0x08003D00
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
Dec 16 03:35:57 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
Dec 16 03:35:57 lilfoot kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Dec 16 03:35:57 lilfoot org.a11y.Bus[3556]: X connection to :0 broken (explicit kill or server shutdown).
Dec 16 03:35:57 lilfoot systemd[1]: run-user-1000-doc.mount: Deactivated successfully.

I will add more details if requested. I don’t know how often it is happening, but it has happened at least twice so far.
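
To get an actual count of how often this happens, one option is to scan the kernel journal for the `ring ... timeout` lines. The following is only a rough sketch: the function name `find_ring_timeouts` and both regular expressions are assumptions modelled on the log lines quoted above, and other kernel versions may word these messages differently.

```python
import re

# Patterns modelled on the log lines quoted above (an assumption; the
# wording may differ between kernel versions).
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"Process information: process (?P<name>\S+) pid (?P<pid>\d+)"
)


def find_ring_timeouts(log_text):
    """Return one dict per ring timeout, with the process name if logged."""
    events = []
    for line in log_text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            events.append({
                "ring": timeout["ring"],
                "signaled": int(timeout["signaled"]),
                "emitted": int(timeout["emitted"]),
                "process": None,
            })
            continue
        process = PROCESS_RE.search(line)
        # Attach the process line to the most recent timeout that lacks one.
        if process and events and events[-1]["process"] is None:
            events[-1]["process"] = process["name"]
    return events


# Two lines copied from the log above, used as sample input.
SAMPLE = """\
Dec 16 03:35:54 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=97165, emitted seq=97167
Dec 16 03:35:54 lilfoot kernel: amdgpu 0000:c1:00.0: amdgpu: Process information: process firefox-esr pid 3408 thread firefox-es:cs0 pid 3412
"""

if __name__ == "__main__":
    for event in find_ring_timeouts(SAMPLE):
        print(event)
```

Saving the output of `journalctl -k` (kernel messages for the current boot) to a file and passing its contents to the function would list each timeout found there.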

Information about which kernel version you are using, and whether you have tried other versions, would definitely be helpful :wink:
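
One way to collect those details in a single step is sketched below. It is only an illustration: the helper names `package_version` and `collect_report` are made up for this example, and it assumes a Debian-based system where `dpkg-query` is available.

```python
import platform
import subprocess


def package_version(package):
    """Return the installed version of a Debian package, or None if unknown."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Version}", package],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # dpkg-query is missing, so this is probably not a Debian-based system.
        return None
    version = result.stdout.strip()
    return version if result.returncode == 0 and version else None


def collect_report():
    """Gather the running kernel release and the AMD firmware package version."""
    return {
        "kernel": platform.release(),
        "firmware-amd-graphics": package_version("firmware-amd-graphics"),
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value or 'not installed or unknown'}")
```

Note that firmware files copied into `/lib/firmware` by hand are not reflected in the package version, so a manual update would need to be mentioned separately.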

I’m going to guess that you’ve got kernel 6.1, with linux-image-6.1.0-27-amd64 being the most recent. The firmware-amd-graphics package in Debian 12 Bookworm is version 20230210-5. I seem to recall there’s a thread saying that kernel 6.8 and more recent releases from the firmware blob store are needed for your Framework 13 AMD.
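
As a quick check against that recollection, a release string can be compared with 6.8 as in the sketch below. The 6.8 threshold is only what is recalled above from another thread, not a verified requirement, and the names `kernel_tuple` and `meets_recalled_minimum` are hypothetical helpers for this example.

```python
import re

# Assumption: 6.8 is the minimum recalled from another thread, not a verified figure.
RECALLED_MINIMUM = (6, 8)


def kernel_tuple(release):
    """Parse the leading 'major.minor' of a release such as '6.1.0-27-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_recalled_minimum(release, minimum=RECALLED_MINIMUM):
    """Return True if the release is at or above the recalled minimum version."""
    return kernel_tuple(release) >= minimum


if __name__ == "__main__":
    # Release string matching the linux-image package guessed above.
    print(meets_recalled_minimum("6.1.0-27-amd64"))
```

Comparing integer tuples rather than strings avoids mis-ordering versions such as 6.10 and 6.8.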

(My Framework 13 AMD is happy running Debian 13 Trixie. Please note that Trixie is pre-release and the KDE Plasma packages have not yet finished updating to Plasma 6.2, so there may be breakage.)

K3n.
