And now it just happened on 6.12.10 which is from mid-January. Specifically, after resuming from sleep. I also have two external monitors attached via TB4, so my setup pushes the limits of the chipset but that just means I trip on these things more often than people using only the laptop panel.
It clearly happens more often on 6.13.x versions, but I’m not sure how far back to go to find a kernel version that works reliably. I’m sure this wasn’t happening in the fall.
Reading this and several other threads about this, I believe the issue is with Mesa 25, not the kernel at all. I’m running Fedora 41, not Tumbleweed; however, after downgrading to Mesa 24 I haven’t seen issues. But I also haven’t stressed the system enough to call it fixed.
As best I know there is no simple reproduction reported to the Mesa project at this time.
Thanks, Christopher. I disabled the various updates-testing repos and then ran sudo dnf distro-sync --allowerasing to get back to Mesa 24. I’ll let you know how it goes.
I am yet another Tumbleweeder using Gnome+Wayland, also with an AMD Ryzen™ 5 7640U laptop. Thanks for the tip about kernel-longterm; I just installed it.
FYI, here are a couple of openSUSE bugzilla reports that I’ve been following as (perhaps) being related to our problem:
1238361 - Several second screen freezes after suspend/resume following an upgrade to 20250302 - My original openSUSE report. It was marked as a duplicate of 1234732. I’m going to add a comment to question that decision.
I notice that BIOS 3.07 has recently been released. Has anyone seen if it resolves any of this?
I installed 3.07 a few days ago and it didn’t change a thing. kernel-longterm is the only fix for me, at least it seems to be that way so far (zero freezes).
Kernel-longterm has also resolved the issue for me, at least for the past 24 hours. Also, the openSUSE bugzilla reports I mentioned above have been resolved without fixing our issue, so I’ve submitted a new one, 1239657 - Garbling screen artifacts after suspend/resume, specifically for what we’re seeing. Please add any additional information you may have. (It’ll show that more than one person is having the problem.)
I think I’ll hold off on 3.07 for now. No reason to roll too many dice at the same time.
I’m yet another FW13 AMD running Fedora 41.
After some testing, I’m sure the problem is caused by Mesa 25, and not necessarily correlated with Firefox.
I’ve tested upgrading and downgrading between Mesa 24.x and Mesa 25 a number of times; so far, every time I downgrade, all the artifacts appearing on screen disappear. I’ve used dnf versionlock to pin mesa-dri-drivers to 24.2.4-1.fc41 for now, and that seems to have fixed the problem.
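For anyone comparing notes, here is a minimal Python sketch that classifies an installed mesa-dri-drivers package string against what has been reported in this thread (24.x working, 25.x affected). It assumes the usual name-version-release.arch form printed by rpm -q; the function names and the status labels are made up for illustration, and the thresholds reflect only the reports here, not an authoritative list.

```python
import re

# Assumption: input looks like the output of `rpm -q mesa-dri-drivers`,
# e.g. "mesa-dri-drivers-24.2.4-1.fc41.x86_64".
NVR_RE = re.compile(r"^mesa-dri-drivers-(\d+)\.(\d+)\.(\d+)-")


def mesa_version(package_string):
    """Return (major, minor, patch) or None if the string does not match."""
    match = NVR_RE.match(package_string.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def thread_status(package_string):
    """Classify a package string using only what posters in this thread reported."""
    version = mesa_version(package_string)
    if version is None:
        return "unknown"
    if version[0] >= 25:
        return "reported affected"
    if version[0] == 24:
        return "reported working"
    return "unknown"


if __name__ == "__main__":
    for pkg in ("mesa-dri-drivers-24.2.4-1.fc41.x86_64",
                "mesa-dri-drivers-25.0.1-1.fc41.x86_64"):
        print(pkg, "->", thread_status(pkg))
```

The classification is deliberately coarse (major version only), since nobody here has narrowed the problem down to a specific 25.x point release.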
I’m getting graphics lock-ups that halt the whole system till it auto resets.
[11371.628554] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
[11371.630858] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
[11371.640927] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=2396193, emitted seq=2396195
[11371.640932] amdgpu 0000:c1:00.0: amdgpu: Process information: process Diablo IV.exe pid 25118 thread vkd3d_queue pid 25189
[11371.640934] amdgpu 0000:c1:00.0: amdgpu: Starting gfx_0.0.0 ring reset
[11373.644731] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=RESET
[11373.644738] [drm:amdgpu_mes_reset_legacy_queue [amdgpu]] *ERROR* failed to reset legacy queue
[11373.644838] amdgpu 0000:c1:00.0: amdgpu: Ring gfx_0.0.0 reset failure
[11373.644840] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[11375.703519] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[11375.703533] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[11375.974835] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[11375.976534] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[11376.009082] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[11376.009762] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[11376.009859] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[11376.011934] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[11376.019220] [drm] DMUB hardware initialized: version=0x08004800
[11376.344978] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[11376.344987] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[11376.344990] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[11376.344993] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[11376.344995] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[11376.344997] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[11376.344999] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[11376.345002] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[11376.345004] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[11376.345006] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[11376.345009] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[11376.345011] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[11376.345014] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[11376.347522] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
[11384.960509] usb 1-4: reset full-speed USB device number 3 using xhci_hcd
[11385.232465] usb 1-4: reset full-speed USB device number 3 using xhci_hcd
[11407.601853] amdgpu 0000:c1:00.0: amdgpu: VM memory stats for proc Diablo IV.exe(25189) task vkd3d_queue(25118) is non-zero when fini
[12706.265489] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[14498.283304] perf: interrupt took too long (3134 > 3128), lowering kernel.perf_event_max_sample_rate to 63750
[15648.482345] ucsi_acpi USBC000:00: unknown error 256
[15648.482352] ucsi_acpi USBC000:00: GET_CABLE_PROPERTY failed (-5)
[15780.063892] usb 1-4: reset full-speed USB device number 3 using xhci_hcd
[15780.351902] usb 1-4: reset full-speed USB device number 3 using xhci_hcd
[18752.454834] perf: interrupt took too long (3926 > 3917), lowering kernel.perf_event_max_sample_rate to 50750
[22363.849334] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
[22363.852200] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
[22363.862251] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=13023333, emitted seq=13023335
[22363.862256] amdgpu 0000:c1:00.0: amdgpu: Process information: process Diablo IV.exe pid 107761 thread vkd3d_queue pid 107873
[22363.862259] amdgpu 0000:c1:00.0: amdgpu: Starting gfx_0.0.0 ring reset
[22365.866087] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=RESET
[22365.866093] [drm:amdgpu_mes_reset_legacy_queue [amdgpu]] *ERROR* failed to reset legacy queue
[22365.866199] amdgpu 0000:c1:00.0: amdgpu: Ring gfx_0.0.0 reset failure
[22365.866202] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[22367.925667] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[22367.925683] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[22368.196607] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[22368.198294] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[22368.231906] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[22368.232606] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[22368.232716] [drm] VRAM is lost due to GPU reset!
[22368.232723] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[22368.234736] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[22368.242021] [drm] DMUB hardware initialized: version=0x08004800
[22368.567696] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[22368.567703] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[22368.567706] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[22368.567708] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[22368.567709] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[22368.567711] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[22368.567712] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[22368.567714] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[22368.567716] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[22368.567717] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[22368.567719] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[22368.567721] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[22368.567723] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[22368.569927] amdgpu 0000:c1:00.0: amdgpu: GPU reset(4) succeeded!
This has happened a number of times in the past while just browsing. I’ve had it pretty consistently happen while playing Diablo IV. It seems able to recover when playing a game, but is never recoverable when browsing with Firefox.
Not running Flatpaks, as mentioned early on in the thread.
OS: Debian Sid, XFCE (X11)
Running Linux 6.14.0-rc7
With Framework Laptop 13 - AMD Ryzen 7 7840U
Full kernel params: amdgpu.sg_display=0 amdgpu.dcdebugmask=0x10 usbcore.autosuspend=-1 loglevel=4 i8042.unlock=1
I’m running Fedora Silverblue 41 on a Framework Laptop 13 - AMD Ryzen 7 7840U with the 2.8K display and was experiencing these graphical artifacts and GPU crashes. Most of the time I would just get kicked back to GDM when it crashed.
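Since several people are pasting long dmesg dumps, a small Python sketch like the following may help pull out just the ring timeouts and count the successful resets. The regular expressions are written against the message format shown in the dmesg excerpt above and may need adjusting for other kernel versions; the function and class names are hypothetical.

```python
import re
from typing import NamedTuple


class RingTimeout(NamedTuple):
    timestamp: float   # seconds since boot, from the dmesg prefix
    ring: str          # e.g. "gfx_0.0.0"
    signaled: int      # last fence sequence number that completed
    emitted: int       # last fence sequence number that was submitted


# Patterns modelled on the log lines quoted in this thread (an assumption,
# not a stable kernel interface).
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu \S+: amdgpu: ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
RESET_OK_RE = re.compile(r"amdgpu: GPU reset\(\d+\) succeeded!")


def find_ring_timeouts(log_text):
    """Return one RingTimeout per 'ring ... timeout' line in the log text."""
    return [
        RingTimeout(float(m["ts"]), m["ring"], int(m["signaled"]), int(m["emitted"]))
        for m in TIMEOUT_RE.finditer(log_text)
    ]


def count_successful_resets(log_text):
    """Count 'GPU reset(N) succeeded!' lines."""
    return len(RESET_OK_RE.findall(log_text))


# Four lines copied from the excerpt above.
SAMPLE = """\
[11371.640927] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=2396193, emitted seq=2396195
[11376.347522] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
[22363.862251] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=13023333, emitted seq=13023335
[22368.569927] amdgpu 0000:c1:00.0: amdgpu: GPU reset(4) succeeded!
"""

if __name__ == "__main__":
    for t in find_ring_timeouts(SAMPLE):
        print(f"{t.timestamp:.0f}s: ring {t.ring}, fence gap {t.emitted - t.signaled}")
    print("successful resets:", count_successful_resets(SAMPLE))
```

To use it on a real log, the text could be read from a saved dmesg file and passed to find_ring_timeouts in place of SAMPLE.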
I downgraded from Mesa 25.0.1 to Mesa 24.2.4 and that cleared up all the problems. I noticed Mesa 25.0.2 came out yesterday so I hope these issues are cleared up when it lands in Fedora.
I’ve been having issues with amdgpu acting up for the better part of the last year, since around kernel 6.9. The solution I’ve been successfully using since February has been putting amdgpu.dcdebugmask=0x12 in the kernel command line. Right now I’m running 6.13.7 with mesa 25 without any issues. Good to hear though that 6.14 might be coming with an actual solution.
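Because both 0x10 and 0x12 have been mentioned for amdgpu.dcdebugmask, here is a rough Python sketch that decodes the value into flag names. The bit names are taken from my recollection of enum DC_DEBUG_MASK in the kernel's drivers/gpu/drm/amd/include/amd_shared.h and cover only the low five bits, so treat the table as an assumption and check it against your kernel source. The helper names are hypothetical.

```python
# Assumed bit meanings (low five bits of enum DC_DEBUG_MASK); verify against
# amd_shared.h for the kernel you are actually running.
DC_DEBUG_BITS = {
    0x01: "DC_DISABLE_PIPE_SPLIT",
    0x02: "DC_DISABLE_STUTTER",
    0x04: "DC_DISABLE_DSC",
    0x08: "DC_DISABLE_CLOCK_GATING",
    0x10: "DC_DISABLE_PSR",
}
KNOWN_MASK = sum(DC_DEBUG_BITS)  # 0x1f, the bits this table knows about


def decode_dcdebugmask(value):
    """Return (list of flag names set, leftover bits not in the table)."""
    names = [name for bit, name in sorted(DC_DEBUG_BITS.items()) if value & bit]
    return names, value & ~KNOWN_MASK


def dcdebugmask_from_cmdline(cmdline):
    """Extract amdgpu.dcdebugmask from a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "amdgpu.dcdebugmask" and sep:
            return int(val, 0)  # base 0 accepts both "0x10" and "16"
    return None


if __name__ == "__main__":
    # Command line as posted earlier in this thread.
    cmdline = ("amdgpu.sg_display=0 amdgpu.dcdebugmask=0x10 "
               "usbcore.autosuspend=-1 loglevel=4 i8042.unlock=1")
    print(decode_dcdebugmask(dcdebugmask_from_cmdline(cmdline)))
    print(decode_dcdebugmask(0x12))
```

Under that assumed table, 0x10 disables PSR only, while 0x12 additionally disables stutter mode. On a live system the command line can be read from /proc/cmdline.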
I just got Linux 6.14 on Arch Linux
I’ll do some experiments
UPDATE: I rebooted with Mesa 25.0.2 and within 1-2 hours the problem mentioned in this thread appeared. So Mesa 25 is still a no-go even with Linux 6.14…
Specifically, this time it appeared when I opened a video with mpv. Everything froze except the cursor for 7 seconds, and sure enough, when I checked dmesg I saw amdgpu page faults in process mpv spammed.
My hunch is that this bug appears when playing video, which is something both Firefox and mpv end up doing.
The good news is that people seem to have located the commit in mesa that causes the issue
openSUSE seems to be adding a patch to its builds to revert it.
Is that the same issue? They don’t mention page faults there, only the artifacts
UPDATE 2: Seems like it isn’t necessarily video-specific, since in that thread one person got it on kwin_wayland. And yup, he reports the page faults like we see here, so it is the same issue. Arch could also adopt the revert patch tbh, but according to its philosophy it might wait to see it upstreamed first.
–
What remains to be seen is whether I’ll get artifacts and freezes from PSR-SU (I have removed dcdebugmask from the cmdline). I think this is a separate issue from the one mentioned in this thread, but still under the amdgpu umbrella.
UPDATE 3: 3 days in, it seems like dcdebugmask might not be needed anymore in 6.14. Will continue testing.
Great to hear about the Mesa progress. Regarding kernel 6.14, a patch has landed that temporarily disables PSR for eDP displays (essentially what the debug flag does), so no longer needing the kernel argument is sadly not indicative of a fix. Fortunately, this still means the bug is getting attention, so we might be getting an actual fix soon.