[SOLVED] Debian 12 on Laptop 13, Ryzen 7640U: GPU hangs in some games

I just got a Framework Laptop 13 with an AMD Ryzen 7640U and installed Debian 12 with KDE Plasma. I’ve noticed an intermittent issue when playing video games: after some time playing, the display locks up. After a while, if it recovers at all, I get kicked back to the login screen. This happens on both Wayland and X11.

When this happens seems to vary - sometimes I am able to play a game normally, sometimes it crashes right on start up, sometimes it crashes a minute into playing the game.

Here is the journalctl output from a boot in which this occurred. I’m noticing a lot of messages about IO page faults and a GPU reset.
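For readers who want to pull the same kind of messages out of their own logs, here is a minimal sketch. It assumes the journal is persistent, so the previous boot is reachable with `journalctl -k -b -1`. The function name `filter_gpu_faults` and the two search patterns are illustrative guesses based on the messages described above, not a complete list of what the kernel can log.

```shell
# Keep only kernel log lines that look like IOMMU page faults or GPU resets.
# The patterns are assumptions based on the messages described in this post.
filter_gpu_faults() {
    grep -i -E 'io_page_fault|gpu reset'
}

# Usage (not run here): kernel messages from the previous boot.
# journalctl -k -b -1 | filter_gpu_faults
```

If the crash happened several boots ago, `journalctl --list-boots` shows which offset to pass to `-b`.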

I am already on the latest BIOS, and I have tried upgrading the Linux kernel (using Debian Backports, to 6.6.13). The problem seemed to improve, but not entirely go away.

Is this a software or a hardware issue? Any ideas for fixing it?

Install the updated upstream GPU firmware if you haven’t already. There is an ancient snapshot in Debian.

Okay, I just tried two different ways of doing this. First, I installed the firmware-amd-graphics package from Debian trixie. Then, following the suggestions in “[RESPONDED] Brief report: Debian 12 bookworm working on Framework 13 AMD Ryzen 7040 Series - #12 by Mario_Limonciello”, I copied the amdgpu firmware from the upstream linux-firmware repo into /lib/firmware/amdgpu and ran update-initramfs.
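The second approach can be sketched roughly as follows. This is not necessarily the exact procedure from the linked thread: the helper name `copy_amdgpu_firmware` is hypothetical, and the sketch assumes a local checkout of the upstream linux-firmware repository that contains an `amdgpu` directory.

```shell
# Copy amdgpu firmware blobs from a linux-firmware checkout into a firmware directory.
# $1: path to the linux-firmware checkout, $2: destination (e.g. /lib/firmware/amdgpu)
copy_amdgpu_firmware() {
    src="$1/amdgpu"
    dest="$2"
    [ -d "$src" ] || { echo "no amdgpu directory in $1" >&2; return 1; }
    mkdir -p "$dest" && cp "$src"/* "$dest"/
}

# Usage (not run here; the last two steps need root):
# git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
# copy_amdgpu_firmware ./linux-firmware /lib/firmware/amdgpu
# update-initramfs -u -k all
```

Backing up the existing /lib/firmware/amdgpu first makes it easy to return to the packaged files. A later upgrade of the firmware package may also overwrite the copied blobs.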

In both cases, the GPU still hung just a minute or so into playing a game.

Looks like this bug may be related: Framework 13 AMD / Darktable / amdgpu crash (#3245) · Issues · drm / amd · GitLab

And here’s dmesg, with more detail than the journalctl I posted earlier: tchncs

This looks like it’s likely a mesa bug. Can you reproduce with updated mesa?

How should I install the updated version? The Debian package for mesa is split into multiple packages, and the latest version is likely to depend on newer libraries than are available in bookworm (in which case I risk creating a FrankenDebian.)

I did try to replicate this on a live USB of Fedora Games Edition and could not. I’m not sure how much that says, since this issue was already intermittent, and the games there are generally less graphically intensive than the ones causing crashes here, but that may be a sign that a newer mesa would fix this issue.

I guess apt pinning and pulling stuff from testing or unstable?

Alright, I pinned Debian trixie below bookworm and bookworm-backports, then installed all packages with “mesa” in the name from trixie, along with their dependencies. A bit worried this will cause issues down the road, but oh well.
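A pin like the one described might look like the sketch below. The post does not give the actual priorities, so the value 50 is a hypothetical choice: backports archives normally sit at priority 100, and anything lower should keep trixie below both bookworm and bookworm-backports. The helper name `make_pin`, the preferences file name, and the example package are likewise assumptions.

```shell
# Print an apt preferences stanza pinning a whole release to a given priority.
# $1: release codename (e.g. trixie), $2: Pin-Priority value
make_pin() {
    printf 'Package: *\nPin: release n=%s\nPin-Priority: %s\n' "$1" "$2"
}

# Usage (not run here; needs root and a trixie line in your apt sources):
# make_pin trixie 50 > /etc/apt/preferences.d/trixie
# apt update
# apt install -t trixie mesa-vulkan-drivers   # example package; repeat for each installed mesa package
```

`apt policy <package>` shows which release each candidate version comes from, a quick way to confirm the pin behaves as intended before installing anything.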

Rebooted, checked a couple places where the bug consistently shows up, and… no more crashes! Seems you were right. Though I’ll wait a day or so before marking this resolved because like I said, it’s intermittent and might pop up again.

Never mind, I got a graphics crash to happen in a game again. Seems this is not resolved yet.

This time around, kwin_wayland segfaulted: drkonqi backtrace

And dmesg looks a bit different. At the beginning of that log, there are some logs from another graphical bug I experienced, where after I changed the window decoration settings in plasma, the screen started flashing white.

Edit: After setting amdgpu.sg_display=0 in kernel parameters, as per [TRACKING] Graphical corruption in Fedora 39 (AMD 3.03 BIOS), which I think may also be related to my issue, I could not replicate either bug in the same place. Will update if it occurs again.
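For anyone applying the same parameter, the sketch below checks whether it actually reached the running kernel. It assumes GRUB as the bootloader (the Debian default); the helper name `has_kernel_param` is hypothetical.

```shell
# Succeed if the exact parameter (e.g. amdgpu.sg_display=0) appears on a kernel command line.
# $1: parameter, $2: command line text (defaults to the running kernel's /proc/cmdline)
has_kernel_param() {
    cmdline="${2:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

# Usage (not run here):
# 1. As root, add amdgpu.sg_display=0 to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub
# 2. As root, run update-grub, then reboot
# 3. has_kernel_param amdgpu.sg_display=0 && echo "parameter is active"
```

The parameter has to be written without spaces around the `=`, since the kernel command line is split on whitespace.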

Can’t seem to edit my original post to say “RESOLVED”, but it’s been a while since I set amdgpu.sg_display=0 and game optimized mode, and I have not experienced any GPU hangs since. There are still some mild graphical glitches, but nothing that gets in the way of regular usage.

After some further A/B testing, it seems that resolving this problem requires both a new version of Mesa and the kernel parameter amdgpu.sg_display=0.

Since the mixing of packages from Debian Bookworm and Trixie was causing certain issues with apt, just as I was worried about, I ended up upgrading wholesale to Trixie (testing). If I see any issues with Trixie between now and when it becomes stable, which I hope I won’t, I’ll move over to one of the officially recommended distros.