GPU drops to 1-4 FPS after playing games a short while

Which Linux distro are you using?
Fedora 40

(If rolling release, last date updated?)

Which kernel are you using?
Linux 6.10.6-200.fc40.x86_64 x86_64

Which BIOS version are you using?
03.03

Which Framework Laptop 16 model are you using? (AMD Ryzen™ 7040 Series)
AMD Ryzen™ 7040 Series

I have an annoying issue: after playing games for a while (most recently IXION for ~30 min), the FPS suddenly drops from around 70 to 1-4, and the game audio distorts as well.
Quitting the game does not help; the desktop stays equally laggy and unusable.
The only thing that fixes it is restarting the laptop.

I ran journalctl when it started, and it printed this line 10 times over 2 seconds:

Sep 03 16:28:46 redacted-fw-16 kernel: amdgpu 0000:c5:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data

After that, it settles into this pattern, which gets printed ~10 times per second until I restart the computer:

Sep 03 16:28:49 redacted-fw-16 kernel: amdgpu 0000:c5:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Sep 03 16:28:49 redacted-fw-16 kernel: amdgpu 0000:c5:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
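For anyone who wants to check whether they are hitting the same pattern, here is a minimal sketch that counts the two error messages above in kernel log text. The function name count_dmub_errors is made up for illustration, and the match strings are simply copied from the log lines in this post, so other kernel versions may word them differently.

```shell
#!/bin/sh
# Count DMUB/DMCUB error lines in kernel log text read from stdin.
# The patterns are taken from the log lines quoted in this post.
count_dmub_errors() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the function's status clean.
    grep -c -E 'DMCUB error|Error queueing DMUB command' || true
}

# Usage (kernel messages from the current boot):
#   journalctl -k -b --no-pager | count_dmub_errors
```

A count that keeps climbing between two runs would suggest the driver is still stuck in the loop described above.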

Any ideas, anyone?

Also, where might the diagnostic data it mentions be found?

I get this too, even though I'm on the latest BIOS (3.05) and the latest kernel, 6.10.9-200.fc40.x86_64. So far I haven’t found a solution.

Reported it here.

https://bugzilla.redhat.com/show_bug.cgi?id=2312366

I recently found this other relevant thread, which very much looks like the same issue.

BIOS 3.05 for the Framework 16? Sorry to hijack this thread, but where did you get that?

Yup, that part looks like my issue. It seems the DMCUB issue affects dGPU users too, and on an officially supported distro!

I haven’t encountered that issue in a while, though I’m on openSUSE Tumbleweed, so maybe a firmware update finally fixed it? The most stressful things I subject my laptop to are software-rendering DOOM and 7zip benchmarks, although, if my body doesn’t fail me, editing and rendering 1080p video will soon be added to that!

I have the same issue on NixOS. Running sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover forces a GPU reset, which resolves the issue without a reboot (but crashes some applications).

EDIT: Looks like someone in the other thread independently came up with this same workaround.
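The dri/1 index in that path may differ between machines, so here is a rough sketch for locating the right debugfs directory before triggering the reset. It assumes debugfs is mounted at /sys/kernel/debug and that each directory's name file starts with the driver name; the function name find_amdgpu_debugfs_dir is hypothetical. On a machine with both an iGPU and a dGPU it may list more than one directory, so pick the one whose name file shows the PCI address from your log (0000:c5:00.0 in the original post).

```shell
#!/bin/sh
# List DRM debugfs directories that belong to amdgpu and expose the
# amdgpu_gpu_recover file. The base path can be overridden for testing.
find_amdgpu_debugfs_dir() {
    base=${1:-/sys/kernel/debug/dri}
    for d in "$base"/*; do
        # Assumption: the "name" file begins with the driver name.
        if [ -f "$d/amdgpu_gpu_recover" ] && grep -q '^amdgpu' "$d/name" 2>/dev/null; then
            printf '%s\n' "$d"
        fi
    done
}

# Usage (as root, since debugfs is normally root-only):
#   find_amdgpu_debugfs_dir
#   cat /sys/kernel/debug/dri/<N>/name                 # check the PCI address
#   cat /sys/kernel/debug/dri/<N>/amdgpu_gpu_recover   # triggers the reset
```

As noted above, the reset can crash running applications, so save your work first if the desktop is still responsive enough.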