Seemingly random framerate drops WITHOUT `Error queueing DMUB command` in journalctl

Basics:

  • Which Linux distro are you using? Fedora Silverblue
  • Which release version? 40.20240920.0
  • Which kernel are you using? 6.10.10-200.fc40.x86_64
  • Which BIOS version are you using? IFGP6.03.03
  • Which Framework Laptop 16 model are you using? AMD Ryzen 7040 Series

Details:

So… I’ve been having this issue since I bought this Framework 16 where, seemingly entirely at random, it will drop down to 1-5 FPS on the desktop, making it basically impossible to do anything except opening the “Run a Command” dialog in GNOME and typing systemctl reboot. Sometimes this happens two to three times in one sitting, and I can’t figure out any concrete reason for it happening.

After perusing this forum for a bit, I’ve found cases of people having an issue with similar symptoms, but when I check for the “Error queueing DMUB command” message that seemingly everyone who encounters this issue has, I get nothing:

$ journalctl --full --system | grep 'Error queueing DMUB command'
$

Oddly, though, I do get the first error that this user mentioned in this thread:

$ journalctl --full --system --since='2000-01-01 00:00:00' | grep 'DMCUB error - collecting diagnostic data'
Sep 23 16:36:59 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:36:59 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:36:59 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:36:59 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:37:00 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:37:00 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:37:00 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Sep 23 16:37:00 theseus kernel: amdgpu 0000:c4:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
# [truncated for brevity]
$
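To compare the two messages side by side, the two grep runs above can be folded into one pass. This is only a sketch: `count_dmub_errors` is a hypothetical helper name, and it simply counts lines on stdin that contain each of the two message strings quoted in this thread.

```shell
# Hypothetical helper: reads log text on stdin and prints how many lines
# contain each of the two messages discussed in this thread.
count_dmub_errors() {
    awk '
        /Error queueing DMUB command/              { queue++ }
        /DMCUB error - collecting diagnostic data/ { diag++ }
        END {
            printf "Error queueing DMUB command: %d\n", queue
            printf "DMCUB error - collecting diagnostic data: %d\n", diag
        }
    '
}

# Example use, with the same journalctl options as above:
#   journalctl --full --system --since='2000-01-01 00:00:00' | count_dmub_errors
```

A zero on the first line with a nonzero count on the second matches what is described here: the diagnostic-data error shows up, but the queueing error does not.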

I’m really unsure as to what I should do. I’ve been putting up with this for months in hopes that it’d finally get patched in AMDGPU, but there seems to be no progress on that, at least for Fedora Silverblue.

I’m also having this boot time error, but it doesn’t appear to actually be affecting the machine while it’s running, as far as I can tell…

I’m going to append amdgpu.dcdebugmask=0x210 and amdgpu.gpu_recovery=1 to my kernel arguments with rpm-ostree kargs --append and then make a shell script called unfuck-gpu that runs sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover and pray that works in case I need it because I have an interview tomorrow… :melting_face:
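For anyone wanting to try the same workaround, here is a rough sketch of what that could look like. The function names are made up for illustration; the kernel arguments and the debugfs path are the ones from the post, and the DRI index `1` may well be different on another machine.

```shell
# Sketch only: function names are hypothetical. The kernel arguments and the
# debugfs path are the ones quoted in the post.

# Append both kernel arguments in one rpm-ostree transaction.
# They take effect on the next boot.
append_gpu_kargs() {
    rpm-ostree kargs \
        --append=amdgpu.dcdebugmask=0x210 \
        --append=amdgpu.gpu_recovery=1
}

# Succeed if the given argument appears on the kernel command line.
# The optional second parameter exists only so the check can be tried
# against a sample file instead of /proc/cmdline.
has_karg() {
    cmdline_file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$1"
}

# Trigger a manual GPU reset by reading the debugfs node. Needs root.
# DRI index 1 is an assumption carried over from the post.
gpu_recover() {
    node="${1:-/sys/kernel/debug/dri/1/amdgpu_gpu_recover}"
    if [ ! -r "$node" ]; then
        echo "cannot read $node (not root, or wrong DRI index?)" >&2
        return 1
    fi
    cat "$node"
}
```

After rebooting, `has_karg amdgpu.gpu_recovery=1 && echo applied` is a quick way to confirm the arguments actually made it onto the command line before relying on the recovery helper.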

welp, running my ingenious script crashes the entire desktop if I’m sharing my screen, so that’s just lovely

Do you have a dGPU module RX7700S or is this the integrated 780M inside the system? Please provide some system specs.

I have the dGPU module, yes, though which GPU is causing the issue isn’t incredibly clear.
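One way to narrow down which GPU is logging the errors is to pull the PCI address out of the amdgpu log lines and look it up. A minimal sketch, assuming the log lines have the same `amdgpu <domain>:<bus>:<slot>.<func>:` shape as the ones quoted earlier in the thread; `amdgpu_pci_addrs` is a hypothetical helper name.

```shell
# Hypothetical helper: reads log text on stdin and prints each distinct PCI
# address that appears right after the "amdgpu" driver prefix.
amdgpu_pci_addrs() {
    grep -oE 'amdgpu [0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | awk '{ print $2 }' \
        | sort -u
}

# Example use: resolve each address to a device name with lspci.
#   journalctl --full --system | grep 'DMCUB error' | amdgpu_pci_addrs \
#       | while read -r addr; do lspci -s "$addr"; done
```

Whatever device `lspci` names for that address is the one the driver was complaining about, which should say whether it is the integrated GPU or the expansion bay module.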

If possible try disabling the GPU module. Some people have reported that the number of crashes went down after swapping out the dGPU module for no dGPU. If you have the Expansion Bay shell, that would be the best option, but otherwise, try disabling the GPU in software.

I do have the expansion bay shell, so I’ll give that a shot. Unfortunately I’m unsure of how to reproduce this reliably, so it’ll take a while before it’s clear if it’s fixed the problem or not.

You can also try some of the fixes mentioned in this post. The issue is not the exact same, but the fix might be able to work. Currently this is an issue with the AMD driver, so Framework can’t do much but wait for it to be patched.

Fortunately, between swapping out the GPU for the expansion bay shell and the kernel arguments I added, my laptop managed to make it through my interviews without crashing :tada:

For a long term solution I’ll probably look into building a newer version of the drivers and installing it using an rpm package and then use that until Fedora catches up.

That’s good. If you find anything significant while trying to fix this issue, please post any updates you find.

I’m now running kernel 6.11.0-63.fc41.x86_64, from the next Fedora release. No idea if that’ll solve it, since I don’t know which kernel version actually incorporates the supposed fix. Here’s how I did it, in case anyone else wants to do the same:

  1. Navigate to the kernel package on packages.fedoraproject.org.
  2. Click “Fedora 41” under “Releases” and then click “View Build”. This will take you to Fedora’s build system site.
  3. Under “rpms”, locate your machine’s architecture.
  4. Download the kernel, kernel-core, kernel-modules-core, kernel-modules, and kernel-modules-extra rpms to somewhere on your machine (I used /tmp).
  5. Navigate to that directory in the terminal.
  6. If you’re using Silverblue (or another immutable spin): run rpm-ostree override replace kernel*.rpm. If you’re using standard Fedora, you should be able to use dnf install kernel*.rpm, but I’m not nearly as familiar with it. YMMV.
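Since the override needs all five packages at matching versions, it can help to check the download directory before running step 6. A small sketch under the assumption that the files keep their original names (package name, a dash, then a version that starts with a digit); `missing_kernel_rpms` is a hypothetical helper name.

```shell
# Hypothetical helper: prints the name of every kernel package from step 4
# that has no matching .rpm file in the given directory (default: current).
missing_kernel_rpms() {
    dir="${1:-.}"
    for pkg in kernel kernel-core kernel-modules-core kernel-modules kernel-modules-extra; do
        found=no
        # "-[0-9]" keeps "kernel" from also matching "kernel-core-..." etc.
        for f in "$dir/$pkg"-[0-9]*.rpm; do
            if [ -e "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "$pkg"
        fi
    done
}

# Example use on Silverblue, from the download directory:
#   [ -z "$(missing_kernel_rpms .)" ] && rpm-ostree override replace kernel*.rpm
```

If the helper prints anything, that package still needs to be downloaded from the same build page before the override is attempted.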

If you’re using Silverblue (or another immutable spin): Make sure to run rpm-ostree override reset kernel kernel-core kernel-modules-core kernel-modules kernel-modules-extra to remove the override on these packages before rebasing to Fedora 41. If you do not do this, you will indefinitely be using an outdated kernel, and if you don’t do it before you eventually rebase to Fedora 42, then your machine may not boot, as glibc will be built against the Fedora 42 kernel instead of the Fedora 41 kernel. You have been warned.

If you’re using standard Fedora, I’m not sure what dnf will do when you do a major version upgrade. Keep this in mind and tread carefully.