Amdgpu Error queueing DMUB command: status=2 when waking from suspend

Still happening pretty regularly for me: Ubuntu 24.04 LTS on a FW16. Not “when waking from suspend”, though: randomly when working. And I don’t have the dGPU expansion card, this is just with built-in graphics.

If you can reliably reproduce, can you please apply these 2 patches and confirm it helps?

https://lore.kernel.org/dri-devel/20240708202907.383917-1-hamza.mahfooz@amd.com/

They’re for this type of problem (you can read more in the thread)

Is that the correct link? It takes me to a discussion of power optimization and dynamic vblank delay.

Yeah it is.

Things seem to have gotten better for me. I removed the dGPU expansion card and I haven’t had this problem in a few weeks.

I experienced this sudden iGPU performance drop (1FPS) today on my AMD Framework 13 running Fedora 40 (Gnome Wayland), kernel 6.9.5, amd-gpu-firmware 20240709-1.fc40.

This was also spammed in my system journal:

amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
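If you want to check whether your own journal shows the same message, here is a minimal Python sketch that scans log text for it. The pattern is built only from the line quoted above; the function name `find_dmub_errors` is made up for illustration, and I'm assuming the PCI address and status code can differ per machine.

```python
import re

# Built from the single log line quoted in this thread.
# Assumption: only the PCI address and the status number vary between machines.
DMUB_ERROR = re.compile(
    r"amdgpu (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"\[drm\] \*ERROR\* Error queueing DMUB command: status=(?P<status>\d+)"
)


def find_dmub_errors(log_text):
    """Return a (pci_address, status) tuple for each matching line in log_text."""
    return [
        (m.group("pci"), int(m.group("status")))
        for m in DMUB_ERROR.finditer(log_text)
    ]
```

You could feed it the text of `journalctl -k -b` (kernel messages from the current boot); a non-empty result means the error was logged at least once.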

Here’s a runtime solution :tada: (no reboot required) which worked for me:

$ sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover

The screen then goes black for a second and comes back with restored GPU performance.
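The `1` in that path is the DRI index on my machine and may be different on yours. A small Python sketch for locating the node, assuming debugfs is mounted at `/sys/kernel/debug` and that you run it as root; the helper names are hypothetical:

```python
from pathlib import Path


def find_recover_nodes(debugfs_dri="/sys/kernel/debug/dri"):
    """List every amdgpu_gpu_recover file found one level below the DRI debugfs root."""
    return sorted(Path(debugfs_dri).glob("*/amdgpu_gpu_recover"))


def trigger_recovery(node):
    """Read the node, which is the same thing the `cat` command above does."""
    return Path(node).read_text()
```

`find_recover_nodes()` only lists candidates. Nothing is reset until you read one of them, so you can inspect the list first and pick the right GPU.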

Hope this helps!


Besides the runtime solution, it looks like there is a kernel option you can give it to automatically restart the driver if some issues are found.

Hopefully Framework can fix the driver issues.

More than two weeks since the last incident, with no change in usage or software (besides updates).

The issue seems patched in Fedora.

Edit: Today (29 Jul) it happened again

The same just happened out of nowhere (no suspending) on my brand-new Framework Laptop 13 (AMD Ryzen 7 7840U w/ Radeon 780M) running Ubuntu 24.04 LTS 64-bit (Linux 6.8.0-39-generic on Wayland).


I am on Ubuntu 22.04, kernel 6.0.5-1025-oem.
I’ve received an email from FW about a new BIOS but haven’t updated yet.
I have been using my FW16 for months and never had this issue.
Today I had to restart my computer to fix it; it worked for a minute and then started to crawl again.
Your solution helped to get it working for now.
Thanks

I have been having this issue as well on my Framework 16 on Fedora 40.

Same here! Just had it happen. Error message was the exact same

There’s a newer version of the amdgpu dmcub dcn 314 Linux firmware whose changelog states:

Fixes lock problem

I’m using Arch Linux ( Linux 6.10.3 ), KDE Plasma 6.1.3, FW16 7840HS, BIOS 3.04 beta

I updated to it two days ago and it hasn’t happened since, but I don’t know whether that changelog entry refers to the issue we’re experiencing.

Just a thought: this is not a FW-specific problem. Other AMD laptops have similar problems, e.g. the HP Victus 16-s0xxx, for which one can probably find a bug fix by now.

Right. I posted a patch that should help. Anyone who can reliably reproduce should try it.

@Mario_Limonciello , @James3 … How can we reproduce this reliably? For me it happens once in a while (always at the oddest time). If there was a way to force it to happen, I could help fix it.

Another thought… Most people don’t want to accept random patches to their kernel. Any chance the patches will make it into the mainline/Ubuntu/Fedora kernel?

@Scott_Savarese
At a guess I think the patch would make it to mainline because of where Mario works.

Yes, it will eventually make it into the mainline kernel.

The suspected reason for the problem is described in the link I shared.

One thing you can try to do to reproduce it is change the brightness really rapidly using the brightness slider in GNOME while PSR is enabled. This should stress DCN.
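For anyone who would rather script the rapid brightness changes than drag a slider, here is a rough Python sketch that writes to the sysfs backlight interface instead. This is an assumption on my part: it is not the GNOME slider itself, it may not stress DCN in exactly the same way, the backlight directory name varies per machine (something under `/sys/class/backlight/`), and it needs root. The function name and defaults are made up for illustration.

```python
import time
from pathlib import Path


def sweep_brightness(backlight_dir, cycles=50, delay=0.02, low_fraction=0.2):
    """Alternate between a dim level and full brightness, then restore the original."""
    bl = Path(backlight_dir)
    max_level = int((bl / "max_brightness").read_text())
    node = bl / "brightness"
    original = node.read_text().strip()
    # Never drop to 0 so the panel stays readable during the test.
    low = max(1, int(max_level * low_fraction))
    written = []
    try:
        for i in range(cycles):
            level = low if i % 2 == 0 else max_level
            node.write_text(str(level))
            written.append(level)
            time.sleep(delay)
    finally:
        # Put the brightness back even if the run is interrupted.
        node.write_text(original)
    return written
```

Run it against your own backlight directory and watch the journal for the DMUB error while it loops. The `finally` block restores the starting brightness.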


Framework seems to be able to auto-adjust brightness periodically… Sometimes it happens rapidly. Could this be a trigger? (Personally I’ve never changed the brightness using the slider)

There are multiple ways that could stress DCN, so yeah that sounds plausible at least.