Amdgpu Error queueing DMUB command: status=2 when waking from suspend

There’s a newer version of the amdgpu DMCUB firmware for DCN 3.1.4 in linux-firmware, and its changelog states:

Fixes lock problem

I’m using Arch Linux ( Linux 6.10.3 ), KDE Plasma 6.1.3, FW16 7840HS, BIOS 3.04 beta

I updated to it two days ago and the problem hasn’t happened since, but I don’t know whether that changelog entry refers to the issue we’re experiencing.

Just a thought: this is not a FW-specific problem. Other AMD laptops have similar problems, e.g. the HP Victus 16-s0xxx,

for which one can probably find a bug fix by now.

Right. I posted a patch that should help. Anyone who can reliably reproduce should try it.

@Mario_Limonciello , @James3 … How can we reproduce this reliably? For me it happens once in a while (always at the oddest time). If there was a way to force it to happen, I could help fix it.

Another thought… Most people don’t want to accept random patches to their kernel. Any chance the patches will make it into the mainline/Ubuntu/Fedora kernel?

@Scott_Savarese
At a guess I think the patch would make it to mainline because of where Mario works.

Yes, it will eventually make it into the mainline kernel.

The suspected reason for the problem is described in the link I shared.

One thing you can try in order to reproduce it is to change the brightness really rapidly using the brightness slider in GNOME while PSR is enabled. This should stress DCN.

Framework seems to be able to auto-adjust brightness periodically… Sometimes it happens rapidly. Could this be a trigger? (Personally I’ve never changed the brightness using the slider)

There are multiple ways that could stress DCN, so yeah that sounds plausible at least.

The issue happened to me just now for the first time, and I had noticed the auto brightness changing rapidly just before it happened. The sun is going in and out of the clouds. Usually I use my laptop in the evenings, so I never even noticed the auto-brightness before today.

I just turned off Automatic Screen Brightness. It was driving me a little nuts, anyway.

I think I’ve had this multiple times, with my computer semi-freezing and forcing a reboot. I just checked the system log after it happened, on Ubuntu 22.04.4:

[ 2690.053816] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2690.300410] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2690.547139] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2690.794774] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2691.041153] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2691.287771] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2691.535920] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2691.782753] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2692.029408] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
[ 2692.277463] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
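
For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch that counts these errors and measures how far apart they are. The function name and the regular expression are my own (written against the exact line format posted above), so treat it as an illustration rather than a tool from the driver.

```python
import re

# Matches lines of the form shown above, e.g.
# [ 2690.053816] amdgpu 0000:c5:00.0: [drm] ERROR Error queueing DMUB command: status=2
DMUB_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<pci>\S+):\s+\[drm\]\s+ERROR\s+"
    r"Error queueing DMUB command: status=(?P<status>\d+)"
)


def summarize_dmub_errors(log_text):
    """Return (count, first_ts, last_ts, mean_gap_seconds) for DMUB queueing errors.

    Timestamps are the kernel log timestamps (seconds since boot).
    mean_gap_seconds is None when fewer than two errors are found.
    """
    stamps = [float(m.group("ts")) for m in DMUB_ERROR.finditer(log_text)]
    if not stamps:
        return 0, None, None, None
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return len(stamps), stamps[0], stamps[-1], mean_gap
```

Feeding it the output of `sudo dmesg` (for example saved to a file and read with `open(...).read()`) should show whether the errors repeat at a steady interval; in the excerpt above they are roughly 0.25 s apart.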

I believe my computer shipped with an up-to-date BIOS, so I never updated it.

Just saw this bug again with what I believe is a fully updated FW16, but the runtime solution worked. Thank you!

I also have the problem, but what’s crazy is that my second screen, which is plugged directly into the GPU, works fine (just some latency when the cursor goes back to the Intel screen).
Well, I made my second fw_ alias to try sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover. Thanks!

Can anyone that doesn’t have auto-brightness turned on anymore confirm that it “solves” the problem?
Or does it still happen?

I’ve written some code to auto adjust the brightness while also regularly updating the amdgpu firmware ( I’m now on this commit )

Over the last 4-5 updates, without changing my code, there has been a significant reduction in how often the PSR issue happens, but it still happens.

I’ve limited it to one write per millisecond to the brightness file ( I’m not sure if this helps )
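
For illustration, a limit like that could be sketched in Python as below. This is not the code mentioned above; the class name, the `clock` parameter and the example path are assumptions of mine, and the backlight device name under /sys/class/backlight/ varies between machines.

```python
import time


class RateLimitedWriter:
    """Write integer values to a file, dropping writes that arrive too quickly.

    Hypothetical helper: `path` would be something like
    /sys/class/backlight/<device>/brightness (device name varies, needs write access).
    """

    def __init__(self, path, min_interval_s=0.001, clock=time.monotonic):
        self.path = path
        self.min_interval_s = min_interval_s  # 0.001 s = one write per millisecond
        self._clock = clock                   # injectable so the limiter can be tested
        self._last_write = None

    def write(self, value):
        """Write `value` if enough time has passed; return True if it was written."""
        now = self._clock()
        if self._last_write is not None and now - self._last_write < self.min_interval_s:
            return False  # too soon after the previous write, skip this one
        with open(self.path, "w") as f:
            f.write(str(int(value)))
        self._last_write = now
        return True
```

Dropping the extra writes (rather than sleeping) keeps the caller responsive; whether limiting the write rate actually reduces the hangs is an open question, as said above.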

These days the PSR issue mostly happens when I’m just doing regular browsing in Firefox or when I’m using an application that runs under Xwayland. For me this was JetBrains RustRover, which I later configured to use Wayland, and the issue has not happened since.

It doesn’t happen often for me now, though… 1-3 times a month

PR - Panel Replay
PSR - Panel Self Refresh

Thinking about what I’ve said recently.

I’ve purposely started using more software that uses Xwayland on KDE Plasma Wayland and I’ve been getting PSR hangs more frequently ( daily, and in some cases multiple times per day: three times yesterday and twice this morning )

I’ve also noticed when using Firefox ( Wayland or Xwayland ) on KDE Plasma Wayland, most of the hangs seem to happen on websites that have videos playing ( YouTube, Reddit, Twitch for example )

I’ve also started using Hexchat ( IRC client ) which is used via Xwayland ( it doesn’t support Wayland yet ) and again, more PSR hangs

I’ve stopped using all that software using Xwayland ( except Firefox but not using those websites I’ve mentioned earlier ) and no PSR hangs

I’ve also noticed an increase in PSR hangs since Linux 6.10.10 on Gentoo ( happened twice while typing this post )

When the PSR hang happens, I log out of KDE Plasma, switch to a TTY, and then execute

sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover

but the screen is just blank ( it’s like this for over a minute ) and so I do a cold reboot
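
The index in /sys/kernel/debug/dri/1/ may not be 1 on every machine, so a small helper can look the node up instead of hard-coding it. This is a minimal Python sketch; the function names are hypothetical, it needs root and a mounted debugfs, and it only mirrors the `cat` command above (reading the file is what requests the recovery), so it will not help in cases where that command itself leaves the screen blank.

```python
import glob
import os


def find_gpu_recover_nodes(debugfs_root="/sys/kernel/debug"):
    """Return all amdgpu_gpu_recover nodes found under <debugfs_root>/dri/*/."""
    pattern = os.path.join(debugfs_root, "dri", "*", "amdgpu_gpu_recover")
    return sorted(glob.glob(pattern))


def trigger_gpu_recover(node):
    """Read the node, which is what `cat` does, and return whatever it printed."""
    with open(node) as f:
        return f.read()


if __name__ == "__main__":
    for node in find_gpu_recover_nodes():
        print(node, "->", trigger_gpu_recover(node).strip())
```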

I think I’ve found some triggers ( I’ve not read around yet for what others have said about triggers ) but it’s still not enough to narrow down what the issue is, in my opinion.

Linux 6.11.0 has been released, but I’ve not compiled/installed it as Gentoo ( sys-kernel/gentoo-kernel ) hasn’t released it yet. I’ve noticed numerous changes to amdgpu related to PSR and PR. Maybe those changes have fixed the PSR hanging issues, and I think PR is being enabled in Linux 6.12

Is there any ETA on getting this fixed?

I would consider this sort of issue to be high priority; it’s not good for a laptop to become unusable like this. What if this happens in the middle of a critical business presentation or a remote job interview?

@Vadim_Peretokin , this is an issue with the AMD Linux driver provided by the kernel. It is not provided by Framework. You have a few choices… Supposedly the newer AMD drivers have fixes to alleviate this issue. So, you can wait for the kernel to get the latest driver version. Depending on the Linux flavor you use this can take a long time. Or another option is to roll your own kernel and bring in the latest AMD driver. But there isn’t much for Framework to do here.

Personally, I removed the external GPU on my Framework and that seemed to help. People have also turned off the auto brightness setting, which helps a lot. There is also a way to mitigate the problem by resetting the driver when the issue pops up.

As a mitigation until a fix arrives, you can add the following to the kernel command line

amdgpu.dcdebugmask=0x210

which disables PSR and PSR-SU within amdgpu

here ( Linux 6.11.0 ), lines 254-267 list the other flags you can enable/disable
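
To see how 0x210 breaks down, here is a small Python sketch that decodes the mask into the two flags mentioned above. The bit values are as I read them in drivers/gpu/drm/amd/include/amd_shared.h (enum DC_DEBUG_MASK) for Linux 6.11, so treat them as an assumption and check the header of the kernel you are actually running. Only the two PSR bits are listed, and the function name is hypothetical.

```python
# Assumed values from enum DC_DEBUG_MASK in Linux 6.11; verify against your kernel tree.
DC_DEBUG_FLAGS = {
    0x10: "DC_DISABLE_PSR",      # Panel Self Refresh
    0x200: "DC_DISABLE_PSR_SU",  # PSR with selective update
}


def decode_dcdebugmask(mask):
    """Return (names of known bits set in mask, remaining bits not in the table)."""
    known = []
    remaining = mask
    for bit, name in sorted(DC_DEBUG_FLAGS.items()):
        if mask & bit:
            known.append(name)
            remaining &= ~bit
    return known, remaining


if __name__ == "__main__":
    names, rest = decode_dcdebugmask(0x210)
    print(names, hex(rest))
```

So 0x210 is simply 0x200 | 0x10. Any bits reported in the second return value are ones this short table does not cover.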

Thanks! What does one lose in practice by disabling it?

I’ve noticed power usage is slightly higher ( roughly 0.8-1.2 W )