Try
amdgpu.graphics_sg=0 amdgpu.sg_display=0
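To confirm the options were actually picked up after a reboot, one option is to look for them in /proc/cmdline. A minimal Python sketch; the helper name amdgpu_params is made up for illustration, and it only reports what was passed on the kernel command line, not whether the driver accepted each option:

```python
from pathlib import Path


def amdgpu_params(cmdline: str) -> dict[str, str]:
    """Return the amdgpu.* options found in a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            key, value = token.split("=", 1)
            params[key] = value
    return params


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    print(amdgpu_params(Path("/proc/cmdline").read_text()))
```

If the dictionary comes back empty, the bootloader entry probably was not regenerated or a different entry was booted.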
Can you please try downgrading the GPU firmware from the January release to the December one and see if this goes away? Don’t forget to rebuild the initrd if the firmware is included in it.
Tried this, sadly does not work. ):
What do you mean by downgrading the GPU firmware?
There is a file dcn_3_1_4_dmcub.bin that is stored in /lib/firmware/amdgpu.
An update was released in January that I suspect is causing this problem. If you can revert to the version before that, it could confirm this.
See if the version before January’s helps.
Hey, I’ve got the exact same issue. Framework 13 AMD, 64 GB, with Arch and i3.
How would I downgrade the driver? Is it just a case of replacing that file with an older version then rebooting?
Download this version
And replace it in your filesystem and rebuild the initrd and reboot.
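One way to check that the swap really changed the file on disk is to compare checksums before and after. A rough Python sketch, under a few assumptions: firmware_hashes is a hypothetical helper, and the glob pattern in the main block is a guess meant to match the DMCUB file whether its name is written with dots or underscores and whether or not the distro ships it compressed (for example with a .zst suffix):

```python
import hashlib
from pathlib import Path


def firmware_hashes(directory: str, pattern: str) -> dict[str, str]:
    """Map each matching firmware file name to its SHA-256 hex digest."""
    hashes = {}
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            hashes[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


if __name__ == "__main__":
    # Assumed location and pattern; adjust both to match your system.
    found = firmware_hashes("/lib/firmware/amdgpu", "dcn_3?1?4_dmcub.bin*")
    for name, digest in found.items():
        print(digest, name)
```

Run it once before and once after replacing the file; the digest should differ. It only looks at the copy in the filesystem, so it says nothing about the copy embedded in the initrd.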
I faced this on 6.7.0 when I got my laptop last week, but haven’t faced it since downgrading to linux-lts, which is 6.6.x.
I also use GNOME on Wayland + Arch.
Would you mind bisecting it?
Sure, I will spend some time tomorrow on this and get back.
Screen corruption similar to the OP on kernel 6.7.0 (linux-6.7.arch3-1-x86_64.pkg.tar.zst): https://i.imgur.com/6nBuB0L.jpg
Firmware package: linux-firmware-20240115.9b6d0b08-1-any.pkg.tar.zst
DE: GNOME on Wayland
Distro: Arch
Now I have downgraded to linux-lts, which is 6.6.13. Here I expect to experience white screen issues randomly throughout the day.
I will share a picture when I do. I was able to solve the white screen problem by using amdgpu.sg_display=0. I posted my experience here a week ago: [RESPONDED] FW13 AMD 7840U Arch Graphics Output Corruption - #2 by bullza
Is there any way we can fix this issue without disabling scatter/gather?
There are two issues here - the blocky corruption and the white screen.
The blocky corruption seems to be either a linux-firmware regression or a 6.7 kernel regression (unclear which right now).
The white screen issue with scatter/gather doesn’t have a known root cause yet.
I can’t reproduce either of these myself.
After ~3 hours of use, I have now hit the white screen problem on linux-lts v6.6.13: https://i.imgur.com/bFNGhoc.jpg
I have now added amdgpu.sg_display=0 back again and I expect things to be stable.
You might want to use an external monitor (via USB-C). I forgot to mention that I use an external monitor as well. Using a web browser + maximized window speeds up occurrence of the white screen problem. I have tried both Firefox (manually enabled VA-API in about:config) and Vivaldi (Chromium-based, default settings).
Yeah I’ve tried multiple external monitors connected to a dock as well as VRAM stress workloads running on the system. I can’t reproduce it.
Please let me know if there is something else I can do to help you.
I also read somewhere that using UMA_Game_Optimized in UEFI reduces the probability of this occurring. I have it disabled (i.e., I am using the default setting).
/sys/bus/pci/drivers/amdgpu/*/mem_info*, a scan of the GC and SDMA registers using UMR, and a kernel log? You can put these on a bug report on AMD’s GitLab please. Right now I don’t know if it’s a Mesa bug, a platform firmware bug, or a graphics bug. Until AMD can reproduce it, it’s just guesses.
I will start with 2 as it won’t interfere much with my work hours.
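For the mem_info files mentioned above, something like the following Python sketch could gather them into one blob to paste into a bug report. The function name read_mem_info is hypothetical, and the default path is simply the glob from the request; reading these attributes may need root on some setups:

```python
from pathlib import Path


def read_mem_info(driver_dir: str = "/sys/bus/pci/drivers/amdgpu") -> dict[str, str]:
    """Read every mem_info* attribute under each device bound to amdgpu."""
    results = {}
    for path in sorted(Path(driver_dir).glob("*/mem_info*")):
        try:
            results[str(path)] = path.read_text().strip()
        except OSError as exc:
            # Keep going so one unreadable attribute does not hide the rest.
            results[str(path)] = f"<unreadable: {exc}>"
    return results
```

Calling read_mem_info() and printing each key and value gives roughly what a shell loop over the same glob would show.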
A bit off-topic, I only get the following message when I run: umr --logscan -O bits,follow,empty_log
Cannot open DRI name under debugfs: No such file or directory
ERROR: amdgpu.ko is loaded but /sys/kernel/debug/dri/0/name is not found
[WARNING]: Unknown ASIC [amd15bf] should be added to pci.did to get proper name
sh: line 1: /sys/kernel/debug/tracing/events/amdgpu/amdgpu_mm_wreg/enable: No such file or directory
sh: line 1: /sys/kernel/debug/tracing/events/amdgpu/amdgpu_mm_rreg/enable: No such file or directory
[ERROR]: Could not enable mm tracers
I think I need to enable some debug flags or something.
Sounds like debugfs isn’t enabled in your kernel.
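A quick way to tell "not mounted" apart from "not available" is to look for a debugfs entry in /proc/mounts. A small Python sketch, with debugfs_mountpoints as an illustrative helper name:

```python
from pathlib import Path


def debugfs_mountpoints(mounts_text: str) -> list[str]:
    """Return mount points whose filesystem type is debugfs."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "debugfs":
            points.append(fields[1])
    return points


if __name__ == "__main__":
    found = debugfs_mountpoints(Path("/proc/mounts").read_text())
    print(found or "debugfs is not mounted")
```

If nothing is listed, debugfs may simply not be mounted (as root, mount -t debugfs none /sys/kernel/debug is the usual way to mount it), or the kernel may have been built without it. Even when it is mounted, /sys/kernel/debug is normally readable only by root, so umr would need to be run with elevated privileges.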
Nothing to add here except to point out everything Mario indicated and requested would be helpful for troubleshooting.