[TRACKING] Graphical corruption in Fedora 39 (AMD 3.03 BIOS)

I don’t wanna jinx it, but I think my problems with grimshot and slurp are gone <3

Well, I’m back. It’s been a while, but I have had graphical corruption issues again. It started a few weeks ago, but I’ve been too busy to post. When I posted the first time, the specific corruption I had looked like what mopac posted in October (white blocky artifacts). To get rid of them I would have to restart my computer. They occurred more often when the computer had been powered on for a few days. amdgpu.sg_display=0 ultimately fixed it and I haven’t had any issues for months.

This new graphical corruption looks more like what tokanda posted (colorful and with a seemingly more random distribution with minimal banding). It does not seem to matter how long my computer’s been up. I have only seen it happen after suspend or after my lock screen appears due to idling. This version of the corruption has also gone away without me restarting the computer, just by letting it sleep after idling (but that behavior is not consistent). It doesn’t happen very often (I’ve seen it probably two or three times) and I have no idea how to reproduce it.

@Kenneth_L_Rountree I presume you are running a Linux kernel earlier than 6.8. A lot of fixes for AMDGPU have gone into 6.8, and I’ve been running without “amdgpu.sg_display=0” for a few weeks now and haven’t had any issues in this regard anymore…

Yes, 6.7.9. dnf says I’m up to date. I thought dnf updated the kernel automatically. I guess I’ll look into updating it manually. Thanks.
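
For anyone who wants to check the "is my kernel at least 6.8?" question from a script, here is a minimal sketch in Python. The function names are made up for illustration, and it only compares the major and minor numbers, so it says nothing about which fixes a particular distro build actually carries.

```python
import platform
import re


def parse_kernel_version(release: str) -> tuple[int, int]:
    """Return (major, minor) from a release string such as '6.8.0-63.fc40.1.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release: str, minimum: tuple[int, int] = (6, 8)) -> bool:
    """True if the release is at or above the given (major, minor) pair."""
    return parse_kernel_version(release) >= minimum


def running_kernel_at_least(minimum: tuple[int, int] = (6, 8)) -> bool:
    """Same check against the running kernel."""
    # platform.release() reports the same string as `uname -r` on Linux.
    return kernel_at_least(platform.release(), minimum)
```

With the versions mentioned in this thread, `kernel_at_least("6.7.9")` is false and `kernel_at_least("6.8.1")` is true.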

@Kenneth_L_Rountree Well, I’m on NixOS and have been running release candidate kernels for the 6.8 series, but it is released now and I imagine it should appear pretty quickly in the Fedora repositories… I’m running 6.8.1 now, to be precise. Perhaps also of interest for other users: it appears that there is no need to run with “rtc_cmos.use_acpi_alarm=1” anymore since, iirc, this has become the new default for AMD systems as well, thanks to @Mario_Limonciello. Getting there, IMHO…

I was encountering display strobing similar to what is described here, in addition to GPU hangs when playing 3D games, on Debian 12 (thread on the latter issue), even after backporting the kernel and firmware, and I suspect it’s a related issue. Working around it, for me, required both updating Mesa to a newer version than Debian provided (I eventually just moved to Debian testing), and setting the kernel parameter amdgpu.sg_display=0. I also turned on UMA_GAME_OPTIMIZED in BIOS settings at around the same time I set that kernel parameter.

Thanks Mario,
I’ve pulled Debian’s Trixie firmware packages and the git kernel repo for firmware, copied them in on my current kernel, updated the initramfs, and rebooted.

Let’s see if this stabilises things.

You’ll see 6.8 in Fedora 40. Since that’s the active branch, most 6.8 kernel builds will be there. The Fedora 40 beta was postponed to 3/26.
6.8.0-63.fc40.1.x86_64

I still manage to trigger it on each suspend with UMA_AUTO and amdgpu.sg_display=0 unset. I have also been able to reliably trigger it by changing GNOME’s scaling offset.

This is great news! I’ll remove it and try the suspend script again and see if it passes.

On Rawhide (40) and 6.8 kernels I have been able to trigger it with sg_display disabled and UMA pumped to 4GB when I am using an external HDMI display.

I was running a trade booth screen off my Framework 13 a couple of weeks ago and managed to get it to trigger with disconnects/reconnects of the external screen. The eDP panel remained unaffected, but I got the familiar whiteout/banding on the external display after a couple of plug/replug events.

I also still occasionally observe the white-screen flashing on Arch, which is currently on kernel 6.8.1, with UMA_GAME_OPTIMIZED enabled.

I need to do more testing to find out what, exactly, triggers it, but I’m leaning towards external monitors, XWayland, and/or GPU intensive applications (such as MS Windows games, which perhaps not coincidentally also tend to use XWayland). It would be interesting to find out if I can still reproduce the issue with Wayland native apps.

Also pointing out that switching back and forth between TTYs (ctrl+alt+F2/F3/F… in GNOME; or sudo chvt 2 on the terminal, where chvt 1 should switch you back) will generally restore stuff to normal. sudo systemctl soft-reboot also tends to work, but that tends to leave my wifi device unavailable.
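
If you want the TTY bounce as a one-shot helper, something along these lines might do it. This is only a sketch of the chvt workaround above: the function names are invented, the one-second pause is a guess rather than a tested value, and which VT your graphical session lives on differs between setups, so adjust `back` accordingly.

```python
import subprocess
import time


def vt_switch_commands(away: int = 2, back: int = 1) -> list[list[str]]:
    """Build the two chvt invocations: switch away, then switch back."""
    return [["sudo", "chvt", str(away)], ["sudo", "chvt", str(back)]]


def bounce_vt(away: int = 2, back: int = 1, pause: float = 1.0) -> None:
    """Switch to another TTY, wait briefly, and switch back again."""
    first, second = vt_switch_commands(away, back)
    subprocess.run(first, check=True)
    time.sleep(pause)  # assumed pause; not a value anyone here has tested
    subprocess.run(second, check=True)


# Example (not run automatically): bounce_vt(away=3, back=2)
```

As noted further down the thread, this does not help on every setup, so treat it as something to try rather than a fix.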

uhhh i need to try this, thanks :3

I just briefly observed white flickering in a fullscreen Celluloid instance (a Wayland native app), but the flickering went away on its own.

dmesg -l err output:

[83508.978430] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd800000 flags=0x0000]
[83508.978489] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd801000 flags=0x0000]
[83508.978525] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd802000 flags=0x0000]
[83508.978532] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd803000 flags=0x0000]
[83508.978538] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd804000 flags=0x0000]
[83508.978544] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd805000 flags=0x0000]
[83508.978550] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd806000 flags=0x0000]
[83508.978556] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd807000 flags=0x0000]
[83508.978563] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd820000 flags=0x0000]
[83508.978569] amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffffd82c000 flags=0x0000]
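
In case it helps others compare logs, here is a rough Python sketch for pulling the IO_PAGE_FAULT entries out of dmesg text in the format shown above. The regex is written against these sample lines only, and the names `PageFault`, `parse_page_faults` and `summarize` are made up for this example.

```python
import re
from typing import NamedTuple


class PageFault(NamedTuple):
    timestamp: float  # seconds since boot, as printed by dmesg
    device: str       # PCI address, e.g. 0000:c1:00.0
    address: int      # faulting I/O virtual address


_FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<dev>\S+):\s+AMD-Vi: Event logged "
    r"\[IO_PAGE_FAULT domain=0x[0-9a-fA-F]+ "
    r"address=(?P<addr>0x[0-9a-fA-F]+) flags=0x[0-9a-fA-F]+\]"
)


def parse_page_faults(log_text: str) -> list[PageFault]:
    """Collect every amdgpu IO_PAGE_FAULT line found in the given text."""
    faults = []
    for line in log_text.splitlines():
        m = _FAULT_RE.search(line)
        if m:
            faults.append(PageFault(float(m["ts"]), m["dev"], int(m["addr"], 16)))
    return faults


def summarize(faults: list[PageFault]) -> str:
    """One-line summary: fault count, time span, and address range."""
    if not faults:
        return "no IO_PAGE_FAULT lines found"
    addrs = [f.address for f in faults]
    span = faults[-1].timestamp - faults[0].timestamp
    return (f"{len(faults)} faults over {span:.6f}s, "
            f"addresses {min(addrs):#x}..{max(addrs):#x}")
```

You could feed it the output of `dmesg -l err` saved to a file. It only groups what is already in the log and does not say anything about the cause.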

I happened to trigger that graphic corruption today, after I turned off UMA_GAME_OPTIMIZED.

However, switching back and forth between TTYs doesn’t work for me :( The key combos I used were ctrl+alt+F3, then ctrl+alt+F2.

Also, I’m on kernel 6.6.10 on Pop!_OS. Does upgrading to the latest kernel help?

Unfortunately I can still experience this on 6.8.1. It actually happens much more often on the newer kernels unless you enable UMA_GAME_OPTIMIZED.

However, overall power consumption is much lower on 6.8.1, so I trade the extra RAM for the battery life and continue to test each new kernel.

May I ask what distro you are using?

I bet I’d do the same: enable UMA_GAME_OPTIMIZED.

So far I don’t have a reliable way to trigger the white screen, i.e. I don’t know what triggers it, lol.

mhh sadly this doesn’t work on Arch with sway ^^v

I’m running the Fedora 40 branch (e.g. fc39 dnf system-upgrade to fc40).

Hello,

This thread is becoming way too long. Is there a place that gives the current status and the possible workarounds? In the KB, for instance?
Also, just to say: under Fedora 40 the graphical corruption is worse.

I’m running the Fedora 40 branch (e.g. fc39 dnf system-upgrade to fc40).

Don’t do this as a novice! The beta isn’t even out yet.

Just install the kernel manually.

There is nothing that solves everything, but there are workarounds.

  • Kernel >= 6.8 has many improvements, but still has some white screen issues (very weird that it’s worse for you)
  • Change the BIOS settings from Auto to UMA_Game_Optimized.

This shouldn’t be necessary anymore in 6.8, but if you encounter issues feel free to set it:

  • Set kernel arg amdgpu.sg_display=0
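
To confirm whether the kernel arg actually made it onto the command line after a reboot, a small Python check like the following might help. It is a sketch only: the helper names are invented, it returns the first occurrence of a parameter without handling duplicates specially, and how you set the parameter in the first place depends on your distro and bootloader.

```python
from typing import Optional


def get_kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Return the value of `name` from a kernel command line, or None if absent.

    A bare flag without '=' (e.g. 'quiet') yields an empty string.
    """
    for token in cmdline.split():
        key, _sep, value = token.partition("=")
        if key == name:
            return value  # first occurrence only
    return None


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="utf-8") as fh:
        return fh.read().strip()


def sg_display_disabled(cmdline: str) -> bool:
    """True if amdgpu.sg_display=0 is present on the given command line."""
    return get_kernel_param(cmdline, "amdgpu.sg_display") == "0"


# Example (Linux only): sg_display_disabled(read_cmdline())
```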