Per AMD themselves, please try this and report back:
If you are still experiencing this, can you enable the following BIOS item as a test and report back whether it helps:
Advanced->iGPU Configuration->UMA_GAME_OPTIMIZED
This will allocate more memory to the GPU.
I discussed this with AMD, and they suspect that these visual artifacts are caused by high memory allocation on the GPU. So providing more memory may help alleviate this issue.
This is actually a great idea and aligns well with what is described in the Fedora bugzilla thread.
In there they speculate that the issue is associated with near full vram usage.
If I can reliably reproduce that with the webgl fish demo I’ll report back.
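For anyone who wants to check how close VRAM is to full while trying to reproduce this, here is a minimal Python sketch. It assumes the amdgpu driver exposes mem_info_vram_used and mem_info_vram_total (values in bytes) under /sys/class/drm/card*/device, which is where they normally live. The helper name vram_usage is only for illustration.

```python
from pathlib import Path


def vram_usage(device_dir):
    """Return (used_bytes, total_bytes, fraction_used) for one amdgpu device directory."""
    device = Path(device_dir)
    used = int((device / "mem_info_vram_used").read_text().strip())
    total = int((device / "mem_info_vram_total").read_text().strip())
    return used, total, (used / total if total else 0.0)


if __name__ == "__main__":
    # Assumption: the card index differs between machines, so scan all of them
    # and skip entries that do not expose the amdgpu memory counters.
    for dev in sorted(Path("/sys/class/drm").glob("card[0-9]*/device")):
        if (dev / "mem_info_vram_total").exists():
            used, total, frac = vram_usage(dev)
            print(f"{dev}: {used / 2**20:.0f} MiB / {total / 2**20:.0f} MiB ({frac:.0%})")
```

Running it once before and once while the artifacts are visible should show whether usage is actually near the limit at that moment.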
Oh, maybe that is why I never experienced this. Certainly worth a try for the people affected! The only time I saw it was in a memory-heavy game quite a while ago, and I toggled that setting quite a while ago as well.
- Happens on both AC power and battery
- Internal display (have not tried an external display yet)
- Have not tested with a live USB (I don’t mind testing this if it would be helpful)
I’ve observed this on sway and i3. It most frequently occurs after waking from sleep if I did not shut the laptop lid myself. Once it occurs in a given boot, I can make it go away by killing and restarting sway or i3 (whichever I’m using). However, the longer the laptop has been on, the more frequently it occurs; after a week or so it becomes so frequent that I just reboot. Rebooting keeps it at bay for a while (sometimes a day or two, sometimes only a few hours). I originally thought this was a Wayland issue until I tried switching to i3 to get rid of it, which failed. However, the only times I got it to happen on X11 were by messing with the refresh rates, or by having it happen in sway and switching to i3 before killing sway. It happens at all the scales I’ve tried, fractional or not, though it seems to happen less frequently at 1.
It was always in the back of my head to check the Framework forums, and I feel silly for not checking sooner.
I skimmed this thread and I saw:
- Update the BIOS to 3.03 (but someone reported that made it worse)
- Add the amdgpu.sg_display=0 kernel param (is that X11-specific?)
Which is the correct fix?
Also, do you still need people to file bug reports? And who do we file with: Red Hat, Fedora, or AMD? All have been mentioned in this thread.
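On the kernel param: amdgpu.sg_display is an option of the amdgpu kernel module, so it is passed on the kernel command line and applies no matter which display server is running. A minimal Python sketch for checking whether it took effect on the running boot follows. It assumes a Linux system with /proc/cmdline, and the helper name kernel_param is only for illustration.

```python
from pathlib import Path


def kernel_param(cmdline, name):
    """Return the value of `name` from a kernel command line, or None if absent.

    A bare flag such as `quiet` is returned as an empty string.
    If the parameter is repeated, the last occurrence wins.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        current = kernel_param(proc_cmdline.read_text(), "amdgpu.sg_display")
        print("amdgpu.sg_display =", current)
```

A result of None means the parameter was not passed at boot, which is worth confirming before reporting whether it helped.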
All,
Please try what is suggested here and report back.
We’re actively tracking this and need your A/B testing to confirm if this is helping.
Trying to understand this… the A/B testing is to see if this “may help alleviate this issue”, and not necessarily a final fix? Correct?
Yes. On/Off, UMA_GAME_OPTIMIZED in use or not.
UMA_GAME_OPTIMIZED is the proper workaround for me (as it increases VRAM).
But I can still reproduce it when I max out the VRAM by other means (checked with “radeontop”):
// (write 2g from GTT to VRAM 1000 times)
amdgpu_stress -b g 2g -b v 2g -c 1 2 2g 1000
When I launch a YouTube video in fullscreen with Firefox, or attach my external monitor (4K 120Hz, HDMI 2.1), it looks like all the pictures in this thread.
lots of:
amdgpu 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xfffb1580000 flags=0x0000]
I am using Arch and KDE Plasma.
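To make reports like this easier to compare, the fault lines can be pulled out of a saved kernel log with a small script. The Python sketch below is only a rough illustration: the regular expression is written against the single IO_PAGE_FAULT line quoted above and may need adjusting for other kernels, and the name parse_page_faults is hypothetical.

```python
import re

# Pattern modelled on the one IO_PAGE_FAULT line quoted in this thread.
FAULT_RE = re.compile(
    r"amdgpu (?P<pci>[0-9a-f:.]+): AMD-Vi: Event logged "
    r"\[IO_PAGE_FAULT domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) flags=(?P<flags>0x[0-9a-f]+)\]"
)


def parse_page_faults(log_text):
    """Return one dict (pci, domain, address, flags) per IO_PAGE_FAULT line found."""
    return [m.groupdict() for m in FAULT_RE.finditer(log_text)]
```

Counting the returned entries with and without UMA_GAME_OPTIMIZED gives a simple number to include in an A/B report.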
If you have 32GB or more of RAM, UMA Game Optimized should really be set. Ideally the allocation should be tunable in the BIOS. Some applications balk at the 4GB and refuse to load, which leads to dodgy hacks such as overriding the HII database, i.e.: GitHub - DavidS95/Smokeless_UMAF
Frame.work, can you please make this user-selectable in the next BIOS release, with 512MB, 4GB, and 8GB options?
We can only turn the optimized option on or off, as this is only a binary flag we can send to the AMD firmware.
I also wish we had better control of this, so we could give you more granular options.
Just as a note, if the GPU runs out of RAM, it will continue to allocate memory from system memory.
Initially, with BIOS 3.02, I was seeing graphical corruption, but after looking at the BIOS settings I enabled UMA Game Optimized as well. Since then I have not seen the graphical corruption or any screen flickering. But I have another issue, specifically when I connect my 4K display through the DisplayPort module: after the laptop screen goes dark from inactivity, the laptop hangs or becomes extremely slow on resume.
This issue doesn’t happen with the HDMI cable and my 4K monitor, nor with DisplayPort and another 1440p monitor I have.
After switching to BIOS 3.03, the same issues are seen: graphical corruption without UMA Game Optimized, and a system hang on resume with the 4K DisplayPort display when UMA Game Optimized is on. I managed to disconnect the display and was patient enough to dump the dmesg for BIOS 3.03. (I have the 7840U variant kitted with 96GB (2x48GB) of Crucial DRAM, running Fedora 39 with GNOME; kernel: 6.5.8-300.fc39.x86_64.) Here are some of the messages after grepping for amdgpu:
[ 3.168357] [drm] amdgpu kernel modesetting enabled.
[ 3.175448] amdgpu: CRAT table disabled by module option
[ 3.175454] amdgpu: Virtual CRAT table created for CPU
[ 3.175485] amdgpu: Topology: Add CPU node
[ 3.175631] amdgpu 0000:c1:00.0: enabling device (0006 -> 0007)
[ 3.180188] amdgpu 0000:c1:00.0: amdgpu: Fetched VBIOS from VFCT
[ 3.180189] amdgpu: ATOM BIOS: 113-PHXGENERIC-001
[ 3.212899] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_early_init [amdgpu]] JPEG decode is enabled in VM mode
[ 3.237832] amdgpu 0000:c1:00.0: vgaarb: deactivate vga console
[ 3.237839] amdgpu 0000:c1:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
[ 3.237945] amdgpu 0000:c1:00.0: amdgpu: VRAM: 4096M 0x0000008000000000 - 0x00000080FFFFFFFF (4096M used)
[ 3.237947] amdgpu 0000:c1:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[ 3.237948] amdgpu 0000:c1:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
[ 3.238208] [drm] amdgpu: 4096M of VRAM memory ready
[ 3.238211] [drm] amdgpu: 46098M of GTT memory ready.
[ 3.241589] amdgpu 0000:c1:00.0: amdgpu: Will use PSP to load VCN firmware
[ 3.783043] amdgpu 0000:c1:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 3.791355] amdgpu 0000:c1:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 3.791359] amdgpu 0000:c1:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 3.812385] amdgpu 0000:c1:00.0: amdgpu: SMU is initialized successfully!
[ 3.904880] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 3.923140] amdgpu: HMM registered 4096MB device memory
[ 3.924242] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 3.924247] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[ 3.924502] amdgpu: Virtual CRAT table created for GPU
[ 3.925052] amdgpu: Topology: Add dGPU node [0x15bf:0x1002]
[ 3.925054] kfd kfd: amdgpu: added device 1002:15bf
[ 3.925069] amdgpu 0000:c1:00.0: amdgpu: SE 1, SH per SE 2, CU per SH 6, active_cu_number 12
[ 3.925214] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 3.925216] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 3.925217] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 3.925219] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 3.925220] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 3.925221] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 3.925222] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 3.925223] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 3.925224] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 3.925225] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 3.925226] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 3.925227] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 3.925229] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 3.931466] [drm] Initialized amdgpu 3.54.0 20150101 for 0000:c1:00.0 on minor 1
[ 3.936982] fbcon: amdgpudrmfb (fb0) is primary device
[ 3.936986] amdgpu 0000:c1:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 20.345718] snd_hda_intel 0000:c1:00.1: bound 0000:c1:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[ 1136.599129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[ 1136.599367] [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait
[ 1136.601708] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 1136.605286] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 1136.711339] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 1136.711598] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 1136.711601] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 1136.711603] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 1136.711604] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 1136.711605] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 1136.711606] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 1136.711607] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 1136.711608] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 1136.711610] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 1136.711611] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 1136.711612] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 1136.711613] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 1136.711614] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 3322.975833] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[ 3322.976074] [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait
[ 3322.978530] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 3322.981620] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 3323.355351] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 3323.355605] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 3323.355608] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 3323.355610] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 3323.355611] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 3323.355612] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 3323.355613] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 3323.355614] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 3323.355615] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 3323.355616] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 3323.355617] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 3323.355619] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 3323.355620] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 3323.355621] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 3381.540086] [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
[ 3409.925738] WARNING: CPU: 5 PID: 536 at drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:1527 dp_retrieve_lttpr_cap+0x16f/0x1a0 [amdgpu]
[ 3409.926030] hid_sensor_iio_common snd_timer irqbypass snd_acp_config industrialio_triggered_buffer kfifo_buf snd_soc_acpi rapl snd thunderbolt industrialio rfkill soundcore snd_pci_acp3x pcspkr i2c_piix4 k10temp amd_pmf joydev amd_pmc platform_profile loop zram dm_crypt amdgpu i2c_algo_bit drm_ttm_helper ttm drm_suballoc_helper amdxcp iommu_v2 crct10dif_pclmul drm_buddy nvme crc32_pclmul gpu_sched crc32c_intel polyval_clmulni polyval_generic nvme_core drm_display_helper video ucsi_acpi ghash_clmulni_intel hid_sensor_hub hid_multitouch sha512_ssse3 typec_ucsi ccp sp5100_tco cec typec nvme_common wmi i2c_hid_acpi i2c_hid serio_raw ip6_tables ip_tables fuse
[ 3409.926084] Workqueue: events_highpri dm_irq_work_func [amdgpu]
[ 3409.926264] RIP: 0010:dp_retrieve_lttpr_cap+0x16f/0x1a0 [amdgpu]
[ 3409.926447] ? dp_retrieve_lttpr_cap+0x16f/0x1a0 [amdgpu]
[ 3409.926608] ? dp_retrieve_lttpr_cap+0x16f/0x1a0 [amdgpu]
[ 3409.926761] ? dp_retrieve_lttpr_cap+0x16f/0x1a0 [amdgpu]
[ 3409.926890] ? dp_retrieve_lttpr_cap+0x114/0x1a0 [amdgpu]
[ 3409.927020] retrieve_link_cap+0x7d/0xb90 [amdgpu]
[ 3409.927155] ? dp_is_sink_present+0xbc/0x120 [amdgpu]
[ 3409.927284] detect_link_and_local_sink+0xb24/0xfc0 [amdgpu]
[ 3409.927446] link_detect+0x3a/0x480 [amdgpu]
[ 3409.927583] ? dal_gpio_destroy_irq+0x25/0x40 [amdgpu]
[ 3409.927727] ? query_hpd_status+0x6e/0xa0 [amdgpu]
[ 3409.927889] handle_hpd_irq_helper+0xf9/0x170 [amdgpu]
[ 3410.109821] [drm:retrieve_link_cap [amdgpu]] *ERROR* retrieve_link_cap: Read receiver caps dpcd data failed.
[ 3413.573401] [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
[ 3588.187339] [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
[ 3616.713695] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 3624.252516] hid_sensor_iio_common snd_timer irqbypass snd_acp_config industrialio_triggered_buffer kfifo_buf snd_soc_acpi rapl snd thunderbolt industrialio rfkill soundcore snd_pci_acp3x pcspkr i2c_piix4 k10temp amd_pmf joydev amd_pmc platform_profile loop zram dm_crypt amdgpu i2c_algo_bit drm_ttm_helper ttm drm_suballoc_helper amdxcp iommu_v2 crct10dif_pclmul drm_buddy nvme crc32_pclmul gpu_sched crc32c_intel polyval_clmulni polyval_generic nvme_core drm_display_helper video ucsi_acpi ghash_clmulni_intel hid_sensor_hub hid_multitouch sha512_ssse3 typec_ucsi ccp sp5100_tco cec typec nvme_common wmi i2c_hid_acpi i2c_hid serio_raw ip6_tables ip_tables fuse
[ 3624.252560] Workqueue: events_highpri dm_irq_work_func [amdgpu]
[ 3624.252803] dmub_srv_wait_for_idle+0x40/0x90 [amdgpu]
[ 3624.252964] dc_dmub_srv_cmd_run_list+0xed/0x1b0 [amdgpu]
[ 3624.253110] dcn31_link_encoder_is_in_alt_mode+0xae/0x100 [amdgpu]
[ 3624.253259] detect_link_and_local_sink+0xc02/0xfc0 [amdgpu]
[ 3624.253427] ? dm_read_reg_func+0x38/0xb0 [amdgpu]
[ 3624.253597] link_detect+0x3a/0x480 [amdgpu]
[ 3624.253757] ? query_hpd_status+0x6e/0xa0 [amdgpu]
[ 3624.253904] handle_hpd_irq_helper+0xf9/0x170 [amdgpu]
[ 3625.113178] [drm:dc_dmub_srv_cmd_run_list [amdgpu]] *ERROR* Error queueing DMUB command: status=2
...
[ 3647.240539] [drm:dc_dmub_srv_cmd_run_list [amdgpu]] *ERROR* Error queueing DMUB command: status=2
[ 3647.433623] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 3647.481713] [drm:dc_dmub_srv_cmd_run_list [amdgpu]] *ERROR* Error queueing DMUB command: status=2
...
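A dump like the one above is easier to compare between BIOS versions if the *ERROR* lines are tallied per reporting function (here: the MES failures, the "No EDID read" messages, and the DMUB errors). The following Python sketch does that. The regular expression assumes the "[drm:function [amdgpu]] *ERROR* message" layout seen in this excerpt, and the name count_drm_errors is only for illustration.

```python
import re
from collections import Counter

# Matches e.g. "[drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read."
ERROR_RE = re.compile(
    r"\[drm:(?P<func>[A-Za-z0-9_.]+)(?: \[amdgpu\])?\] \*ERROR\* (?P<msg>.*)"
)


def count_drm_errors(log_text):
    """Count DRM *ERROR* lines in a dmesg dump, keyed by the reporting function."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            counts[match.group("func")] += 1
    return counts
```

Feed it the text of a saved dump, for example count_drm_errors(open("amdgpu.log").read()), where amdgpu.log is a hypothetical file holding the grepped dmesg output, and compare the counts between the UMA_GAME_OPTIMIZED on and off runs.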
@Kieran_Levin: I haven’t tried Smokeless UMAF on the FW13. Given that the HII database seems to be exported by default as part of AGESA on most BIOSes (I have one 570x mainboard from ASUS where it’s a tunable), I would expect it to ‘somewhat work’. However, as there is no mention of Phoenix support, I have been leery of attempting it. It is definitely helpful on the 5600G I use as my HTPC box, though. Yeah, I appreciate that the static reserved portion is expandable into 32GB of system RAM on demand, but there are some things which only check the reserved portion for reporting.
So my curiosity got the better of me: Smokeless-UMAF does indeed work with the FW13 AMD. However, as you’ve mentioned, the UMA Buffer data structure only has Auto and Enhanced exposed.
May not be related (ignore this if that’s the case): What’s this 8GB VRAM thing on the Asus Ally?
For that processor / BIOS, they seem to be able to choose between Auto / 3 / 4 / 6 / 8 GB.
https://www.reddit.com/r/ROGAlly/comments/149h6hp/optimal_vram_allocation_2_gb_default_4_gb_8_gb/
Is Asus getting some special TLC from AMD with additional BIOS / flag / setting support?
I can reliably reproduce white graphical flickering 100% of the time by clicking the desktop session picker in SDDM before logging in on Fedora 39 KDE (kernel 6.5.9-300.fc39.x86_64) with all packages updated. While SDDM’s desktop session picker’s dropdown (where you choose between desktop environments and X11/Wayland) is open, all of the screen except the dropdown itself flickers white very rapidly. Setting UMA_GAME_OPTIMIZED did not change the behavior at all.
Welcome to the community @Gawdl3y.
Latest kernel on Fedora 39 is 6.5.10-300 (at least as of today).
This is Fedora Workstation, GNOME, correct? Additionally, is this happening on the internal display or are there external displays attached?
SDDM is Plasma/KDE’s display manager so I guess not.
Thanks! I’m running Fedora Workstation KDE, and the issue I’m describing is on the SDDM login screen. This is on the internal display, and there are no other displays connected. At the original time of my post, 6.5.9 was the latest kernel - I upgraded to 6.5.10 yesterday when it became available and confirmed the issue is still present with it as well.
I’m having this problem here using two displays when coming back from wakeup, with Fedora 39 and AMD BIOS 3.03. I will try the kernel params mentioned.