Fedora 40 on the Framework Laptop 13

Just a reminder, if you feel you have come across a potential bug, please file a bug report with the Fedora team in addition to mentioning it here. Thanks!

@kholdstare @ehsanj FYI I haven’t had any freezes since while remaining on the freeworld-24.0.5 drivers (the non-freeworld packages have been on 24.0.6 for a while), so it might’ve been fixed from some other update. Nonetheless,

  • mesa-va-drivers-freeworld-24.0.6-1.fc40.x86_64
  • mesa-vdpau-drivers-freeworld-24.0.6-1.fc40.x86_64

have landed in rpmfusion-free-updates and I just updated. Will report back if I experience it again, but so far it’s been stable on my end.

Also, oddly dnf wasn’t refreshing/pulling the latest rpmfusion-free-updates updates, so I did a combination of uninstalling/reinstalling as well as

sudo dnf install --enablerepo=rpmfusion-free-updates mesa-va-drivers-freeworld

without proceeding.

It then picked up the 24.0.6 version for both mesa-va-drivers-freeworld and mesa-vdpau-drivers-freeworld, and I was able to install them normally without the --enablerepo=rpmfusion-free-updates option.
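
A minimal sketch for checking which version actually landed after forcing a metadata refresh. `dnf --refresh` marks cached metadata as expired before the command runs; whether it would have avoided the uninstall/reinstall dance here is an assumption. The helper name `nvra_version` is hypothetical, and it assumes the usual name-version-release.arch layout printed by `rpm -q`.

```shell
# Print the version field of an RPM name-version-release.arch string.
# Assumes the usual NVRA layout, e.g. the output of `rpm -q <package>`.
nvra_version() {
    nvr="${1%.*}"               # drop the trailing ".arch"
    nv="${nvr%-*}"              # drop the trailing "-release"
    printf '%s\n' "${nv##*-}"   # keep what follows the last "-"
}

# Usage on a Fedora system (not run here):
#   sudo dnf --refresh upgrade mesa-va-drivers-freeworld mesa-vdpau-drivers-freeworld
#   nvra_version "$(rpm -q mesa-va-drivers-freeworld)"
```

Comparing the result for both freeworld packages is a quick way to confirm they moved to the same Mesa version together.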

Edit: 05-05-2024
Welp I’ve played plenty of YouTube videos over Firefox with no issues, but I just got the same crash mousing over a video on Amazon. Seems it may not have been fixed with the 24.0.6 freeworld drivers. I can’t repro with the above ffmpeg command. Updated to Firefox 125.0.3-1.fc40, TBD, may file an issue.

Hey! I just had the same issue and I tried MrThees’ fix but it didn’t work for me. In my case, I had to execute the command recommended on this Fedora thread. Worked like a charm.

i’ve been having hard freezes on a few distros i’ve played with over the last 6 weeks: kubuntu, KDE neon, and F40 KDE. it seems to be less of a problem on F40.

when the issue occurs, it’s almost always at resume-from-sleep (the laptop is in suspend 99% of the time, rather than shut down).
however, today i hit it in the superslicer appimage simply by resizing the window to half screen. it crashed and i got (iirc) a GPU recovery popup. i closed and re-opened superslicer and the problem hasn’t occurred again.

i’ve also seen the brightness work fine at reboot but after a few suspends, brightness keys do not work anymore. i’m assuming this is related to the same issue, even if there seems to be no related crash or hint it’s broken itself.

i’ll bug it with fedora if i can get more info on it…

I’ve been getting some of the random hard-freezes as well, but no kernel panic or logs at all so it’s impossible to debug.
The displays go dark and any audio that was playing (from a game, video or music) continues playing, but the laptop is completely unresponsive and needs to be force-powered off. I can’t find any reliable way to recreate it.
Note that in all these instances I’m using it through a TB4 dock, so I’m not using the touchpad or built-in keyboard.

Experienced a hard freeze today as well. I was connected to an external display via USB-C and the built-in display was disabled. I was using Firefox and listening to a YouTube video in another tab when the external display briefly went black before working again, but everything was frozen apart from the audio playback. I am using the latest freeworld mesa drivers (24.0.6). Will swap back for now.

Reminder: When you experience a freeze, write down the time.

Then check your logs according to the time stamp.

Example:
journalctl --since "2024-05-16 08:00:00" --until "2024-05-16 10:00:00"

This gives us a starting point for determining where things went wrong.
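
A small sketch that builds the same kind of command from a single noted freeze time, so the window does not have to be worked out by hand. The function name `freeze_window` and the default of 60 minutes either side are assumptions, and it relies on GNU `date` (the `-d` option), which Fedora ships.

```shell
# Print a journalctl command covering N minutes before and after a timestamp.
# $1: freeze time as "YYYY-MM-DD HH:MM:SS" (local time)
# $2: minutes either side of it (defaults to 60)
freeze_window() {
    ts="$1"
    minutes="${2:-60}"
    # Convert to epoch seconds first; this avoids GNU date's ambiguous
    # parsing of relative offsets appended to a full timestamp.
    epoch=$(date -d "$ts" +%s) || return 1
    start=$(date -d "@$((epoch - minutes * 60))" '+%Y-%m-%d %H:%M:%S')
    stop=$(date -d "@$((epoch + minutes * 60))" '+%Y-%m-%d %H:%M:%S')
    printf 'journalctl --since "%s" --until "%s"\n' "$start" "$stop"
}

# Usage: freeze_window "2024-05-16 09:00:00" 60
```

The function only prints the command rather than running it, so the window can be checked or narrowed before querying the journal.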

OK, I’ve finally found a reproducible way to trigger the complete system freeze, and it’s… an amdgpu bug, because of course it is.

Symptoms are always:

  • Any sound playing continues to play
  • Screen goes dark, keyboard is completely unresponsive (e.g. no capslock)
  • EC is up and running, can change keyboard backlight
  • Any open network sockets and hardware still stay up (SSH sessions still established)
  • Can force an emergency sync over SSH but can’t even shut down cleanly, always needs to be hard powered off

The freeze happens to me at least daily, but if I run through a PowerPoint deck with videos I can always get it to trigger within 10 minutes, so unfortunately I need to use my old laptop for any presentations.

It doesn’t seem to matter if I have freeworld or stock Mesa.

Fedora 40 is completely vanilla and up-to-date.

The freeze sometimes just happens even when there’s no video content playing. It’s a matter of time, but having video playing will make it want to come out and play sooner.

It’s always happened with an external display (via a TB4 dock or the HDMI card or a USB-C dongle with HDMI) - I haven’t tested it on just the internal display but will try that tomorrow.
I can’t seem to reproduce this on the internal screen. In both places where the issue can be triggered, at my desk (TB4 dock to monitor) and in the lounge (FW HDMI card to TV), the display is connected via an HDMI port. I wonder if HDMI causes the problem or increases the risk of it happening? I should test via the DP expansion card during the week.

I had dmesg streaming into a serial console to capture the crash (since it never flushes the logs to disk):

[  134.055722] rc rc0: DP-4 as /devices/pci0000:00/0000:00:08.1/0000:c1:00.0/rc/rc0
[  134.055774] input: DP-4 as /devices/pci0000:00/0000:00:08.1/0000:c1:00.0/rc/rc0/input15
[  134.062390] usb 1-1: New USB device found, idVendor=32ac, idProduct=0002, bcdDevice= 0.00
[  134.062398] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  134.062401] usb 1-1: Product: HDMI Expansion Card
[  134.062404] usb 1-1: Manufacturer: Framework
[  134.062407] usb 1-1: SerialNumber: 11AD1D004095401821270B00
[  134.142059] hid-generic 0003:32AC:0002.0005: hiddev96,hidraw0: USB HID v1.11 Device [Framework HDMI Expansion Card] on usb-0000:c1:00.3-1/input1
[  727.853581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=89703, emitted seq=89705
[  727.854033] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
[  727.854374] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[  733.669549] amdgpu 0000:c1:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
[  733.669561] amdgpu 0000:c1:00.0: amdgpu: Failed to disable gfxoff!
[  736.107506] ------------[ cut here ]------------
[  736.107514] WARNING: CPU: 0 PID: 929 at drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn314/dcn314_smu.c:159 dcn314_smu_send_msg_with_param+0x108/0x1b0 [amdgpu]
[  736.108015] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer uhid nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr bnep sunrpc binfmt_misc vfat fat snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_core snd_hda_codec_realtek snd_compress mt7921e snd_hda_codec_generic ac97_bus snd_hda_codec_hdmi intel_rapl_msr mt7921_common snd_pcm_dmaengine snd_hda_intel intel_rapl_common mt792x_lib snd_pci_ps snd_intel_dspcfg mt76_connac_lib cros_ec_lpcs snd_intel_sdw_acpi mt76 edac_mce_amd cros_ec snd_rpl_pci_acp6x snd_hda_codec snd_acp_pci snd_hda_core btusb mac80211 snd_acp_legacy_common snd_hwdep snd_pci_acp6x kvm_amd btrtl snd_seq btintel hid_sensor_als hid_sensor_trigger snd_pci_acp5x btbcm libarc4 snd_seq_device kvm btmtk
[  736.108124]  hid_sensor_iio_common irqbypass bluetooth cfg80211 snd_pcm snd_rn_pci_acp3x snd_acp_config industrialio_triggered_buffer snd_timer snd_soc_acpi snd kfifo_buf industrialio amd_pmf wmi_bmof pcspkr soundcore snd_pci_acp3x thunderbolt rapl rfkill amdtee k10temp i2c_piix4 amd_sfh tee platform_profile amd_pmc joydev loop nfnetlink zram dm_crypt amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched nvme drm_suballoc_helper drm_buddy nvme_core drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni video ucsi_acpi polyval_generic hid_sensor_hub hid_multitouch ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 typec_ucsi ccp sp5100_tco cec typec nvme_auth wmi i2c_hid_acpi i2c_hid serio_raw ip6_tables ip_tables fuse i2c_dev
[  736.108230] CPU: 0 PID: 929 Comm: kworker/u32:16 Not tainted 6.8.9-300.fc40.x86_64 #1
[  736.108235] Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
[  736.108238] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[  736.108252] RIP: 0010:dcn314_smu_send_msg_with_param+0x108/0x1b0 [amdgpu]
[  736.108659] Code: be 93 62 01 00 5d 41 5c 41 5d e9 13 a8 de ff 44 89 ea 48 c7 c6 28 19 3a c1 48 c7 c7 30 d5 ef c0 e8 5d 1e e7 e5 e9 48 ff ff ff <0f> 0b 48 8b 3b b9 80 84 1e 00 44 89 e2 89 ee e8 94 5c df ff eb b5
[  736.108664] RSP: 0018:ffffb52c81667878 EFLAGS: 00010246
[  736.108669] RAX: 0000023b6f69d5e6 RBX: ffff993991873800 RCX: 0000000000000000
[  736.108672] RDX: 0000000000008bfe RSI: 00000000000080ac RDI: 0000023b6f6949e8
[  736.108676] RBP: 000000000000000d R08: 0000000000000000 R09: ffffb52c816677f0
[  736.108678] R10: 0000000000000000 R11: 0000000000000908 R12: 0000000000000000
[  736.108680] R13: 0000000000000000 R14: ffff993981d34488 R15: ffff993b1e380908
[  736.108683] FS:  0000000000000000(0000) GS:ffff9947e1a00000(0000) knlGS:0000000000000000
[  736.108686] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  736.108689] CR2: 00005605866c7cc8 CR3: 0000000da8428000 CR4: 0000000000f50ef0
[  736.108692] PKRU: 55555554
[  736.108694] Call Trace:
[  736.108699]  <TASK>
[  736.108701]  ? dcn314_smu_send_msg_with_param+0x108/0x1b0 [amdgpu]
[  736.109157]  ? __warn+0x81/0x130
[  736.109166]  ? dcn314_smu_send_msg_with_param+0x108/0x1b0 [amdgpu]
[  736.109555]  ? report_bug+0x16f/0x1a0
[  736.109566]  ? handle_bug+0x3c/0x80
[  736.109571]  ? exc_invalid_op+0x17/0x70
[  736.109575]  ? asm_exc_invalid_op+0x1a/0x20
[  736.109585]  ? dcn314_smu_send_msg_with_param+0x108/0x1b0 [amdgpu]
[  736.109965]  ? dcn314_smu_send_msg_with_param+0xae/0x1b0 [amdgpu]
[  736.110307]  link_set_dpms_off+0xfe/0x9d0 [amdgpu]
[  736.110710]  ? srso_alias_return_thunk+0x5/0xfbef5
[  736.110716]  ? generic_reg_set_ex+0xa8/0xf0 [amdgpu]
[  736.111088]  ? srso_alias_return_thunk+0x5/0xfbef5
[  736.111091]  ? optc31_set_drr+0x128/0x1d0 [amdgpu]
[  736.111458]  dcn31_reset_hw_ctx_wrap+0x218/0x440 [amdgpu]
[  736.111843]  dce110_apply_ctx_to_hw+0x4e/0x320 [amdgpu]
[  736.112247]  dc_commit_state_no_check+0x5f3/0x1910 [amdgpu]
[  736.112598]  dc_commit_streams+0x299/0x580 [amdgpu]
[  736.112953]  ? srso_alias_return_thunk+0x5/0xfbef5
[  736.112967]  dm_suspend+0x214/0x270 [amdgpu]
[  736.113352]  amdgpu_device_ip_suspend_phase1+0x9c/0x1a0 [amdgpu]
[  736.113614]  amdgpu_device_ip_suspend+0x29/0x70 [amdgpu]
[  736.113913]  amdgpu_device_pre_asic_reset+0xcd/0x430 [amdgpu]
[  736.114165]  amdgpu_device_gpu_recover+0x442/0xd00 [amdgpu]
[  736.114412]  ? __drm_err+0x7d/0xa0
[  736.114421]  amdgpu_job_timedout+0x187/0x270 [amdgpu]
[  736.114769]  ? __cancel_work_timer+0x103/0x1a0
[  736.114778]  drm_sched_job_timedout+0x73/0x100 [gpu_sched]
[  736.114791]  process_one_work+0x16f/0x330
[  736.114797]  worker_thread+0x273/0x3c0
[  736.114804]  ? __pfx_worker_thread+0x10/0x10
[  736.114808]  kthread+0xe5/0x120
[  736.114813]  ? __pfx_kthread+0x10/0x10
[  736.114818]  ret_from_fork+0x31/0x50
[  736.114824]  ? __pfx_kthread+0x10/0x10
[  736.114828]  ret_from_fork_asm+0x1b/0x30
[  736.114837]  </TASK>
[  736.114840] ---[ end trace 0000000000000000 ]---
[  741.164969] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  741.165339] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  741.311959] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  741.312271] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  741.458918] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  741.459215] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  741.605745] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  741.606038] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  741.752558] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  741.752852] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  741.899463] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  741.899758] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  742.046273] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  742.046580] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  742.193156] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  742.193445] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  742.340028] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[  742.340316] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[  742.342053] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[  748.164756] amdgpu 0000:c1:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
[  748.164762] amdgpu 0000:c1:00.0: amdgpu: Mode2 reset failed!
[  748.164766] amdgpu 0000:c1:00.0: amdgpu: ASIC reset failed with error, -62 for drm dev, 0000:c1:00.0
[  748.164788] amdgpu 0000:c1:00.0: amdgpu: GPU reset(1) failed
[  748.164820] [drm] Skip scheduling IBs!
[  748.164822] amdgpu 0000:c1:00.0: amdgpu: GPU reset end with ret = -62
[  748.164842] [drm] Skip scheduling IBs!
[  748.164844] [drm] Skip scheduling IBs!
[  748.164852] [drm] Skip scheduling IBs!
[  748.164857] [drm] Skip scheduling IBs!
[  748.164860] [drm] Skip scheduling IBs!
[  748.164864] [drm] Skip scheduling IBs!
[  748.164868] [drm] Skip scheduling IBs!
[  748.164874] [drm] Skip scheduling IBs!
[  748.164877] [drm] Skip scheduling IBs!
[  748.164881] [drm] Skip scheduling IBs!
[  748.164884] [drm] Skip scheduling IBs!
[  748.164891] [drm] Skip scheduling IBs!
[  748.164826] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -62
[  748.165366] [drm] Skip scheduling IBs!
[  748.216063] [drm] Skip scheduling IBs!

So basically because of the AMD graphics flakiness it’s still not quite ready to daily drive on Linux :(

I’ll try booting Windows from an expansion card and see if it’s more stable - that should at least let me know if I’ve just happened to lose the silicon lottery or if amdgpu code is still pretty buggy.
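
For anyone comparing their own capture against the trace above, here is a rough sketch that pulls out which GPU ring timed out (`sdma0` in this capture). The function name `timed_out_rings` and the capture file name are hypothetical; the pattern only assumes the `*ERROR* ring <name> timeout` wording seen in the log above.

```shell
# Read kernel log text on stdin and print each distinct amdgpu ring
# that reported a job timeout.
timed_out_rings() {
    sed -n 's/.*\*ERROR\* ring \([^ ]*\) timeout.*/\1/p' | sort -u
}

# Usage (file name is a placeholder for wherever the capture was saved):
#   timed_out_rings < serial-capture.txt
```

Knowing the ring name makes it easier to tell whether two reports are likely the same failure or merely look alike from the outside.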

EDIT: some more system/version info:

sudo dnf info amd-gpu-firmware                                 
Last metadata expiration check: 0:32:05 ago on Sat 18 May 2024 11:05:17.
Installed Packages
Name         : amd-gpu-firmware
Version      : 20240513
Release      : 1.fc40
Architecture : noarch
Size         : 19 M
Source       : linux-firmware-20240513-1.fc40.src.rpm
Repository   : @System
From repo    : updates
Summary      : Firmware for AMD GPUs
URL          : http://www.kernel.org/
License      : Redistributable, no modification permitted
Description  : Firmware for AMD amdgpu and radeon GPUs.

journalctl -b -k --grep amdgpu                                 
May 18 11:19:33 kronk kernel: [drm] amdgpu kernel modesetting enabled.
May 18 11:19:33 kronk kernel: amdgpu: Virtual CRAT table created for CPU
May 18 11:19:33 kronk kernel: amdgpu: Topology: Add CPU node
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: Fetched VBIOS from VFCT
May 18 11:19:33 kronk kernel: amdgpu: ATOM BIOS: 113-PHXGENERIC-001
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_early_init [amdgpu]] JPEG decode is enabled in VM mode
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: vgaarb: deactivate vga console
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: VRAM: 4096M 0x0000008000000000 - 0x00000080FFFFFFFF (4096M used)
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: GART: 512M 0x00007FFF00000000 - 0x00007FFF1FFFFFFF
May 18 11:19:33 kronk kernel: [drm] amdgpu: 4096M of VRAM memory ready
May 18 11:19:33 kronk kernel: [drm] amdgpu: 30033M of GTT memory ready.
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: Will use PSP to load VCN firmware
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: RAS: optional ras ta ucode is not available
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: RAP: optional rap ta ucode is not available
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: SMU is initialized successfully!
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
May 18 11:19:33 kronk kernel: amdgpu: HMM registered 4096MB device memory
May 18 11:19:33 kronk kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
May 18 11:19:33 kronk kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
May 18 11:19:33 kronk kernel: amdgpu: Virtual CRAT table created for GPU
May 18 11:19:33 kronk kernel: amdgpu: Topology: Add dGPU node [0x15bf:0x1002]
May 18 11:19:33 kronk kernel: kfd kfd: amdgpu: added device 1002:15bf
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: SE 1, SH per SE 2, CU per SH 6, active_cu_number 12
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
May 18 11:19:33 kronk kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:c1:00.0 on minor 1
May 18 11:19:33 kronk kernel: fbcon: amdgpudrmfb (fb0) is primary device
May 18 11:19:33 kronk kernel: amdgpu 0000:c1:00.0: [drm] fb0: amdgpudrmfb frame buffer device
May 18 11:19:36 kronk kernel: snd_hda_intel 0000:c1:00.1: bound 0000:c1:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])

After also having no audio output devices listed anymore, I used the recommended pactl commands in the linked article to solve the problem. Thank you!

i can’t help but think there are multiple issues here.

so i don’t have a log i can get right now, but i’ve been log-watching and getting zero info on the resume. the last log entry (according to journalctl -b -1) is the enter-sleep stage. the resume has nothing in the logs at all. i’ve considered an NVMe issue, but i’d expect some errors related to the journal, FS, mount points or the nvme0 device… SOMETHING rather than complete silence.

there’s some mention of AMDGPU, but since i’m on the i5-12 that won’t be it for me.

i completely forgot about this thread despite posting in it recently, but i’ve created my own thread for my issue since it doesn’t really match other owners’ experiences: 12gen hard freeze on resume using KDE

PSA for folks experiencing freezes: please consider booting with a previous version of the kernel. YMMV, but I have experienced freezing issues twice since I’ve owned a Framework, and both times booting with a previous version fixed the issue for me.

As of this note, I get a blank screen with an unresponsive computer on wake-from-sleep frequently with the latest version of the kernel on Fedora (6.8.9-300.fc40). However, I’ve not experienced a freeze running with version 6.8.8-300.fc40.x86_64.

The Grub2 boot menu can be enabled as follows:

sudo grub2-editenv - unset menu_auto_hide

With that, you can choose the last known “good” kernel for your env to give it a shot, and, if it works, hold out until, hopefully, a future kernel update fixes the issue.
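
As an alternative to picking the entry by hand at every boot, a sketch along these lines could make the older kernel the default. It assumes `grubby` is installed (it normally is on Fedora) and that the second-newest installed kernel is the known-good one, which may not hold on every machine. The helper name `previous_kernel` is hypothetical, and `sort -V` requires GNU coreutils.

```shell
# Read kernel version strings on stdin (one per line) and print the
# second-newest one, using version-aware sorting.
previous_kernel() {
    sort -rV | sed -n '2p'
}

# Usage on a Fedora system (not run here):
#   prev=$(rpm -q kernel-core --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' | previous_kernel)
#   sudo grubby --set-default "/boot/vmlinuz-$prev"
```

Remember to set the default back once a fixed kernel arrives, since a pinned default will otherwise keep booting the old one.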

It seems that when you plug in an HDMI device it becomes the default audio output, and if you do not switch the output back before unplugging it, sound will not play on the FW speakers.

EDIT: the problem for me is no longer occurring after latest updates.

There is an issue that affects color accuracy on Fedora 40 with the Framework 13 AMD when using the default power-profiles-daemon. It now decreases color accuracy to save power in the power saver and balanced profiles.

Currently, there is no direct way to disable this feature through the GUI. It may be beneficial to include this information in the Fedora 40 setup guide Fedora40-amd-fw13.md so that users are aware of this feature and know how to disable it if they prefer.

This sounds quite similar to (but not exactly like) my issue:
When I boot F40 on my Framework and have the dock attached while sddm runs, it always hard-freezes the first time. A hard power-off and reboot almost always works. Sounds crazy, but that’s how it behaves.
I tried plugging the dock in first, before power up, and also tried letting it boot to sddm without the dock first, and plugging in the dock just when the login prompt appears. Both show the same behaviour. But I do get kernel crash dumps in journald, always at the exact same spot:

Jul 06 10:15:31 fedora kernel: BUG: kernel NULL pointer dereference, address: 0000000000000920
Jul 06 10:15:31 fedora kernel: #PF: supervisor read access in kernel mode
Jul 06 10:15:31 fedora kernel: #PF: error_code(0x0000) - not-present page
Jul 06 10:15:31 fedora kernel: PGD 0 P4D 0 
Jul 06 10:15:31 fedora kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jul 06 10:15:31 fedora kernel: CPU: 14 PID: 1855 Comm: kwin_wayland Not tainted 6.9.6-200.fc40.x86_64 #1
Jul 06 10:15:31 fedora kernel: Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
Jul 06 10:15:31 fedora kernel: RIP: 0010:is_dsc_need_re_compute+0xef/0x3b0 [amdgpu]
Jul 06 10:15:31 fedora kernel: Code: 04 dc 48 85 c0 74 dd 48 39 50 08 75 d7 48 8b a8 60 04 00 00 48 85 ed 74 cb 48 83 bd f8 11 00 00 00 74 c1 48 8b 85 e8 07 00 00 <80> b8 20 09 00 00 00 75 10 48 8b 85 f0 07 00 00 f6 80 84 02 00 00
Jul 06 10:15:31 fedora kernel: RSP: 0018:ffffa48985257960 EFLAGS: 00010286
Jul 06 10:15:31 fedora kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff90da8d080000
Jul 06 10:15:31 fedora kernel: RDX: ffff90da8bb49800 RSI: 0000000000000001 RDI: ffffa489852579b0
Jul 06 10:15:31 fedora kernel: RBP: ffff90da8a1fe000 R08: ffffa48985257980 R09: 0000000000000000
Jul 06 10:15:31 fedora kernel: R10: ffffa48985257a22 R11: ffff90da9cc43800 R12: ffff90dad1800000
Jul 06 10:15:31 fedora kernel: R13: ffff90daaf38a080 R14: 0000000000000000 R15: 0000000000000003
Jul 06 10:15:31 fedora kernel: FS:  00007f0f1dda5b40(0000) GS:ffff90e1de900000(0000) knlGS:0000000000000000
Jul 06 10:15:31 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 06 10:15:31 fedora kernel: CR2: 0000000000000920 CR3: 000000013041e000 CR4: 0000000000f50ef0
Jul 06 10:15:31 fedora kernel: PKRU: 55555554
Jul 06 10:15:31 fedora kernel: Call Trace:
Jul 06 10:15:31 fedora kernel:  <TASK>
Jul 06 10:15:31 fedora kernel:  ? __die_body.cold+0x19/0x27
Jul 06 10:15:31 fedora kernel:  ? page_fault_oops+0x15a/0x2c0
Jul 06 10:15:31 fedora kernel:  ? exc_page_fault+0x7e/0x180
Jul 06 10:15:31 fedora kernel:  ? asm_exc_page_fault+0x26/0x30
Jul 06 10:15:31 fedora kernel:  ? is_dsc_need_re_compute+0xef/0x3b0 [amdgpu]
Jul 06 10:15:31 fedora kernel:  pre_validate_dsc+0x22a/0x730 [amdgpu]
Jul 06 10:15:31 fedora kernel:  ? dm_update_crtc_state+0x420/0x810 [amdgpu]
Jul 06 10:15:31 fedora kernel:  amdgpu_dm_atomic_check+0x8f0/0x1570 [amdgpu]
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  ? __ww_mutex_lock.constprop.0+0x5b/0x9a0
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  drm_atomic_check_only+0x633/0xae0
Jul 06 10:15:31 fedora kernel:  drm_mode_atomic_ioctl+0x853/0xd20
Jul 06 10:15:31 fedora kernel:  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Jul 06 10:15:31 fedora kernel:  drm_ioctl_kernel+0xb0/0x100
Jul 06 10:15:31 fedora kernel:  drm_ioctl+0x28b/0x540
Jul 06 10:15:31 fedora kernel:  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Jul 06 10:15:31 fedora kernel:  amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
Jul 06 10:15:31 fedora kernel:  __x64_sys_ioctl+0x94/0xd0
Jul 06 10:15:31 fedora kernel:  do_syscall_64+0x82/0x160
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  ? syscall_exit_to_user_mode+0x75/0x230
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  ? do_syscall_64+0x8e/0x160
Jul 06 10:15:31 fedora kernel:  ? do_syscall_64+0x8e/0x160
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  ? syscall_exit_to_user_mode+0x75/0x230
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  ? do_syscall_64+0x8e/0x160
Jul 06 10:15:31 fedora kernel:  ? do_syscall_64+0x8e/0x160
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Jul 06 10:15:31 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Jul 06 10:15:31 fedora kernel: RIP: 0033:0x7f0f23f26d5d
Jul 06 10:15:31 fedora kernel: Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
Jul 06 10:15:31 fedora kernel: RSP: 002b:00007ffde49ff3e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jul 06 10:15:31 fedora kernel: RAX: ffffffffffffffda RBX: 000055e15a03de60 RCX: 00007f0f23f26d5d
Jul 06 10:15:31 fedora kernel: RDX: 00007ffde49ff4d0 RSI: 00000000c03864bc RDI: 0000000000000011
Jul 06 10:15:31 fedora kernel: RBP: 00007ffde49ff430 R08: 000055e15a03e088 R09: 0000000000000007
Jul 06 10:15:31 fedora kernel: R10: 000055e15a03e010 R11: 0000000000000246 R12: 00007ffde49ff4d0
Jul 06 10:15:31 fedora kernel: R13: 00000000c03864bc R14: 0000000000000011 R15: 0000000000000004
Jul 06 10:15:31 fedora kernel:  </TASK>
Jul 06 10:15:31 fedora kernel: Modules linked in: snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi mc r8153_ecm cdc_ether usbnet r8152 mii rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 n>
Jul 06 10:15:31 fedora kernel:  snd_rpl_pci_acp6x mac80211 snd_acp_pci snd_hda_core snd_acp_legacy_common cros_ec_lpcs snd_pci_acp6x snd_hwdep cros_ec snd_seq snd_seq_device hid_sensor_als libarc4 hid_sensor_trigger snd_pci_acp5x snd_pcm rapl hid_sensor_iio_common industrialio_triggered_buffer wmi_bmof cfg80211 k>
Jul 06 10:15:31 fedora kernel: CR2: 0000000000000920
Jul 06 10:15:31 fedora kernel: ---[ end trace 0000000000000000 ]---

I have multiple of these dumps and all of them crash in is_dsc_need_re_compute+0xef/0x3b0. There have been some additional fixes to that function in 6.10 that could (in my basic understanding of C) prevent null dereferences. I’ll likely wait for that to hit F40 and see if it fixes the issue.
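
A quick way to check that several dumps really fault at the same spot is to extract the kernel-mode RIP symbol from each one. This is only a sketch: the function name `oops_rip_symbols` is hypothetical, and it assumes the `RIP: 0010:symbol+offset/size` format shown in the dump above (the `0033` user-space RIP line is deliberately skipped).

```shell
# Read kernel log text on stdin and print each distinct
# "symbol+offset" found on kernel-mode RIP lines.
oops_rip_symbols() {
    sed -n 's/.*RIP: 0010:\([^ /]*\)\/.*/\1/p' | sort -u
}

# Usage (previous boot's kernel messages):
#   journalctl -k -b -1 | oops_rip_symbols
```

If every dump yields the same single line, that supports treating them as one bug rather than several.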

I have the same issue. Unplugging the dock, rebooting after an update, and then plugging it back in… just works. Leave it plugged in and you run into an issue where it boots slowly, hits the display manager and never gets to the graphical target. I ran into this same problem months ago, and it ended up being one of the xdg packages.

Nearly 2 months later and still getting hard unrecoverable freezes under Linux:

2024-07-10T04:41:02.084839Z     [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_unified_0 timeout, signaled seq=792, emitted seq=794
2024-07-10T04:41:02.085072Z     [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 5302 thread firefox-bi:cs0 pid 12020
2024-07-10T04:41:02.085097Z     amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
2024-07-10T04:41:08.333876Z     amdgpu 0000:c1:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
2024-07-10T04:41:08.334688Z     amdgpu 0000:c1:00.0: amdgpu: Failed to disable gfxoff!
2024-07-10T04:41:08.926855Z     [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
2024-07-10T04:41:09.223852Z     [drm] Register(0) [regUVD_RB_RPTR] failed to reach value 0x00000000 != 0x00000380n
2024-07-10T04:41:10.718284Z     [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
2024-07-10T04:41:10.718404Z     ------------[ cut here ]------------
2024-07-10T04:41:10.718426Z     WARNING: CPU: 6 PID: 34205 at drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn314/dcn314_smu.c:159 dcn314_smu_send_msg_with_param+0x108/0x190 [amdgpu]
2024-07-10T04:41:10.718453Z     Modules linked in: hid_logitech_hidpp snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi hid_logitech_dj mc r8153_ecm cdc_ether usbnet r8152 mii vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd virtiofs tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nf_conntrack_tftp bridge stp llc uinput rfcomm snd_seq_dummy snd_hrtimer uhid nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr bnep sunrpc binfmt_misc vfat fat snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_amd_sdw_acpi snd_hda_codec_realtek soundwire_amd soundwire_generic_allocation snd_hda_codec_generic soundwire_bus snd_hda_scodec_component snd_hda_codec_hdmi snd_soc_core btusb intel_rapl_msr mt7921e btrtl amd_atl
2024-07-10T04:41:10.718532Z      btintel mt7921_common intel_rapl_common snd_hda_intel btbcm snd_compress mt792x_lib snd_intel_dspcfg btmtk edac_mce_amd ac97_bus snd_intel_sdw_acpi mt76_connac_lib snd_pcm_dmaengine snd_hda_codec bluetooth snd_rpl_pci_acp6x mt76 snd_acp_pci kvm_amd snd_hda_core cros_ec_lpcs uas snd_acp_legacy_common usb_storage cros_ec snd_pci_acp6x snd_hwdep hid_sensor_als kvm mac80211 snd_seq hid_sensor_trigger wmi_bmof snd_seq_device hid_sensor_iio_common rapl industrialio_triggered_buffer kfifo_buf libarc4 pcspkr snd_pcm industrialio snd_pci_acp5x snd_rn_pci_acp3x cfg80211 snd_acp_config snd_timer snd_soc_acpi snd amd_pmf thunderbolt soundcore snd_pci_acp3x amdtee k10temp amd_sfh rfkill i2c_piix4 tee platform_profile amd_pmc joydev loop nfnetlink zram dm_crypt amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul drm_exec crc32_pclmul crc32c_intel polyval_clmulni gpu_sched polyval_generic nvme drm_suballoc_helper drm_buddy nvme_core ghash_clmulni_intel drm_display_helper sha512_ssse3 hid_multitouch video
2024-07-10T04:41:10.718564Z      sha256_ssse3 ucsi_acpi hid_sensor_hub ccp cec typec_ucsi sha1_ssse3 nvme_auth sp5100_tco typec wmi i2c_hid_acpi i2c_hid serio_raw ip6_tables ip_tables fuse i2c_dev
2024-07-10T04:41:10.718583Z     CPU: 6 PID: 34205 Comm: kworker/u64:2 Not tainted 6.9.7-200.fc40.x86_64 #1
2024-07-10T04:41:10.718606Z     Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
2024-07-10T04:41:10.718623Z     Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
2024-07-10T04:41:10.718937Z     RIP: 0010:dcn314_smu_send_msg_with_param+0x108/0x190 [amdgpu]
2024-07-10T04:41:10.718978Z     Code: be 93 62 01 00 5d 41 5c 41 5d e9 b3 7c de ff 44 89 ea 48 c7 c6 08 c5 3f c1 48 c7 c7 c0 aa f2 c0 e8 4d d3 e7 e4 e9 48 ff ff ff <0f> 0b 48 8b 3b b9 80 84 1e 00 44 89 e2 89 ee e8 74 30 df ff eb b5
2024-07-10T04:41:10.718999Z     RSP: 0018:ffffb4e40117f8b8 EFLAGS: 00010246
2024-07-10T04:41:10.719022Z     RAX: 0000b26e6a2c8b6b RBX: ffff9b3fc5bec400 RCX: 0000000000000006
2024-07-10T04:41:10.719044Z     RDX: 0000000000008ad5 RSI: 00000000000080a9 RDI: 0000b26e6a2c0096
2024-07-10T04:41:10.719062Z     RBP: 000000000000000d R08: 0000000000000000 R09: ffffb4e40117f830
2024-07-10T04:41:10.719079Z     R10: 0000000000000000 R11: 0000000000010000 R12: 0000000000000000
2024-07-10T04:41:10.719096Z     R13: 0000000000000000 R14: ffff9b3fd0049ff8 R15: ffff9b4446200908
2024-07-10T04:41:10.719113Z     FS:  0000000000000000(0000) GS:ffff9b4e21d00000(0000) knlGS:0000000000000000
2024-07-10T04:41:10.719131Z     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2024-07-10T04:41:10.719148Z     CR2: 00007f1b00de1000 CR3: 0000000afe428000 CR4: 0000000000f50ef0
2024-07-10T04:41:10.719165Z     PKRU: 55555554
2024-07-10T04:41:10.719182Z     Call Trace:
2024-07-10T04:41:10.719199Z      <TASK>
2024-07-10T04:41:10.720108Z      ? dcn314_smu_send_msg_with_param+0x108/0x190 [amdgpu]
2024-07-10T04:41:10.72015Z       ? __warn.cold+0x8e/0xe8
2024-07-10T04:41:10.720176Z      ? dcn314_smu_send_msg_with_param+0x108/0x190 [amdgpu]
2024-07-10T04:41:10.720207Z      ? handle_bug+0x3c/0x80
2024-07-10T04:41:10.720223Z      ? exc_invalid_op+0x17/0x70
2024-07-10T04:41:10.72024Z       ? asm_exc_invalid_op+0x1a/0x20
2024-07-10T04:41:10.720256Z      ? dcn314_smu_send_msg_with_param+0x108/0x190 [amdgpu]
2024-07-10T04:41:10.720843Z      ? dcn314_smu_send_msg_with_param+0xae/0x190 [amdgpu]
2024-07-10T04:41:10.720885Z      link_set_dpms_off+0xfe/0x980 [amdgpu]
2024-07-10T04:41:10.720904Z      ? srso_alias_return_thunk+0x5/0xfbef5
2024-07-10T04:41:10.72194Z       ? generic_reg_set_ex+0xa8/0xf0 [amdgpu]
2024-07-10T04:41:10.721986Z      ? srso_alias_return_thunk+0x5/0xfbef5
2024-07-10T04:41:10.722001Z      ? optc31_set_drr+0x128/0x1d0 [amdgpu]
2024-07-10T04:41:10.722019Z      dcn31_reset_hw_ctx_wrap+0x218/0x440 [amdgpu]
2024-07-10T04:41:10.722971Z      dce110_apply_ctx_to_hw+0x4e/0x320 [amdgpu]
2024-07-10T04:41:10.723012Z      dc_commit_state_no_check+0x618/0x1960 [amdgpu]
2024-07-10T04:41:10.723032Z      dc_commit_streams+0x299/0x5b0 [amdgpu]
2024-07-10T04:41:10.72305Z       ? srso_alias_return_thunk+0x5/0xfbef5
2024-07-10T04:41:10.723851Z      dm_suspend+0x214/0x270 [amdgpu]
2024-07-10T04:41:10.723887Z      amdgpu_device_ip_suspend_phase1+0x9a/0x180 [amdgpu]
2024-07-10T04:41:10.723908Z      amdgpu_device_ip_suspend+0x29/0x70 [amdgpu]
2024-07-10T04:41:10.724843Z      amdgpu_device_pre_asic_reset+0xcd/0x420 [amdgpu]
2024-07-10T04:41:10.724884Z      amdgpu_device_gpu_recover.cold+0x475/0xb44 [amdgpu]
2024-07-10T04:41:10.724908Z      amdgpu_job_timedout+0x18e/0x1d0 [amdgpu]
2024-07-10T04:41:10.724927Z      drm_sched_job_timedout+0x73/0x100 [gpu_sched]
2024-07-10T04:41:10.724991Z      process_one_work+0x186/0x340
2024-07-10T04:41:10.725039Z      worker_thread+0x278/0x3b0
2024-07-10T04:41:10.725067Z      ? __pfx_worker_thread+0x10/0x10
2024-07-10T04:41:10.725092Z      kthread+0xcf/0x100
2024-07-10T04:41:10.725117Z      ? __pfx_kthread+0x10/0x10
2024-07-10T04:41:10.725142Z      ret_from_fork+0x31/0x50
2024-07-10T04:41:10.725166Z      ? __pfx_kthread+0x10/0x10
2024-07-10T04:41:10.725185Z      ret_from_fork_asm+0x1a/0x30
2024-07-10T04:41:10.725209Z      </TASK>
2024-07-10T04:41:10.725234Z     ---[ end trace 0000000000000000 ]---
2024-07-10T04:41:10.823897Z     [drm] DMUB HPD IRQ callback: link_index=7
2024-07-10T04:41:11.811864Z     [drm] DMUB HPD IRQ callback: link_index=7
2024-07-10T04:41:15.468888Z     amdgpu 0000:c1:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
2024-07-10T04:41:15.469606Z     amdgpu 0000:c1:00.0: amdgpu: Failed to power gate VCN!
2024-07-10T04:41:15.46985Z      [drm:vcn_v4_0_stop [amdgpu]] *ERROR* Dpm disable uvd failed, ret = -62. 
2024-07-10T04:41:18.156869Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:18.157041Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:18.29886Z      [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:18.298999Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:18.440978Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:18.441112Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:18.583856Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:18.583956Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:18.725833Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:18.725946Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:18.867833Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:18.867899Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:19.010147Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:19.010275Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:19.152852Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:19.152909Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:19.29485Z      [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-07-10T04:41:19.294916Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-07-10T04:41:19.296847Z     amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
2024-07-10T04:41:23.28689Z      ACPI Error: Aborting method \_SB.A018 due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
2024-07-10T04:41:23.287222Z     ACPI Error: Aborting method \_SB.ALIB due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
2024-07-10T04:41:23.287842Z     ACPI Error: Aborting method \_SB.PCI0.GP19.NHI0.PPS3 due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
2024-07-10T04:41:23.2879Z       ACPI Error: Aborting method \_SB.PCI0.GP19.NHI0._PS3 due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
2024-07-10T04:41:24.788851Z     amdgpu 0000:c1:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
2024-07-10T04:41:24.789484Z     amdgpu 0000:c1:00.0: amdgpu: Mode2 reset failed!
2024-07-10T04:41:24.789724Z     amdgpu 0000:c1:00.0: amdgpu: ASIC reset failed with error, -62 for drm dev, 0000:c1:00.0
2024-07-10T04:41:24.790172Z     amdgpu 0000:c1:00.0: amdgpu: GPU reset(1) failed
2024-07-10T04:41:24.790546Z     amdgpu 0000:c1:00.0: amdgpu: GPU reset end with ret = -62
2024-07-10T04:41:24.790898Z     [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -62

I’ve set up a logging server since none of this ever persists in the journal once it locks up, but existing networking and processes (like sound in the background) seem to be OK.

Other notes:

  • Is extremely reproducible if I let a LibreOffice presentation run on a loop and have hardware acceleration enabled
  • Never seems to happen under load - only if using the machine lightly, and almost always if it’s Firefox or LibreOffice
  • Seems to be completely independent of the power profile that’s set, even if I set the performance level of the GPU directly
  • Does not need any external display attached, and doesn’t seem to matter what expansion cards I use
  • Displays will either lock up completely and show the last static image, or just go black
  • Ctrl+Alt+F# does not function - even capslock doesn’t light up - the thing is mostly dead
  • Windows is completely unaffected - if I’m doing something important I’ve switched to using that for the time being

Is there anything at all I can do to further diagnose this? Should I raise a support case?


Do you have a site or (sharable) document that reproduces?

I would suggest reporting this upstream to the Linux AMDGPU issue tracker (drm/amd on GitLab), but it would also be good to see whether I or someone else can reproduce it.
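For an upstream report, a condensed summary of the error lines can sit alongside the full log. Here is a rough Python sketch of one way to do that. It assumes each captured line has the "timestamp, whitespace, kernel message" layout of the log posted above. The function name `summarize_errors` and the sample list are made up for illustration and are not part of any amdgpu or journald tooling.

```python
import re
from collections import Counter

# Assumption: each captured line looks like "<ISO timestamp>Z   <kernel message>",
# matching the remote-logged excerpt in this thread.
LINE_RE = re.compile(r"^(\S+Z)\s+(.*)$")
ERROR_RE = re.compile(r"\*ERROR\*\s*(.*)$")


def summarize_errors(lines):
    """Count distinct '*ERROR*' messages and record when each first appeared.

    Returns a list of (first_timestamp, count, message), most frequent first.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines or lines without a timestamp
        timestamp, message = match.groups()
        err = ERROR_RE.search(message)
        if not err:
            continue  # not a DRM '*ERROR*' line
        text = err.group(1).strip()
        counts[text] += 1
        first_seen.setdefault(text, timestamp)
    return [(first_seen[text], n, text) for text, n in counts.most_common()]


# A few lines copied from the log above, used only as sample input.
sample = [
    "2024-07-10T04:41:18.156869Z     [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3",
    "2024-07-10T04:41:18.157041Z     [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue",
    "2024-07-10T04:41:18.29886Z      [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3",
    "2024-07-10T04:41:19.296847Z     amdgpu 0000:c1:00.0: amdgpu: MODE2 reset",
]

if __name__ == "__main__":
    for first, count, text in summarize_errors(sample):
        print(f"{count:3d}x  first at {first}  {text}")
```

This only matches lines tagged `*ERROR*`, so the plain `amdgpu:` messages (such as the Mode2 reset failure) would need their own pattern if you want them in the summary too.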

Turns out I was somehow missing the command-line tool used by the Fedora script to change brightness. I installed it via DNF and the keys work again!

Hey, I have the same issue. Can I know which command-line tool you are referring to?

dnf install light

I don’t know if they simply changed tools between 39 and 40, but I don’t recall changing anything manually.

Name         : light
Version      : 1.2.2
Release      : 11.fc40
Architecture : x86_64
Size         : 81 k
Source       : light-1.2.2-11.fc40.src.rpm
Repository   : @System
From repo    : fedora
Summary      : Control backlight controllers
URL          : http://haikarainen.github.io/light/
License      : GPL-3.0-only
Description  : Light is a program to control backlight controllers under GNU/Linux,
             : it is the successor of lightscript, which was a bash script
             : with the same purpose, and tries to maintain the same functionality.
             :
             : Features
             :
             : - Works excellent where other software have been proven unusable
             :   or problematic, thanks to how it operates internally
             :   and the fact that it does not rely on X.
             : - Can automatically figure out the best controller to use,
             :   making full use of underlying hardware.
             : - Possibility to set a minimum brightness value, as some controllers
             :   set the screen to be pitch black at a value of 0 (or higher).
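
If the brightness keys stop working after an upgrade, it may be worth checking first whether the helper binary is on the PATH at all. Here is a small Python sketch of that check. The function name `missing_tools` is made up for illustration, and whether your desktop's brightness script actually calls `light` is an assumption to verify on your own setup.

```python
import shutil


def missing_tools(names):
    """Return the command names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "light" is the package named in this thread; other setups may rely on
    # a different brightness helper.
    for tool in missing_tools(["light"]):
        print(f"{tool} not found on PATH; try: sudo dnf install {tool}")
```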
