FW13 AMD UI freeze

The UI on my FW13 AMD started freezing this week, not during wakeup but during normal use.

Sometimes the journal has a DMCUB error, sometimes not.

A bunch of:
amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
amdgpu 0000:c1:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Again, a bunch of:
amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
amdgpu 0000:c1:00.0: [drm] *ERROR* flip_done timed out
------------[ cut here ]------------
kernel: WARNING: CPU: 6 PID: 1488 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9205 amdgpu_dm_atomic_commit_tail+0x392f/0x3a00 [amdgpu]
kernel: Modules linked in: snd_seq_midi snd_seq_dummy snd_seq_midi_event snd_seq ixgbe xfrm_algo mdio_devres libphy mdio dca sd_mod scsi_mod scsi_common uhid usbhid ipmi_devintf ipmi_msgh>
kernel:  gf128mul libarc4 snd_rpl_pci_acp6x hid_sensor_iio_common snd_seq_device snd_hwdep snd_pci_acp6x crypto_simd industrialio_triggered_buffer snd_pcm videobuf2_common amd_pmf cryptd >
kernel:  drm_ttm_helper xhci_pci hid_generic xhci_hcd ttm i2c_hid_acpi i2c_hid cros_ec_dev drm_kms_helper hid nvme cros_ec_lpcs usbcore cros_ec thunderbolt nvme_core crc32_pclmul drm crc3>
kernel: CPU: 6 UID: 0 PID: 1488 Comm: Xorg Not tainted 6.12.13-amd64 #1  Debian 6.12.13-1
kernel: Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x392f/0x3a00 [amdgpu]
kernel: Code: d8 60 50 c1 e8 42 ea 86 ff e9 20 fe ff ff 49 8d 87 40 31 04 00 c6 85 38 fe ff ff 00 48 89 85 48 fe ff ff e9 f6 cc ff ff 0f 0b <0f> 0b e9 64 f3 ff ff 0f 0b e9 28 cd ff ff 0f >
kernel: RSP: 0018:ffffb0744298f798 EFLAGS: 00010002
kernel: RAX: 0000000000000286 RBX: 0000000000000286 RCX: ffff8b7e70fe0118
kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff8b7e79500178
kernel: RBP: ffffb0744298f9e0 R08: ffffb0744298f684 R09: 0000000000000000
kernel: R10: ffffb0744298f6f0 R11: ffffb0744298f6f4 R12: 0000000000000002
kernel: R13: 0000000000000000 R14: ffff8b8241256c00 R15: ffff8b7e70fe0000
kernel: FS:  00007f76e405eb00(0000) GS:ffff8b9481d00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f8b25cde6d0 CR3: 000000012934e000 CR4: 0000000000f50ef0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel:  <TASK>
kernel:  ? amdgpu_dm_atomic_commit_tail+0x392f/0x3a00 [amdgpu]
kernel:  ? __warn.cold+0x93/0xf6
kernel:  ? amdgpu_dm_atomic_commit_tail+0x392f/0x3a00 [amdgpu]
kernel:  ? report_bug+0xff/0x140
kernel:  ? handle_bug+0x58/0x90
kernel:  ? exc_invalid_op+0x17/0x70
kernel:  ? asm_exc_invalid_op+0x1a/0x20
kernel:  ? amdgpu_dm_atomic_commit_tail+0x392f/0x3a00 [amdgpu]
kernel:  ? amdgpu_dm_atomic_commit_tail+0x2c87/0x3a00 [amdgpu]
kernel:  commit_tail+0x91/0x130 [drm_kms_helper]
kernel:  drm_atomic_helper_commit+0x11a/0x140 [drm_kms_helper]
kernel:  drm_atomic_commit+0xa6/0xe0 [drm]
kernel:  ? __pfx___drm_printfn_info+0x10/0x10 [drm]
kernel:  drm_atomic_helper_set_config+0x74/0xb0 [drm_kms_helper]
kernel:  drm_mode_setcrtc+0x46c/0x8a0 [drm]
kernel:  ? __pfx_drm_mode_setcrtc+0x10/0x10 [drm]
kernel:  drm_ioctl_kernel+0xad/0x100 [drm]
kernel:  drm_ioctl+0x277/0x4f0 [drm]
kernel:  ? __pfx_drm_mode_setcrtc+0x10/0x10 [drm]
kernel:  amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
kernel:  __x64_sys_ioctl+0x91/0xd0
kernel:  do_syscall_64+0x82/0x190
kernel:  ? vfs_write+0x311/0x450
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? vfs_write+0x311/0x450
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? syscall_exit_to_user_mode+0x164/0x210
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? do_syscall_64+0x8e/0x190
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? amdgpu_drm_ioctl+0x6e/0x80 [amdgpu]
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? syscall_exit_to_user_mode+0x164/0x210
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? do_syscall_64+0x8e/0x190
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  ? do_syscall_64+0x8e/0x190
kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
kernel: RIP: 0033:0x7f76e43fe37b
kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 >
kernel: RSP: 002b:00007ffc853ff2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
kernel: RAX: ffffffffffffffda RBX: 0000560a1af48ea0 RCX: 00007f76e43fe37b
kernel: RDX: 00007ffc853ff370 RSI: 00000000c06864a2 RDI: 000000000000000f
kernel: RBP: 00007ffc853ff370 R08: 0000000000000000 R09: 0000560a1e1d9e50
kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c06864a2
kernel: R13: 000000000000000f R14: 0000560a1a0cab60 R15: 0000560a1a3fbdd0
kernel:  </TASK>
---[ end trace 0000000000000000 ]---
Then the same trace for PID 1488 (Xorg) repeats.

System:
  • BIOS 3.05
  • Debian testing/unstable
  • kernel 6.12.13-amd64
  • Xfce 4.20.1 on Xorg, no Wayland

Running sudo cat /sys/kernel/debug/dri/0/amdgpu_gpu_recover always recovers the display, but I have to do it over ssh.
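
A rough way to check whether a given freeze logged the display errors above is to count them in the current boot's kernel log. This is only a sketch: count_display_errors is a hypothetical helper name, and the pattern matches just the two message strings quoted in this post.

```shell
# Hypothetical helper: count lines on stdin that contain either of the
# two amdgpu messages quoted above.
count_display_errors() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the helper from failing on 0.
    grep -cE 'DMCUB error|flip_done timed out' || true
}

# Kernel messages from the current boot. Reading them may need root or
# membership in a group that is allowed to read the system journal.
{ journalctl -k -b --no-pager 2>/dev/null || true; } | count_display_errors
```

Since the local display is frozen at that point, this can be run in the same ssh session used for the recovery.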


Additional info, may not be related: I bought a USB DAC this week and had to edit /etc/pulse/daemon.conf to get it working at 192 kS/s, 24/32-bit.

Sounds similar to this issue: Fedora KDE becomes suddenly slow (despite the title, it’s not specific to Fedora or KDE).

If you’re running into the same problem, the workaround is to boot with the kernel parameter amdgpu.dcdebugmask=0x10, which disables panel self-refresh (PSR).
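
For anyone trying this, here is a minimal sketch for checking whether the parameter is active on the running kernel. has_psr_workaround is a hypothetical helper name, and the GRUB steps in the comments assume a stock Debian setup with GRUB as the boot loader; other boot loaders take the parameter elsewhere.

```shell
# To make the parameter persistent on Debian with GRUB (assumption):
#   1. In /etc/default/grub, append amdgpu.dcdebugmask=0x10 to the
#      GRUB_CMDLINE_LINUX_DEFAULT="..." value.
#   2. Run: sudo update-grub
#   3. Reboot.

# Hypothetical helper: succeeds if the given kernel command line contains
# exactly amdgpu.dcdebugmask=0x10 as one of its space-separated words.
# It does not recognize other mask values that also include that bit.
has_psr_workaround() {
    case " $1 " in
        *" amdgpu.dcdebugmask=0x10 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line the running kernel was booted with.
if has_psr_workaround "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "amdgpu.dcdebugmask=0x10 is active"
else
    echo "amdgpu.dcdebugmask=0x10 is not set"
fi
```

If the check reports the parameter as active and the freezes still occur, PSR is probably not the cause.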
