Just an update to my earlier post: I had a second crash today, this time while just looking at a static web page with no video. Unlike the first time, there was no warning (stutter etc.); the screen went straight to black, after which the session died and dumped me to the login screen.
So the firmware update that I suggested may not help, or at least it does not completely solve the issue on its own. Maybe it did not mitigate the issue at all and I just got lucky over the last 15 days (I had a crash sample size of one, after all, so it’s hard to say).
Attaching the relevant dmesg dumps in case they are of any use. Since it has happened so rarely so far, I suppose I’ll just keep updating the kernel, firmware, and Mesa, and hope it eventually goes away as the drivers mature.
dmesg after second crash (straight black screen and return to login screen)
[1313482.367110] [drm:gfx_v11_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access in command stream
[1313482.377254] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=162105800, emitted seq=162105801
[1313482.377455] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process xfwm4 pid 1631 thread xfwm4:cs0 pid 1662
[1313482.377614] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[1313482.639824] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313482.639979] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313482.770442] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313482.770588] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313482.900729] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313482.900881] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.031316] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313483.031471] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.161527] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313483.161681] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.292560] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313483.292697] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.423550] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313483.423684] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.553906] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313483.554074] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.687542] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[1313483.687803] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[1313483.918409] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[1313483.919967] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[1313483.944241] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[1313483.944795] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[1313483.944948] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[1313483.946104] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[1313483.947502] [drm] DMUB hardware initialized: version=0x08001B00
[1313484.344669] [drm] kiq ring mec 3 pipe 1 q 0
[1313484.346414] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[1313484.346606] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[1313484.347231] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[1313484.347234] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[1313484.347236] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[1313484.347238] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[1313484.347239] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[1313484.347241] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[1313484.347243] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[1313484.347244] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[1313484.347246] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[1313484.347248] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[1313484.347250] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[1313484.347251] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[1313484.347253] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[1313484.348748] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[1313484.348751] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[1313484.348782] [drm] Skip scheduling IBs!
[1313484.348810] [drm] Skip scheduling IBs!
[1313484.348817] [drm] Skip scheduling IBs!
... repeats ~80x
[1313484.348929] [drm] Skip scheduling IBs!
[1313484.348931] [drm] Skip scheduling IBs!
[1313484.348933] [drm] Skip scheduling IBs!
[1313484.349714] [drm] ring gfx_32787.1.1 was added
[1313484.350461] [drm] ring compute_32787.2.2 was added
[1313484.351227] [drm] ring sdma_32787.3.3 was added
[1313484.351252] [drm] ring gfx_32787.1.1 ib test pass
[1313484.351276] [drm] ring compute_32787.2.2 ib test pass
[1313484.351429] [drm] ring sdma_32787.3.3 ib test pass
[1313484.352642] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
[1313485.445108] [drm] Skip scheduling IBs!
[1313485.445121] [drm] Skip scheduling IBs!
[1313485.445125] [drm] Skip scheduling IBs!
... repeats ~200x
[1313485.446705] [drm] Skip scheduling IBs!
[1313485.446709] [drm] Skip scheduling IBs!
[1313485.446718] [drm] Skip scheduling IBs!
[1313486.222038] traps: xfsettingsd[3526186] trap int3 ip:7f28796c77c7 sp:7fffd2e6a2b0 error:0 in libglib-2.0.so.0.7400.6[7f2879687000+8d000]
[1313486.356385] Lockdown: Xorg: raw io port access is restricted; see man kernel_lockdown.7
[1313499.678933] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[1313499.706625] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[1313499.707161] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
dmesg after first crash (stutter into brief black screens into perma black screen)
[ 6822.552580] [drm:gfx_v11_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access in command stream
[ 6822.562833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=924088, emitted seq=924090
[ 6822.562990] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox-bin pid 3289 thread firefox:cs0 pid 3768
[ 6822.563082] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 6822.569905] amdgpu_cs_ioctl: 18 callbacks suppressed
[ 6822.569908] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 6822.819497] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6822.819664] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6822.948409] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6822.948516] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.077224] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.077351] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.206148] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.206264] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.334968] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.335083] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.463890] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.464010] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.592763] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.592896] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.721666] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.721785] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6823.850546] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6823.850663] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6824.057790] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 6824.059299] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 6824.068324] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 6824.068774] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[ 6824.069000] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 6824.070786] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 6824.072379] [drm] DMUB hardware initialized: version=0x08000500
[ 6824.489396] [drm] kiq ring mec 3 pipe 1 q 0
[ 6824.492726] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 6824.492995] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 6824.494106] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 6824.494109] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 6824.494110] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 6824.494111] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 6824.494112] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 6824.494114] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 6824.494115] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 6824.494116] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 6824.494117] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 6824.494118] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 6824.494119] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 6824.494120] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 6824.494121] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 6824.496972] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 6824.496973] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 6824.497014] [drm] Skip scheduling IBs!
[ 6824.497173] [drm] Skip scheduling IBs!
[ 6824.497181] [drm] Skip scheduling IBs!
[ 6824.497187] [drm] Skip scheduling IBs!
[ 6824.497191] [drm] Skip scheduling IBs!
[ 6824.500699] [drm] Skip scheduling IBs!
[ 6824.502666] [drm] ring gfx_32781.1.1 was added
[ 6824.505418] [drm] ring compute_32781.2.2 was added
[ 6824.507997] [drm] ring sdma_32781.3.3 was added
[ 6824.508731] [drm] Skip scheduling IBs!
[ 6824.510465] [drm] Skip scheduling IBs!
[ 6824.523719] [drm] Skip scheduling IBs!
[ 6824.523735] [drm] Skip scheduling IBs!
[ 6824.523748] [drm] Skip scheduling IBs!
[ 6824.523760] [drm] Skip scheduling IBs!
[ 6824.523770] [drm] Skip scheduling IBs!
[ 6824.523778] [drm] Skip scheduling IBs!
[ 6824.523785] [drm] Skip scheduling IBs!
[ 6824.523793] [drm] Skip scheduling IBs!
[ 6824.526411] [drm] Skip scheduling IBs!
[ 6824.526417] [drm] Skip scheduling IBs!
[ 6824.526420] [drm] Skip scheduling IBs!
[ 6824.736834] amdgpu 0000:c1:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx_32781.1.1 test failed (-110)
[ 6824.864021] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6824.864136] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] *ERROR* failed to remove hardware queue, queue id = 1
[ 6824.991055] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6824.991169] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] *ERROR* failed to remove hardware queue, queue id = 2
[ 6825.118460] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6825.118590] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] *ERROR* failed to remove hardware queue, queue id = 3
[ 6825.118994] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
[ 6825.282914] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[ 6825.283033] [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait
[ 6825.409965] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[ 6825.410092] [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait
[ 6826.741017] amdgpu 0000:c1:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 6826.741024] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 6826.741026] amdgpu 0000:c1:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000B3B
[ 6826.741028] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 6826.741030] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x1
[ 6826.741031] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x5
[ 6826.741032] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 6826.741033] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 6826.741034] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 6826.741038] amdgpu 0000:c1:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 6826.741040] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 6826.741042] amdgpu 0000:c1:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 6826.741043] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 6826.741044] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x0
[ 6826.741045] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 6826.741046] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 6826.741047] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 6826.741048] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 6826.741051] amdgpu 0000:c1:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 6826.741052] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 6826.741054] amdgpu 0000:c1:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 6826.741055] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 6826.741056] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x0
[ 6826.741057] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 6826.741058] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 6826.741059] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 6826.741060] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 6834.651357] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=924112, emitted seq=924114
[ 6834.651481] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1004 thread Xorg:cs0 pid 1014
[ 6834.651570] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 6834.904633] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6834.904767] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.033447] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.033602] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.162310] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.162413] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.291086] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.291194] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.419848] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.419951] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.548552] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.548676] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.677375] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.677490] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.806246] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.806454] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6835.935287] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6835.935417] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6836.142734] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 6836.144261] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 6836.152752] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 6836.153413] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[ 6836.153562] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 6836.155203] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 6836.157192] [drm] DMUB hardware initialized: version=0x08000500
[ 6836.555161] [drm] kiq ring mec 3 pipe 1 q 0
[ 6836.558770] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 6836.559400] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 6836.561002] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 6836.561004] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 6836.561006] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 6836.561007] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 6836.561008] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 6836.561009] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 6836.561011] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 6836.561012] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 6836.561013] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 6836.561014] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 6836.561015] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 6836.561016] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 6836.561016] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 6836.564124] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 6836.564125] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 6836.564148] [drm] Skip scheduling IBs!
[ 6836.564150] [drm] Skip scheduling IBs!
[ 6836.564151] [drm] Skip scheduling IBs!
[ 6836.564153] [drm] Skip scheduling IBs!
[ 6836.564154] [drm] Skip scheduling IBs!
[ 6836.564639] [drm] Skip scheduling IBs!
[ 6836.564644] [drm] Skip scheduling IBs!
[ 6836.564648] [drm] Skip scheduling IBs!
[ 6836.564652] [drm] Skip scheduling IBs!
[ 6836.564656] [drm] Skip scheduling IBs!
[ 6836.564659] [drm] Skip scheduling IBs!
[ 6836.564673] [drm] Skip scheduling IBs!
[ 6836.564675] [drm] Skip scheduling IBs!
[ 6836.566148] [drm] Skip scheduling IBs!
[ 6836.566150] [drm] Skip scheduling IBs!
[ 6836.566151] [drm] Skip scheduling IBs!
[ 6836.566153] [drm] Skip scheduling IBs!
[ 6836.566236] [drm] Skip scheduling IBs!
[ 6836.566237] [drm] Skip scheduling IBs!
[ 6836.570498] [drm] Skip scheduling IBs!
[ 6836.570501] [drm] Skip scheduling IBs!
[ 6836.581339] [drm] ring gfx_32785.1.1 was added
[ 6836.584166] [drm] ring compute_32785.2.2 was added
[ 6836.587094] [drm] ring sdma_32785.3.3 was added
[ 6836.606411] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 6836.803565] amdgpu 0000:c1:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx_32785.1.1 test failed (-110)
[ 6836.930881] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6836.930998] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] *ERROR* failed to remove hardware queue, queue id = 1
[ 6837.057795] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6837.057917] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] *ERROR* failed to remove hardware queue, queue id = 2
[ 6837.184953] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6837.185077] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] *ERROR* failed to remove hardware queue, queue id = 3
[ 6837.185369] amdgpu 0000:c1:00.0: amdgpu: GPU reset(4) succeeded!
[ 6837.378515] Lockdown: Xorg: raw io port access is restricted; see man kernel_lockdown.7
[ 6837.582581] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[ 6837.582700] [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait
[ 6837.709867] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[ 6837.709971] [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait
[ 6838.808073] amdgpu 0000:c1:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 6838.808081] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 6838.808083] amdgpu 0000:c1:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000B3B
[ 6838.808085] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 6838.808087] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x1
[ 6838.808088] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x5
[ 6838.808089] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 6838.808090] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 6838.808091] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 6838.808095] amdgpu 0000:c1:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 6838.808097] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 6838.808099] amdgpu 0000:c1:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 6838.808100] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 6838.808101] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x0
[ 6838.808102] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 6838.808103] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 6838.808104] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 6838.808105] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 6838.808108] amdgpu 0000:c1:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 6838.808110] amdgpu 0000:c1:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 6838.808111] amdgpu 0000:c1:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 6838.808112] amdgpu 0000:c1:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 6838.808113] amdgpu 0000:c1:00.0: amdgpu: MORE_FAULTS: 0x0
[ 6838.808114] amdgpu 0000:c1:00.0: amdgpu: WALKER_ERROR: 0x0
[ 6838.808115] amdgpu 0000:c1:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 6838.808116] amdgpu 0000:c1:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 6838.808117] amdgpu 0000:c1:00.0: amdgpu: RW: 0x0
[ 6846.683477] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=924179, emitted seq=924182
[ 6846.683601] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
[ 6846.683690] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 6846.817093] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6846.817215] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6846.946120] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6846.946217] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.075152] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.075247] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.204194] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.204287] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.333267] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.333378] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.462354] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.462447] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.591427] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.591531] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.720514] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.720623] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6847.849605] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 6847.849712] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 6848.056887] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 6848.058379] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 6848.068167] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 6848.068852] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[ 6848.068891] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 6848.070465] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 6848.072455] [drm] DMUB hardware initialized: version=0x08000500
[ 6848.082862] [drm] kiq ring mec 3 pipe 1 q 0
[ 6848.084491] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 6848.084595] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 6848.085040] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 6848.085042] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 6848.085043] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 6848.085044] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 6848.085045] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 6848.085046] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 6848.085047] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 6848.085048] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 6848.085049] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 6848.085050] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 6848.085051] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 6848.085052] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 6848.085053] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 6848.087108] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 6848.087109] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 6848.090732] [drm] ring gfx_32770.1.1 was added
[ 6848.091552] [drm] ring compute_32770.2.2 was added
[ 6848.092442] [drm] ring sdma_32770.3.3 was added
[ 6848.092463] [drm] ring gfx_32770.1.1 ib test pass
[ 6848.092482] [drm] ring compute_32770.2.2 ib test pass
[ 6848.092717] [drm] ring sdma_32770.3.3 ib test pass
[ 6848.093771] amdgpu 0000:c1:00.0: amdgpu: GPU reset(6) succeeded!
[ 6858.450096] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6868.689756] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6878.929987] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6889.169988] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6899.409649] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6909.649396] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6919.889277] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6930.129331] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
[ 6940.369606] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!
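In case anyone wants to compare their own logs against the dumps above, here is a minimal Python sketch that pairs each "GPU reset begin!" line with the following "GPU reset(N) succeeded!" line and reports which process the timeout was attributed to. The function name `summarize_resets` and the regular expressions are my own assumptions, based only on the message wording quoted above; other kernel versions may word these messages differently, and process names containing spaces would not be matched correctly.

```python
import re

# Patterns are assumptions derived from the amdgpu messages quoted above.
TS = r"\[\s*(\d+\.\d+)\]"
PROC_RE = re.compile(r"Process information: process\s+(?:(\S+)\s+)?pid (\d+)")
BEGIN_RE = re.compile(TS + r".*GPU reset begin!")
DONE_RE = re.compile(TS + r".*GPU reset\((\d+)\) succeeded!")


def summarize_resets(lines):
    """Return one dict per 'GPU reset begin!' / 'GPU reset(N) succeeded!' pair."""
    episodes = []
    process = None  # process named by the most recent job timeout
    begin = None    # timestamp of the reset currently in progress
    for line in lines:
        m = PROC_RE.search(line)
        if m:
            # The name can be missing, as in "process pid 0 thread pid 0".
            process = m.group(1) or "unknown"
            continue
        m = BEGIN_RE.search(line)
        if m:
            begin = float(m.group(1))
            continue
        m = DONE_RE.search(line)
        if m and begin is not None:
            episodes.append({
                "process": process or "unknown",
                "begin": begin,
                "seconds": round(float(m.group(1)) - begin, 3),
                "counter": int(m.group(2)),
            })
            begin = None
            process = None
    return episodes


# Three lines copied from the second-crash dump above.
SAMPLE = [
    "[1313482.377455] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process xfwm4 pid 1631 thread xfwm4:cs0 pid 1662",
    "[1313482.377614] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!",
    "[1313484.352642] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!",
]

for episode in summarize_resets(SAMPLE):
    print(episode)
```

To run it on a full log, pass an open file (or `sys.stdin` with dmesg output piped in) to `summarize_resets` instead of `SAMPLE`.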