When I close the laptop lid while it’s plugged in and leave it for an hour, the CPU is at 70 degrees when I return. A kernel worker thread (kworker/0:1-events) was taking 78% of the CPU when I resumed. An AMD GPU error may have prevented sleep, or the system may have been woken up repeatedly by UCSI errors.
Here’s a selection of the errors:
1. AMD GPU SMU Communication Failures:
[18:42:45] kernel: WARNING: CPU: 0 PID: 16439 at drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_smu.c:143 dcn35_smu_send_msg_with_param+0x164/0x1d0 [amdgpu]
[18:42:45] kernel: amdgpu 0000:c1:00.0: [drm] SMU response after wait: -1, msg id = 25
[18:42:45] kernel: amdgpu 0000:c1:00.0: [drm] SMU response after wait: -1, msg id = 25
[18:42:45] kernel: amd_pmc AMDI000A:00: Last suspend didn't reach deepest state
2. Failed Suspend Sequence:
[13171.148693] PM: suspend entry (s2idle)
[13171.159283] Filesystems sync: 0.010 seconds
[13171.161983] Freezing user space processes
[13171.164473] Freezing user space processes completed (elapsed 0.002 seconds)
[13171.164481] OOM killer disabled.
[13171.164482] Freezing remaining freezable tasks
[13171.165618] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[13171.329904] ACPI: EC: interrupt blocked
[13256.524160] amdgpu 0000:c1:00.0: [drm] SMU response after wait: -1, msg id = 25
3. USB-C/UCSI Errors (sample from multiple occurrences):
[20:10:18] kernel: typec port3-partner: PM: parent port3 should not be sleeping
[20:10:18] kernel: ucsi_acpi USBC000:00: unknown error 0
[20:10:18] kernel: ucsi_acpi USBC000:00: UCSI_GET_PDOS failed (-5)
[20:10:33] kernel: ucsi_acpi USBC000:00: possible UCSI driver bug 1
[20:10:33] kernel: ucsi_acpi USBC000:00: ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-22)
4. Power Management Issues:
[21:36:19] kernel: ACPI: EC: interrupt blocked
[21:36:19] kernel: ucsi_acpi USBC000:00: ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-110)
[21:36:19] kernel: ACPI: EC: interrupt unblocked
[21:36:19] kernel: ucsi_acpi USBC000:00: possible UCSI driver bug 1
[21:36:19] kernel: ucsi_acpi USBC000:00: failed to re-enable notifications (-22)
All of these occurred while the lid was closed and the laptop was plugged in.
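The numbers in parentheses in the UCSI lines are negative Linux errno values, so it can help to tally them by name. The following is only a rough sketch: the function name `summarize_ucsi_errors` and the small errno table are made up for illustration, and the sample lines are copied from the excerpts above.

```python
import re
from collections import Counter

# Linux errno values for the codes that appear in the excerpts above.
LINUX_ERRNO = {5: "EIO", 22: "EINVAL", 110: "ETIMEDOUT"}

# Matches a trailing "(-N)" return code such as "failed (-110)".
ERRNO_RE = re.compile(r"\(-(\d+)\)\s*$")


def summarize_ucsi_errors(lines):
    """Count ucsi_acpi failures by symbolic errno name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if "ucsi_acpi" not in line:
            continue  # only look at the UCSI driver's messages
        match = ERRNO_RE.search(line)
        if match:
            code = int(match.group(1))
            counts[LINUX_ERRNO.get(code, f"errno {code}")] += 1
    return counts


sample = [
    "[20:10:18] kernel: ucsi_acpi USBC000:00: unknown error 0",
    "[20:10:18] kernel: ucsi_acpi USBC000:00: UCSI_GET_PDOS failed (-5)",
    "[20:10:33] kernel: ucsi_acpi USBC000:00: ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-22)",
    "[21:36:19] kernel: ucsi_acpi USBC000:00: ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-110)",
    "[21:36:19] kernel: ucsi_acpi USBC000:00: failed to re-enable notifications (-22)",
]

for name, count in sorted(summarize_ucsi_errors(sample).items()):
    print(name, count)
```

In this sample the codes are -5 (EIO, an I/O error), -22 (EINVAL, invalid argument) and -110 (ETIMEDOUT, a timeout). Feeding the function the full kernel log instead of the five sample lines would show which failure dominates.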
I also tried the amd-s2idle debugging tool. The Markdown output didn’t include the errors, so here’s a pastebin of the HTML output: amd sleep errors - Pastebin.com
It notes that the machine wasn’t able to reach the deepest sleep state, and it shows similar UCSI issues.
What I believe is happening:
Failed Suspend Attempts:
• The system attempted to enter the s2idle (suspend-to-idle) state multiple times but failed to reach the deepest state
• At 18:42:45: “amd_pmc AMDI000A:00: Last suspend didn’t reach deepest state”
• The laptop kept cycling between attempted suspends and failed wake-ups
Key Hardware Issues Preventing Proper Sleep:
• AMD GPU SMU (System Management Unit) communication failures
• UCSI (USB Type-C Connector System Software Interface) errors continuously occurring
• Kernel worker thread (kworker/0:1-events) stuck at 87.6% CPU usage
Timeline of Events:
• Multiple suspend attempts at:
  • 18:42:42 - First attempt
  • 18:43:13 - Second attempt
  • 20:05:16 - Third attempt
  • 20:10:45 - Fourth attempt
• Each time, the system failed to reach the deepest sleep state
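To see how closely spaced the attempts were, the gaps between the four timestamps can be computed directly. This is a minimal sketch that assumes all four attempts happened on the same day; `gaps_between` is a hypothetical helper name.

```python
from datetime import datetime

# Suspend attempt times copied from the timeline above (same day assumed).
ATTEMPTS = ["18:42:42", "18:43:13", "20:05:16", "20:10:45"]


def gaps_between(times):
    """Return the timedelta between each pair of consecutive HH:MM:SS times."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [later - earlier for earlier, later in zip(parsed, parsed[1:])]


for gap in gaps_between(ATTEMPTS):
    print(gap)
# prints:
# 0:00:31
# 1:22:03
# 0:05:29
```

The second attempt came only 31 seconds after the first, while the third followed about 82 minutes later.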
Thanks for the tips. I’d like to apply the patches you mentioned. Just to check: that sounds like a kpatch thing? So I’d download the Manjaro version of the kernel source matching my current version, apply the patch, and load it with kpatch?
I just looked and the important patches were backported to 6.12.25 or earlier, so you should have them. This might be a genuinely new error or caused by behavior of your compositor.
I didn’t see a compositor noted in the report. Which compositor are you using?
Oh, thanks for checking that. I was deep into trying to figure out how to patch, so that’ll save me a lot of trouble. I did upgrade to 6.12.28-1 from an older version (I can’t remember which, but possibly one before 6.12.15), and I’m still getting issues with sleep.
I’m using Sway (a Wayland-based equivalent of i3). I’m not sure what counts as the compositor in the Wayland world. I had to switch to Sway because i3 was randomly crashing whether or not I had picom enabled as a compositor. But the fact that I was having so many issues with i3, and that you’re mentioning compositors now, makes me feel that it’s worth re-investigating (I just copied my i3 config over to a Sway equivalent and wrote it off for the moment so I could get back to work).
Do you have any tips on how I can investigate this further?
TBH - this is quite odd. Everything looks configured correctly in the report, but you’re getting NMIs and the SMU isn’t responding.
You haven’t modified the BIOS using any reverse-engineered tools, have you? Would it be possible to check a newer kernel like 6.14 or 6.15-rc6 to see if the same issue happens?
Could you double check with a Fedora 42 live image if it’s happening?
I’m going to try a newer kernel on Manjaro when I get a chance. I tried a Fedora live image and I was very sad about how nicely this laptop ran on it… the temptation to pick up a new distro is strong, but I shouldn’t spend time on that.
However, I’m in an airport, so I wasn’t able to plug in, and I believe being plugged in is a major source of the issues. I did at least plug a battery into the USB-C port, which triggered some errors about cable properties, but it’s not exactly the same. Anyway, the machine appears to go into sleep mode without issue on Fedora.