I’ve spent most of today on this, and I think I’ve narrowed it down as far as I can from the Linux side without flashing an instrumented EC (printf goes brrrr).
Machine:
- Framework Laptop 16
- AMD Ryzen 9 7940HS
- BIOS 4.03
- keyboard module firmware 0.31
- NixOS 25.11
- mainly tested on kernel 6.19.5
Symptom:
- suspend-to-idle begins, then wakes almost immediately
- amd_pmc reports it didn’t reach the deepest state
- amd_s2idle.py reports 0.00% hardware sleep
The recurring kernel signature is:
PM: suspend-to-idle
PM: Triggering wakeup from IRQ 9
ACPI: PM: ACPI fixed event wakeup
amd_pmc ... Last suspend didn't reach deepest state
and the report summary keeps ending up as:
Hardware Sleep: 0.00%
Wake Interrupt: ACPI SCI
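For anyone who wants to check their own logs for the same pattern, here is a minimal sketch that scans kernel log text for the four lines quoted above. The marker strings come straight from my logs; the function name `scan_log` and the dictionary keys are made up for illustration, and the exact message wording may differ on other kernel versions.

```python
import re

# Patterns built from the log lines quoted above; wording may vary by kernel.
MARKERS = {
    "s2idle_entry": re.compile(r"PM: suspend-to-idle"),
    "irq_wakeup": re.compile(r"PM: Triggering wakeup from IRQ (\d+)"),
    "acpi_fixed_event": re.compile(r"ACPI: PM: ACPI fixed event wakeup"),
    # "didn.t" tolerates a straight or curly apostrophe in the saved log.
    "not_deepest": re.compile(r"amd_pmc.*Last suspend didn.t reach deepest state"),
}


def scan_log(text):
    """Return which markers appear in the text, plus the wake IRQ numbers seen."""
    found = {name: False for name in MARKERS}
    irqs = []
    for line in text.splitlines():
        for name, pattern in MARKERS.items():
            match = pattern.search(line)
            if match:
                found[name] = True
                if name == "irq_wakeup":
                    irqs.append(int(match.group(1)))
    return found, irqs


SAMPLE = """\
PM: suspend-to-idle
PM: Triggering wakeup from IRQ 9
ACPI: PM: ACPI fixed event wakeup
amd_pmc ... Last suspend didn't reach deepest state
"""

if __name__ == "__main__":
    found, irqs = scan_log(SAMPLE)
    print(found)
    print("wake IRQs:", irqs)
```

To use it on a real machine, save the output of `journalctl -k -b` to a file after a failed suspend and pass the file contents to `scan_log` instead of `SAMPLE`.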
I ruled out a lot of obvious stuff already:
- removed custom amdgpu.ppfeaturemask
- blacklisted out-of-tree framework_laptop
- tested on AC and battery
- disabled USB / USB4 wake sources
- unloaded ucsi_acpi
- unloaded Goodix fingerprint driver
- unloaded Wi-Fi drivers
- unloaded touchpad HID stack
- physically removed keyboard + touchpad/input deck for a test
Those changed some side symptoms, but not the core failure. The machine still failed to reach hardware sleep and still woke through ACPI SCI.
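To double-check which ACPI wake sources are still armed after an elimination pass like this, a small sketch that parses the text of `/proc/acpi/wakeup` can help. It assumes the usual column layout (device, S-state, status, sysfs node); the device names in the sample are made up, not taken from my machine, and `enabled_wake_devices` is just an illustrative name.

```python
def enabled_wake_devices(wakeup_text):
    """Return (device, s_state, sysfs_node) for entries marked *enabled."""
    enabled = []
    for line in wakeup_text.splitlines():
        fields = line.split()
        # The header and continuation lines lack "*enabled" in the third
        # column, so they are skipped by this check.
        if len(fields) < 3 or fields[2] != "*enabled":
            continue
        node = fields[3] if len(fields) > 3 else ""
        enabled.append((fields[0], fields[1], node))
    return enabled


# Made-up sample in the assumed /proc/acpi/wakeup layout.
SAMPLE = (
    "Device\tS-state\t  Status   Sysfs node\n"
    "GPP0\t  S4\t*enabled   pci:0000:00:01.1\n"
    "XHC0\t  S4\t*disabled  pci:0000:c1:00.3\n"
    "LID0\t  S4\t*enabled   platform:PNP0C0D:00\n"
)

if __name__ == "__main__":
    for device, s_state, node in enabled_wake_devices(SAMPLE):
        print(device, s_state, node)
```

On a real system, read `/proc/acpi/wakeup` and pass its contents in place of `SAMPLE`. Taking one snapshot before and one after each change makes the elimination log easier to compare.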
I also checked Framework’s EC repo because at this point it looked more like firmware / EC / ACPI behavior than a normal Linux wake-source problem.
Current EC branch that seems relevant for FW16 AMD:
fwk-tulip-29169
I found one thing that looks like a straight logic bug, plus a couple more things that look suspicious but I can’t prove on my machine yet.
1. Likely EC bug: GPU poller appears to run forever, including during suspend/off
This condition in lotus/src/gpu.c looks wrong:
It does:
if (!chipset_in_state(CHIPSET_STATE_ANY_SUSPEND) ||
!chipset_in_state(CHIPSET_STATE_ANY_OFF))
The state groups are disjoint here:
So !suspend || !off is effectively always true for normal chipset states. That means the deferred GPU poll keeps rescheduling every 10 ms even while suspended/off. I can’t prove this is the whole root cause of the Linux-side “never enters hardware sleep” behavior, but I don’t see how this condition is correct as written.
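The logic can be checked with a small Python model of the condition. This is a sketch under assumptions: the state values and the grouping of ANY_OFF / ANY_SUSPEND follow the usual EC-style bitmask layout but are not copied from the repo, and `in_state` only imitates what `chipset_in_state()` appears to do. The `and` variant is one plausible intended form, not a confirmed fix.

```python
from enum import IntFlag


class ChipsetState(IntFlag):
    # Illustrative values, assumed rather than copied from the EC source.
    HARD_OFF = 0x01
    SOFT_OFF = 0x02
    SUSPEND = 0x04
    ON = 0x08
    STANDBY = 0x10


# Assumed groupings: the two masks share no bits, so they are disjoint.
ANY_OFF = ChipsetState.HARD_OFF | ChipsetState.SOFT_OFF
ANY_SUSPEND = ChipsetState.SUSPEND | ChipsetState.STANDBY

SINGLE_STATES = [
    ChipsetState.HARD_OFF,
    ChipsetState.SOFT_OFF,
    ChipsetState.SUSPEND,
    ChipsetState.ON,
    ChipsetState.STANDBY,
]


def in_state(current, mask):
    """Model of chipset_in_state(): true if the current state is in the mask."""
    return bool(current & mask)


def as_written(current):
    """The condition as it appears in gpu.c: !suspend || !off."""
    return (not in_state(current, ANY_SUSPEND)) or (not in_state(current, ANY_OFF))


def with_and(current):
    """One plausible intended form: !suspend && !off."""
    return (not in_state(current, ANY_SUSPEND)) and (not in_state(current, ANY_OFF))


if __name__ == "__main__":
    for state in SINGLE_STATES:
        print(f"{state.name:9} as_written={as_written(state)} with_and={with_and(state)}")
```

In this model `as_written` is true for every state, because no state can be in both groups at once, while `with_and` is true only for ON.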
2. Suspicious: S0ix enter/resume flag handling may bounce the EC out of suspend
These paths look suspicious:
- EmbeddedController/zephyr/program/framework/lotus/src/power_sequence.c at 6f1d9c013571f84b03599d6318e6adab295fa502 · FrameworkComputer/EmbeddedController · GitHub
- EmbeddedController/zephyr/program/framework/lotus/src/power_sequence.c at 6f1d9c013571f84b03599d6318e6adab295fa502 · FrameworkComputer/EmbeddedController · GitHub
The function named check_s0ix_statsus() latches enter/resume bits, clears them immediately, and prioritizes resume over enter. That feels like a plausible way to bounce out of S0ix if the flags are noisy/stale. But I don’t have enough runtime EC evidence to call this a proven bug yet.
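To make the concern concrete, here is a toy Python model of that latch-clear-prioritize pattern. It only models my description above, not the real function: the bit values, the class `S0ixFlags`, and the `replay` helper are all hypothetical.

```python
ENTER = 0x1   # hypothetical bit: S0ix entry was signalled
RESUME = 0x2  # hypothetical bit: S0ix resume was signalled


class S0ixFlags:
    """Toy model: flags accumulate until checked, then are cleared at once."""

    def __init__(self):
        self.pending = 0

    def raise_flag(self, bit):
        self.pending |= bit

    def check(self):
        """Latch pending bits, clear them, and let resume win over enter."""
        latched = self.pending
        self.pending = 0
        if latched & RESUME:
            return "resume"
        if latched & ENTER:
            return "enter"
        return "none"


def replay(events):
    """events: list of lists of bits raised between consecutive checks."""
    flags = S0ixFlags()
    decisions = []
    for bits in events:
        for bit in bits:
            flags.raise_flag(bit)
        decisions.append(flags.check())
    return decisions


if __name__ == "__main__":
    print(replay([[ENTER]]))          # clean entry
    print(replay([[RESUME, ENTER]]))  # stale resume raised alongside enter
```

In this model, a stale resume bit that is pending at the same check as an enter bit makes the check report resume, and the enter request is lost because both bits are cleared together. Whether the real EC can ever see both bits at once is exactly what I can’t confirm without runtime traces.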
3. Probably policy, but still relevant: suspend wake mask is broad
The S0ix wake mask still allows:
- lid open / close
- power button
- AC connected / disconnected
- battery
- battery trip point
That may be intentional policy, so I am not calling it a bug by itself. I am mentioning it because with the observed wake pattern, it is one of the few remaining EC-side paths that still looks relevant.
What I am asking
- Is this already a known FW16 AMD / BIOS 4.03 suspend issue?
- Has Framework already fixed this in a newer EC / BIOS branch that is not broadly released yet?
- Does the EC team agree that the gpu.c condition above is a bug (or am I missing some subtlety about those state groups)?
- If not, what would you want captured next to make this actionable without more guesswork?
If helpful I can also attach:
- amd_s2idle.py reports
- focused kernel logs
- ACPI wake snapshots
- a longer elimination log of everything I already tested
I can patch the EC myself, but the risk of bricking it is kinda scary.
And it would also be nice if sleep just worked normally.
I had the same problem before on Fedora, with sleep eating 30% battery per night, but this time I’ve dug into it a bit.