[TRACKING] High Battery Drain During Suspend

OK, so I ran more tests and I can still reproduce the problem (not always, but sometimes after 2-3 attempts). I checked dmesg to see whether the laptop was changing state (leaving s2idle or something like that), but unfortunately no: when the high-consumption sleep state is reached, nothing different appears in dmesg. I see the normal s2idle entry on sleep and the resume line when I resume (and the power LED keeps pulsing as usual for a sleep state).

Another thing I tried was measuring the consumption at the outlet when this happens. I put the laptop to sleep while plugged in, as before, which gives about 1 W when all is well. Then I unplug, wait a couple of minutes, plug back in, wait for the top-up charge (if any) to finish, and read the consumption. In my last attempt I got 1 W on the first sleep, 1 W after the unplug/replug, and 7 W the third time! The laptop is still sleeping (pulsing power LED) but warm, and it draws 7 W instead of the expected 1 W. This is not the top-up charge, because it has been stable for hours and is no longer decreasing. I ran this test twice (resume, make sure the battery is fully charged, then sleep again and unplug/replug a few times until I get something other than 1 W and a warm laptop) and reproduced it both times after just 2 or 3 unpluggings.

The LED on the side near the power plug is still blinking orange, as it does when I put the laptop to sleep. When I unplug and replug while the battery is full, it usually turns white the first time; it goes back to blinking orange if I wait a bit longer and replug once the battery is no longer “full”.
I’m not sure what else I can do to debug this problem. I don’t have the HDMI adapter, which is known to cause some issues, only 2 USB-C, 1 USB-A, and one microSD. Maybe I’ll try unplugging those, but I don’t have much hope.

Edit: while the laptop was sleeping in this 7 W state (7.1 W to be precise), I tried hot-unplugging the microSD and USB-A cards and noticed a drop of about 0.4 W for the USB-A and 0.2 W for the microSD (so they do have a visible impact here, although they don’t account for the whole extra 6 W). After unplugging both I am down to 6.5 W on average. Unplugging the USB-C cards has no impact, because they are passive I suppose. BUT THEN, if I plug them back in, I get +0.9 W for the microSD and +0.4 W for the USB-A, so I’m now up to 7.9 W. Pretty weird and not consistent. I suppose the plugging/unplugging wakes up some internal components, which then consume more?

For the record, I tried doing this again when sleep is working fine (1 W) and got -0.05 W for the microSD and -0.35 W for the USB-A. With both removed I am down to 0.6 W in s2idle, which is actually great, so it’s a bit sad to learn that having a USB-A card reduces sleep duration by 35% in my case. Plugging the USB-A back in brings back the expected +0.35 W, and plugging the microSD back in actually brings back +0.75 W! So I’m up to 1.75 W now :confused:

I’m sure it is tricky business to handle the power state of all these devices, especially if you can hot-swap them during sleep, and honestly that is understandable to me; I don’t care much if doing this increases the consumption. What I do care about is the power consumption being sometimes 1 W and sometimes 7 W without touching anything other than the power cord. That effectively makes sleep unusable, because you don’t know whether it’s going to drain your battery and heat up a lot in your bag.

Card consumption when idle (not sleeping):

  • USB-A: +0.45 W
  • Micro-SD reader: +1.2 W
    Unplugging both cards gets me down from 6.4 W to 4.75 W idle, that’s about 25% less power draw (roughly a third more battery life)!
  • Also, I measured the HDMI card at +0.35 W (I’m not using this one by default)
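The drop from 6.4 W to 4.75 W can be read two ways: as a reduction in draw, or as a gain in runtime for the same battery. A minimal Python sketch of the arithmetic, using only the idle figures quoted above (the function names are made up for illustration):

```python
def power_reduction_pct(before_w: float, after_w: float) -> float:
    """Percentage drop in power draw when going from before_w to after_w."""
    return (before_w - after_w) / before_w * 100.0


def runtime_gain_pct(before_w: float, after_w: float) -> float:
    """Percentage gain in battery runtime for the same battery capacity.

    Runtime is capacity / draw, so the ratio of runtimes is before_w / after_w.
    """
    return (before_w / after_w - 1.0) * 100.0


# Idle figures from the post: 6.4 W with the cards, 4.75 W without them.
print(f"draw reduced by {power_reduction_pct(6.4, 4.75):.1f}%")
print(f"runtime extended by {runtime_gain_pct(6.4, 4.75):.1f}%")
```

With these numbers the draw falls by roughly a quarter while the runtime grows by roughly a third, because runtime scales with the inverse of the draw.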
6 Likes

Do you want to risk a BIOS upgrade to 3.08?

I updated already after I found this:

In my case, it works flawlessly.

Some people seem to still have issues with version 3.08.

( The link to the new BIOS version is the same as for 3.07, but with 8 instead of 7. )

3 Likes

@dma well, it doesn’t seem likely to help considering what they said, but for lack of a better option or an official response I’ll give it a try, yes. I just installed 3.08; I’ll run some sleep sessions and report if I see any change.

1 Like

FYI 3.08 fixes drain while shut down, not during suspend.

I just found this news about a patch from Intel related to energy consumption in my RSS feed today

I have not been able to follow up on all the details, but I’m wondering if this might be related?

1 Like

@feesh yes, that’s why I said it doesn’t seem likely, but thanks for adding this. And as expected, 3.08 did not help with this problem; I can reproduce it easily after just 1 unplug.

@dfh nice find, it could be, yes. We’ll have to wait and track which kernel version this patch lands in, I suppose.

Yeah, sadly I don’t have the time and setup to quickly test the patch. I hope someone can make sense of the change and whether it impacts Frameworks.
And hopefully be able to report back :+1:

1 Like

Small update after being in a situation where I’m actually relying on the battery quite a bit.

The nvme.noacpi=1 option indeed makes a huge difference.
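Since the option only helps if it actually reached the running kernel, it can be worth confirming that after a reboot. A minimal Python sketch, assuming a Linux system that exposes the boot command line at /proc/cmdline (the helper names are invented for this example):

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on the kernel command line."""
    return param in cmdline.split()


def running_kernel_has(param: str, path: str = "/proc/cmdline") -> bool:
    """Check the command line of the currently running kernel (Linux only)."""
    return has_kernel_param(Path(path).read_text(), param)


if __name__ == "__main__":
    # Only attempt the live check where /proc/cmdline exists.
    if Path("/proc/cmdline").exists():
        print("nvme.noacpi=1 active:", running_kernel_has("nvme.noacpi=1"))
```

Matching whole tokens avoids a false positive on something like `nvme.noacpi=10`; how the parameter gets added to the bootloader configuration depends on the distribution and is not covered here.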

With Fedora 35, Linux 5.17.4-200.fc35.x86_64

I’m now seeing:
s2idle with HDMI and USB-A inserted: 1W
s2idle with just USB-C cards inserted: 0.34W

ā€œidleā€ use (reading something on the screen, with rather low screen brightness): about 4W
ā€œscreen lockedā€ use (screen off): around 2W

So I’d say nvme.noacpi=1 has almost completely resolved the power drain: even at 1W, the machine can stay suspended for more than 2 days, and without the HDMI and USB-A cards it’s an entirely decent score.
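For a rough idea of what those suspend figures mean in days, here is a small Python sketch. The 55 Wh battery capacity is an assumption on my part (it is not stated in this thread), and a real battery will deliver somewhat less than its nominal rating:

```python
ASSUMED_BATTERY_WH = 55.0  # assumption: nominal capacity, not taken from the thread


def suspend_days(draw_w: float, capacity_wh: float = ASSUMED_BATTERY_WH) -> float:
    """Days a full battery lasts at a constant suspend draw of draw_w watts."""
    if draw_w <= 0:
        raise ValueError("draw_w must be positive")
    return capacity_wh / draw_w / 24.0


# Suspend draws quoted in this thread.
for label, draw_w in [
    ("HDMI and USB-A inserted", 1.0),
    ("USB-C cards only", 0.34),
    ("bad 7 W suspend reported earlier", 7.0),
]:
    print(f"{label}: {suspend_days(draw_w):.1f} days")
```

Under that assumption, 1W works out to a bit over 2 days and 0.34W to a bit under a week, whereas the 7 W state would empty the battery in about a third of a day.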

A consequence is that the extra power use of these cards really sticks out like a sore thumb. It would be great if some kind of switch were available to turn off the drain on these, even if it required replugging them to make them functional again.

At the very least I think Framework should advertise that these expansion cards can affect the power profile, even if they’re not used. It would have made me buy two extra USB-C cards so that I can place the machine in a power-frugal setup without having gaping holes in the bottom …

I have also seen the kernel in such a state that it was still using about 4W while suspended. I suspect that was due to a crashed R8153 ethernet adapter driver (that ethernet chip seems to be particularly mercurial with USB-C setups in linux; hopefully the workarounds improve a bit in the next couple of kernels). With that power use the laptop feels warm to the touch after spending some time in a laptop sleeve/bag. After rebooting this has not reoccurred, but it does provide a good motivation for monitoring the drain a bit.
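One way to monitor the drain without a wall meter is to compare the battery's remaining energy just before suspend and just after resume. A minimal Python sketch of that calculation; where the readings come from is left open (on many Linux laptops something like `energy_now` under /sys/class/power_supply/ reports µWh, but that path is an assumption and some batteries only expose charge), and the 2 W threshold is an arbitrary illustrative value:

```python
def average_draw_w(energy_before_wh: float, energy_after_wh: float, elapsed_s: float) -> float:
    """Average power draw in watts over a suspend of elapsed_s seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    # Wh consumed, converted to watt-seconds, divided by the elapsed seconds.
    return (energy_before_wh - energy_after_wh) * 3600.0 / elapsed_s


def looks_like_bad_suspend(draw_w: float, threshold_w: float = 2.0) -> bool:
    """Flag a suspend whose average draw exceeds an (arbitrary) threshold."""
    return draw_w > threshold_w


# Hypothetical readings: 50 Wh before, 46 Wh after one hour of suspend.
draw = average_draw_w(50.0, 46.0, 3600.0)
print(f"average suspend draw: {draw:.2f} W, suspicious: {looks_like_bad_suspend(draw)}")
```

Run from suspend and resume hooks, this would turn the "warm laptop in the bag" symptom into a logged number.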

3 Likes

That’s what I had to do as well. I’m using 4 USB-C cards and can’t remember when I last swapped them out… because now the USB-A and HDMI cards just act as dongles in my use cases, when/if I need to use them.

This makes the ‘swappable’ use case relatively more niche than before.

@Nils thanks for your update which confirms most of what I saw. I agree the impact on battery life should be at least specified on adapters.

What makes you point to the R8153 ethernet adapter? Did you find any way to pinpoint the reason for the incorrect suspended state? Also, if that’s the same cause for me, I don’t need to reboot: simply waking up the laptop and putting it back to sleep usually fixes the problem. Every time I put it to sleep I have to check the temperature 30 minutes later to be sure it’s actually sleeping correctly… Not very cool.

Only circumstance made me blame it. I did not analyze the suspend states hit by the various components, and I don’t have a real lead that a defunct R8153 driver was causing the problem. The R8153 did fail before with a

r8152 … …: Tx status -71

message and an “Oops” traceback afterwards. The Belkin Multimedia USB-C hub it is part of would no longer show the network interface afterwards: not after unplugging and replugging, and also not after suspending and waking (and various combinations of the two).

After that the laptop would get warm when suspended and put in a bag and showed ~4W power use (no peripherals other than the usual expansion cards inserted); consistently.

Both indicate that the kernel is in a bad state. So I made a leap and assumed the two are correlated.

Rebooting fixes the R8153 problems (until the next crash when in use) and fixes the bad suspend as well.

Oddly enough, I have a USB-C monitor that also has an R8153 network interface built-in and that one causes no problems. I’ve used the Belkin hub successfully with other (non-unix) devices, so it’s either the network that makes the hub act up or it’s the way it’s wired in the hub that is difficult for the linux driver (unlike most R8153 problem reports, the error occurs when the network interface is in use, so likely not due to power saving issues).

Ok thanks, but when in a bag I suppose the Belkin hub is not plugged-in ? It was plugged before though I suppose. As some other people are seeing the same high power sleep bug I suppose itā€™s unlikely to be related to this external interface. But who knows :slight_smile:

For comparison’s sake, what is the best-in-class power consumption using s3 and s2idle with any linux laptop that exists? The latest from Lenovo seems terrible too. So perhaps we need to pressure AMD/Intel to work on this with the kernel devs, but they must have already noticed, right? Considering it seems the best hardware sleeping with Linux today isn’t anywhere close to the power saving available on over-decade-old MacBooks.

2 Likes

As I suppose we’re not going to be able to motivate anybody from the team to really look into this (not enough people complaining), I am asking myself one thing: if my Framework is burning hot for hours in my bag because of this bug and it causes permanent damage, will the warranty apply? Will I be able to get a free repair/replacement?

3 Likes

Same worry on my side: every time I close my laptop and open it the next day, the battery is completely depleted. This can’t be good for the battery.
The battery on my laptop currently lasts about 3 hours, so I guess mine is already quite damaged, although I think it never lasted much longer than 5 hours for me.

I wish the Framework team would acknowledge this issue and at least release an official note or recommendations on what to do, or how to properly go about this in the meantime.

I miss the days when I was able to go out without the laptop charger. Now I know that if my Framework was left unplugged, it probably has no battery left by the time I want to use it.
Very frustrating, to say the least.

3 Likes

Same feeling here.

I’m worried when I suspend my laptop, put it in my backpack, and 1h later find it almost burning my hands as if it wasn’t really suspended. And I was really frustrated whenever I forgot to plug the laptop in overnight.

I say “was” because, for the first time in 20 years of using Linux, I’ve turned on hybrid sleep on a laptop. When unplugged, the laptop suspends for 2 hours and then hibernates.

It takes more time to resume since it has to boot, but at least I’m not on edge anymore.
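The "suspend for 2 hours, then hibernate" behaviour described here resembles what systemd calls suspend-then-hibernate, controlled by `HibernateDelaySec=` in the `[Sleep]` section of sleep.conf. Whether the poster used exactly this mechanism is an assumption; the Python sketch below only renders such a drop-in as text and writes nothing to disk:

```python
def sleep_conf_dropin(delay: str = "2h") -> str:
    """Render a systemd sleep.conf drop-in that hibernates after `delay` of suspend.

    `delay` is a systemd time span such as "2h" or "90min".
    """
    return f"[Sleep]\nHibernateDelaySec={delay}\n"


# Print the fragment; on a systemd machine it would typically live in a
# drop-in directory such as /etc/systemd/sleep.conf.d/ (assumed location).
print(sleep_conf_dropin("2h"), end="")
```

The delay only takes effect when the system is put to sleep through the suspend-then-hibernate action, and hibernation itself needs working swap and resume configuration, which is outside the scope of this sketch.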

2 Likes

I think that if it gets that hot, it isn’t properly suspended. And indeed, given that lithium-ion batteries are apparently quite temperature-sensitive, that is also not good for the longevity of the product. At those power levels, it should be able to ventilate well.

There seem to be frustratingly many things that can lead to the suspend state not being properly entered, or to high power use. I’ve seen on this forum:

  • nvme ssd drives use excessive power during suspend (nvme.noacpi=1 seems to help in those cases)
  • higher power use due to a previously crashed kernel module
  • tricky setups with wake-up triggers, making the system wake up almost immediately from suspend (and hence not being suspended when placed in the bag)
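For the wake-up trigger case, one place to look is the list of devices allowed to wake the machine. A minimal Python sketch that parses text in the layout of /proc/acpi/wakeup (device, S-state, status, sysfs node); the sample below is made up for illustration and does not come from any machine in this thread:

```python
from typing import List


def enabled_wakeup_devices(wakeup_text: str) -> List[str]:
    """Return the names of devices whose wakeup status is '*enabled'."""
    devices = []
    for line in wakeup_text.splitlines():
        parts = line.split()
        # Data rows look like: NAME  S-state  *enabled|*disabled  [sysfs node]
        if len(parts) >= 3 and parts[2] == "*enabled":
            devices.append(parts[0])
    return devices


# Hypothetical sample in the /proc/acpi/wakeup layout.
SAMPLE = """Device\tS-state\t  Status   Sysfs node
PEG0\t  S4\t*disabled
XHCI\t  S3\t*enabled   pci:0000:00:14.0
"""

print(enabled_wakeup_devices(SAMPLE))
```

On a real system the text would come from reading /proc/acpi/wakeup; a device that shows up as enabled and correlates with immediate wake-ups is a candidate to investigate.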

I haven’t seen suspend problems in my setup lately, but some unexpected update could unfortunately change that. The current power usage means that on a full battery, the system can easily survive for 2 days suspended, going up to about a week with USB-A and HDMI removed. It’s not great, but workable if your laptop otherwise has a habit of being plugged in overnight.

To some extent, the power problems seem to stem from “modern suspend” s2idle, which makes for very responsive systems, but in turn for rather high power usage and, apparently, because the system is hardly asleep at all, a state whose power use is very sensitive to other factors. I don’t think Framework can quite be blamed for that, and given the possible causes for increased power use, it looks difficult to give widely applicable advice/tips.

They should warn people that expansion cards that are not just USB-C cause significant power use even during suspend, though!

4 Likes

Hello, I was playing around with turbostat and the s0ix debugging tools provided on 01.org and noticed some failures when testing s2idle.

[  880.618452] PM: Suspending system (s2idle)
[  880.618455] printk: Suspending console(s) (use no_console_suspend to debug)
[  880.619538] wlp170s0: deauthenticating from 94:83:c4:1f:4d:62 by local choice (Reason: 3=DEAUTH_LEAVING)
[  881.200712] PM: suspend of devices complete after 581.211 msecs
[  881.200717] PM: start suspend of devices complete after 582.169 msecs
[  881.200720] PM: suspend devices took 0.582 seconds
[  881.215569] PM: late suspend of devices complete after 14.843 msecs
[  881.241747] ACPI: EC: interrupt blocked
[  881.308356] PM: noirq suspend of devices complete after 92.029 msecs
[  881.308393] ACPI: \_SB_.PR00: LPI: Device not power manageable
[  881.308398] ACPI: \_SB_.PR01: LPI: Device not power manageable
[  881.308400] ACPI: \_SB_.PR02: LPI: Device not power manageable
[  881.308402] ACPI: \_SB_.PR03: LPI: Device not power manageable
[  881.308403] ACPI: \_SB_.PR04: LPI: Device not power manageable
[  881.308405] ACPI: \_SB_.PR05: LPI: Device not power manageable
[  881.308406] ACPI: \_SB_.PR06: LPI: Device not power manageable
[  881.308407] ACPI: \_SB_.PR07: LPI: Device not power manageable
[  881.308413] ACPI: \_SB_.PC00.RP10.PXSX: LPI: Device not power manageable
[  881.308415] ACPI: \_SB_.PC00.HECI: LPI: Device not power manageable
[  881.308417] ACPI: \_SB_.PC00.PEG0.PEGP: LPI: Constraint not met; min power state:D3hot current power state:D0
[  881.308422] ACPI: \_SB_.PC00.GNA0: LPI: Device not power manageable
[  881.310108] PM: suspend-to-idle
[  893.693629] Timekeeping suspended for 11.610 seconds
[  893.693860] ACPI: PM: Wakeup unrelated to ACPI SCI
[  893.693863] PM: resume from suspend-to-idle
[  893.696003] ACPI: EC: interrupt unblocked
[  894.122699] PM: noirq resume of devices complete after 426.935 msecs
[  894.126365] PM: early resume of devices complete after 3.551 msecs
[  894.642587] PM: resume of devices complete after 516.134 msecs
[  894.651955] PM: resume devices took 0.525 seconds
[  894.651973] PM: Finishing wakeup.
[  894.651975] OOM killer enabled.
[  894.651976] Restarting tasks ... 

I am curious if these \_SB_.PR0x power management failures mean anything to anyone?
I am guessing these are platform features that are not suspending when the machine is put to sleep causing the suspend power drain.
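To separate the two kinds of LPI messages in a log like the one above, a small filter can help. This is a Python sketch that extracts only the "Constraint not met" entries, which name a device that was in a shallower power state than required; the regular expression is written against the message format seen in this paste and may not match other kernel versions:

```python
import re

# Matches lines such as:
# ACPI: \_SB_.PC00.PEG0.PEGP: LPI: Constraint not met; min power state:D3hot current power state:D0
LPI_RE = re.compile(
    r"ACPI: (?P<device>\S+): LPI: Constraint not met; "
    r"min power state:(?P<required>\S+) current power state:(?P<current>\S+)"
)


def unmet_lpi_constraints(dmesg_text: str):
    """Return (device, required_state, current_state) for each unmet LPI constraint."""
    return [(m["device"], m["required"], m["current"]) for m in LPI_RE.finditer(dmesg_text)]


# Two lines copied from the log above.
SAMPLE = r"""[  881.308415] ACPI: \_SB_.PC00.HECI: LPI: Device not power manageable
[  881.308417] ACPI: \_SB_.PC00.PEG0.PEGP: LPI: Constraint not met; min power state:D3hot current power state:D0
"""

for device, required, current in unmet_lpi_constraints(SAMPLE):
    print(f"{device}: needs {required}, was in {current}")
```

In this paste only \_SB_.PC00.PEG0.PEGP reports an unmet constraint (D0 where D3hot is required), so that device looks like the more interesting lead than the PR0x lines.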

1 Like

This is kind of a joke. 1260p, hybrid sleep disabled, 2x USB C, 2x USB A

Why am I wasting hours of my life trying to diagnose this on such an expensive product? How is this acceptable on an enthusiast oriented machine?

4 Likes

Yup… I’ve been there with the frustration. All very good questions.

All I can say is: Some people here on the forum confused “enthusiast oriented machine” with “wanting to tinker”. In my book, the two are not one and the same.