[RESPONDED] Linux s2idle sleep "random" power usage increase

Hi,

After writing extensively in [TRACKING] High Battery Drain During Suspend - #70 by Kimo_Bonnelycke about my similar but slightly different problem, I thought I would create another thread to separate the two, because the cause and fix are likely quite different.

The problem I’m talking about here is that the Framework laptop (11th gen at least), when going to s2idle sleep under Linux (at least), sometimes draws a “normal” 0.8w + expansion cards (1.5w for example with 2 USB-A), and SOMETIMES draws a much higher 3-7w, leading to the laptop getting very hot (especially when stored in a backpack) and of course the battery draining in a couple of hours.

I run Ubuntu 22.04 LTS and have already read the posts about battery optimisation; I installed powertop, tlp, etc. Every tunable is “good” in powertop, I get a decent 5-6w idle with the laptop running, and when sleeping (s2idle) I measure a consumption of about 0.8w baseline (without expansion cards) at the USB-C plug, using a USB-C power meter while the battery is full.

In my first tests, when it went well, I let the laptop sleep for 3:30 hours, and the plug still reported about 1W consumption on average (3.6Wh after 3h30min, with one USB-A card). This looks ok to me, it would mean the laptop can last about 55h in sleep. But unfortunately, after I unplugged the laptop and resumed after 16h, the battery was almost empty: 48Wh had been consumed according to the laptop power statistics, which averages 3W compared to the 1W I saw earlier. Also the laptop was a bit warm (which makes sense as it has 3 times more heat to dissipate). I reproduced this problem twice back then (the first time I didn’t measure anything, I was just surprised).
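For anyone who wants to redo the arithmetic with their own readings, here is a minimal sketch in Python. The 55Wh battery capacity is an assumption (the nominal capacity of the battery; a worn battery holds less), and the function names are just illustrative.

```python
def average_power_w(energy_wh: float, hours: float) -> float:
    """Average power draw in watts for a given energy used over a duration."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return energy_wh / hours


def sleep_lifetime_h(capacity_wh: float, power_w: float) -> float:
    """Hours a full battery would last at a constant power draw."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return capacity_wh / power_w


# Assumption: nominal battery capacity in Wh, adjust for your own battery.
BATTERY_WH = 55.0

good_w = average_power_w(3.6, 3.5)   # plugged-in test: 3.6Wh over 3h30min
bad_w = average_power_w(48.0, 16.0)  # unplugged test: 48Wh over 16h

print(f"good sleep: {good_w:.2f} W, about {sleep_lifetime_h(BATTERY_WH, good_w):.0f} h on a full battery")
print(f"bad sleep:  {bad_w:.2f} W, about {sleep_lifetime_h(BATTERY_WH, bad_w):.0f} h on a full battery")
```

With these numbers the good case is roughly 1W and the bad case exactly 3W, so the same battery lasts around a third as long.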

I tried checking dmesg to see if the laptop was changing state (leaving s2idle or something like that) but unfortunately no, when the high consumption sleep state is reached, nothing different appears in dmesg, I see the normal s2idle state on sleep and the resume line when I resume, nothing else in between.

Another thing I tried is to measure the consumption (at the outlet) when this happens. To achieve this I put the laptop to sleep but plugged in like before, which gets me about 1w consumption when all is well, then I unplug, wait a couple of minutes, plug in again, wait for the top-up charge if any and then read the consumption. In one attempt I got 1w at the first sleep, 1w after the unplug/replug, and the third time 7w! The laptop was then sleeping (pulsing power LED) but quite warm, with a 7w consumption as opposed to the expected 1w. This is not the top-up charge because it stayed stable for hours. I did this test twice (resuming, making sure it’s fully charged, then sleeping again and unplugging/plugging a couple of times until I get something other than 1w and a warm laptop), and reproduced both times after just 2 or 3 unplugs (this takes time though, I sometimes have to leave the laptop unplugged for hours; just unplugging/replugging quickly does not seem to reproduce it).

For lack of better options, I installed BIOS update 3.08 just in case, as it included a power management fix, though an unrelated one. As expected, 3.08 did not help with this problem, I reproduced easily after just 1 unplug (got lucky this time).

Somebody suggested https://www.phoronix.com/news/Intel-S0ix-Linux-Failure-Hot which does sound like a possible cause, but I couldn’t manage to track in which kernel version this patch landed, it’s been quite some time already, I am up to date running 5.15.0-58 at the moment and can still reproduce.
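One way to test the S0ix theory, sketched here in Python under some assumptions: on Intel systems the pmc_core driver usually exposes a counter at /sys/kernel/debug/pmc_core/slp_s0_residency_usec (root and a mounted debugfs are needed, and the path may be absent on some kernels). Reading it before sleep and after resume tells how much of the sleep was really spent in the low-power state. The function name and the sample numbers below are made up for illustration.

```python
def s0ix_residency_percent(before_usec: int, after_usec: int, slept_seconds: float) -> float:
    """Share of a sleep session spent in S0ix, from two readings of the counter.

    before_usec / after_usec: counter values (microseconds) read before
    suspending and after resuming. slept_seconds: wall-clock sleep duration.
    """
    if slept_seconds <= 0:
        raise ValueError("slept_seconds must be positive")
    delta_usec = after_usec - before_usec
    return 100.0 * delta_usec / (slept_seconds * 1_000_000)


# Hypothetical readings: 10000 s asleep, counter advanced by 9000 s.
print(s0ix_residency_percent(0, 9_000_000_000, 10_000))
```

A value close to 100 would suggest the platform really reached its deepest idle state; a value near 0 during a hot 7w sleep would point toward an S0ix entry failure like the one in the article.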

A couple more users on [TRACKING] High Battery Drain During Suspend - #70 by Kimo_Bonnelycke confirmed they did see the same problem as me, which is good because first it means I’m not crazy, secondly it means it’s likely not an isolated hardware/software problem and might be reproducible by the Framework team.

Lately I tried using the option in the bios to limit the battery charge to 80% or 90%, to see if it would make any difference. I still reproduced the problem so it’s not “fixing” it. Though with the limit at 100% it’s even longer to measure because when we plug it in, it takes a long time for the battery to top-up until we can see the residual consumption. Whereas when limited to 80% it charges quickly and then stops so we can see more quickly if we’re stuck at 1w or 7w.

A couple more things I’ve noticed which may be interesting:

  • Unplugging the laptop seems to be enough to (randomly) start the problem (go from 1w to 7w), and it is the easiest way I have found to reproduce so far. But we must not replug too quickly, so measuring is hard.
  • Putting the laptop to sleep without touching the power cable can also start the problem.
  • On the other hand, once we’re at 7w, unplugging/plugging apparently does not help: we can’t go back to 1w this way, it stays at 7w.
  • Resuming and putting the laptop back to sleep does fix the problem and allows it to go back to 1w.
  • When resumed, the laptop actually consumes a little bit LESS power than while in this “weird” state (~6w idle, down to ~4w when the screen is off).
  • The power button LED is still pulsing white as usual when I put the laptop to sleep, no matter if it’s consuming 1w or 7w.
  • Tried with and without the USB-A cards: I can still reproduce without, though it seems harder, maybe it’s less likely or takes more time because of the lower baseline consumption.

Hopefully I’ll be able to find a reproducible way, if that’s the case I’ll record a video.

In order to diagnose what part of the laptop is draining the bogus extra 6w, I even went deeper down the rabbit hole and shot some thermal images of the motherboard using my cheap FLIR thermal camera. I normalized the thermal scale on most of the images so it’s easier to compare:

  1. Framework sleeping normally (s2idle, with only USB-C) at ~0.8w
  2. Framework sleeping with power issue at ~6.2w baseline

  3. Framework running idle at ~5w

It would be great if somebody from the team could acknowledge this problem and provide some leads as to what could cause it. What can we do to help you fix it?

7 Likes

I want to start this by saying I’m sorry that you’re having this issue, and I think it’s fair that you’re unhappy with the battery drain issues that you’re having.

I don’t think you mentioned in this post or any of your posts in the other thread (if I missed it I’m sorry) but I’m going to assume that you have an 11th gen laptop as the newest beta BIOS I’ve seen for 12th gen was 3.06 (I have an 11th gen, so I may be wrong about this too). 3.08 was never officially released, and 3.09 was the official release with those battery fixes. 3.10 is the newest released 11th gen BIOS, have you tried that?

Your thermal camera photos seem very convincing, and I hope that someone on the Framework team sees this post, but I am also curious whether you’ve contacted support at all about your issues. The forum is meant to be a user forum, and it’s possible that the Framework team hasn’t seen the thread (though that could be unlikely, as it’s a very long thread and is from a long time ago), but bringing these concerns directly to support is the best way to make absolutely certain the company is aware of the issue.

No matter what, I hope that this issue can be resolved in some way that is good for all the users impacted.

1 Like

Yes indeed, I forgot to mention it’s the 11th gen with the i7-1165G7. I also have a 2TB Samsung 980 Pro SSD, and 2 × Crucial RAM 16GB DDR4 3200MHz.

About BIOS 3.10, I only just found out about it, so no, I haven’t tried it yet. I surely will, but I haven’t seen any new fixes related to this problem so I have little hope.

I haven’t contacted support yet but I will. I wanted to put this here first because I know there are other people impacted, but problems were getting mixed up (some of it because of me) in the previous thread, so I thought one dedicated public thread would be better to gather all the data we can on this issue.

2 Likes

Hi,

I was until recently having a similar experience running Fedora 37 on 11th gen.

I also have a 980 Pro, but the 1TB one, and after seeing your pictures I noticed my memory controller was also heating up when this happens.

I updated the firmware (3B2QGXA7 to 5B2QGXA7) and the issue seems fixed for me.

Edit: It happened again so not fixed.

There is a reported firmware bug specifically for the 2TB 980 Pro on 3B2QGXA7, as a separate issue, so if you haven’t already updated, doing so would seem to be a good idea.

@Usernames thanks for this suggestion which could have been helpful indeed as I did not check recently for any Samsung firmware updates. Though as you said it happened again and in my case I was already on the latest firmware (5B2QGXA7). The last time I updated was early last year when I was setting up my Framework so it has been running this version all this time.

Which bios version are you running? I’ve just updated from 3.08 to 3.10 as @Azure suggested, without much hope ^^. I’m currently running some tests again to see if it changes anything.

1 Like

Thanks for the tip, but I’m already on 3.10 unfortunately.

I’m not sure, but the HDMI expansion card seems to potentially be a trigger, whether previously connected in the session or currently connected.

I am seeing a potentially separate issue where the HDMI card fails to get enumerated, which itself seems to cause high powered-on draw until resolved, but perhaps they are related issues.

Another thing I may have noticed (need to check next time) is that the high drain didn’t start immediately but after about an hour of sleep.

Have you noticed anything like this?

Does this correlate with what you observe?

Does the problem still happen with only type C cards?

Ok thanks, and on my end I reproduced the problem with 3.10 also.

In my case I don’t have the HDMI adapter and I reproduced with only USB-C cards.

Yes, as I said in my first post, I can’t trigger the problem if I just unplug/plug rapidly, I need to wait for it to start. I always need multiple tries and a wait of between 15m and 12h. Sometimes it did trigger in less than an hour, but most of the time I believe it’s in the couple of hours range.

1 Like

Quick update on a new test I have been running the past week: I’ve tried reproducing the problem using an Ubuntu Live USB (22.04.1 LTS running 5.15.0-43). I used the “toram” kernel option to boot the OS entirely in RAM and be able to remove the USB key, so that the USB key does not add to the power consumption (I kept the USB-A card though). In this configuration I noticed a sleep consumption of ~1.8w (so more than with my installed OS at ~1.0w, but that’s expected as there’s no tuning whatsoever here). But the “good” news is that after a couple of days and dozens of sleep and unplugging cycles, I could not reproduce the issue.

So as a next step I’ll try removing some of the recommended tuning to see if it is actually causing this problem. Maybe the nvme.noacpi=1 option? I removed this one first as it’s the most likely difference, and as expected without it I get ~1.8w consumption asleep. I’ll confirm whether I can reproduce or not without this option.
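When comparing the installed OS with the Live USB, it helps to know exactly which kernel options each boot is running with. Below is a minimal Python sketch that splits a kernel command line (the content of /proc/cmdline) into options and checks for nvme.noacpi (the full name of the option, as given later in this thread). The sample command line is hypothetical, and the simple whitespace split does not handle quoted values.

```python
from typing import Dict, Optional


def cmdline_params(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {option: value}; flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Hypothetical command line; on a real system read it with:
#   open("/proc/cmdline").read()
sample = "BOOT_IMAGE=/vmlinuz root=/dev/mapper/vg-root ro quiet splash nvme.noacpi=1"

params = cmdline_params(sample)
print("nvme.noacpi set:", params.get("nvme.noacpi") == "1")
```

Running this on both setups makes it easy to diff the tuning options before blaming any single one.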

Appreciate the details. We have some great data from our community thanks to posts like this.

Because this is an active issue we’re working on behind the scenes, what I can tell you is as follows:

  • We’re working on this from the BIOS level, overall.
  • We recommend using Deep Suspend.
  • We recommend removing any non-used expansion cards during suspend as they will absolutely draw power at this time.
  • If you’re wanting the best experience possible, I recommend suspend to hibernate. This is what I use on my own Framework.
  • Using TLP, alongside other tools, we have community members getting a lot more (awake battery) performance from their power cycles. For suspend, don’t expect it to match what you’d get with Windows at this time. Suspend then hibernate is always going to be the best performer. An article on how to do this is coming soon.

TL;DR: We appreciate this feedback, this is NOT something on the back burner, it’s something we’re actively working on.

11 Likes

@Matt_Hartley thank you for acknowledging this.

For the record, in my latest tests with nvme.noacpi=1 removed, I could not reproduce the problem. I do get the expected higher consumption of course (~1.8w versus 1w), but at least this “random” problem is gone. Hopefully this information can help you pinpoint the origin.

I tried Deep Suspend some time ago but found it weird that it was taking 15s to resume (more time than it takes to boot the laptop from power off), maybe because I have 32G of RAM, and it was not saving much consumption (people already discussed this plenty in the other threads, some expansion cards seem to cause a consumption increase in deep sleep). But I’ll try it again to see if it has any impact on my problem here. Even if I don’t gain much battery life and the resume time is super slow, if at least I get a stable suspend I can rely on, it’ll be something.

And finally about hibernate: I tried setting it up at some point but gave up because I’m using an encrypted partition which can’t be used for that, so I would need to repartition my drive again to add another one just for hibernate (which is already pretty complex due to my Windows + Linux + shared partition setup), all for a resume time which is going to be at least as slow as deep sleep. So to me this is not worth the trouble and risk, as the battery life in s2idle and deep sleep is good enough for me (when it doesn’t go crazy).

Are there other things I could test to help you guys?

Thanks again for your work on this.

1 Like

Guide is coming. It’s on my to do list. Ideally soonish. I’m working on other projects that need to be done, but this is absolutely on my list to get out (esp. for encryption).

Once I have a guide done and I have had success with it, I’ll post it for the community to help dog food this. :)

1 Like

I wrote a guide on the matter too. It is still very difficult to set up depending on your distro (I’m not sure you would need to repartition though), but I’m leaving it here in case it helps.

https://blastrock.github.io/fde-tpm-sb.html

And thanks to the framework team for working on this battery issue!

4 Likes

@Daouadi_Philippe thanks for this link, but I still don’t want to spend time fiddling with such a complicated setup just to circumvent a bug for myself (meaning everybody else has to waste the same amount of time). I’d rather spend my time helping to get it fixed for everybody instead.

I’ve tested the deep sleep mode again during the last 12 days and I can confirm the bug is not present in this mode. I did not reproduce it once across dozens of sleep sessions. This workaround has the advantage of being very simple to set up (mem_sleep_default=deep in GRUB_CMDLINE_LINUX_DEFAULT), and the downside is that you get only a very minor battery life improvement and a 10s+ resume time. I’m going to use this for the time being to avoid the bug but will happily do more testing with s2idle mode when any potential fix is released.
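To double-check that the option took effect after a reboot, the kernel lists the available suspend modes in /sys/power/mem_sleep and puts the active one in square brackets, for example "s2idle [deep]". Here is a small Python sketch that extracts the active mode from that text; the function name is just for illustration, and the sample strings stand in for the real file content.

```python
import re


def active_mem_sleep(contents: str) -> str:
    """Return the bracketed (active) mode from /sys/power/mem_sleep content."""
    match = re.search(r"\[(\w+)\]", contents)
    if match is None:
        raise ValueError(f"no active mode found in {contents!r}")
    return match.group(1)


# Sample contents; on a real system read them with:
#   open("/sys/power/mem_sleep").read()
print(active_mem_sleep("s2idle [deep]\n"))  # deep sleep selected
print(active_mem_sleep("[s2idle] deep\n"))  # s2idle selected
```

If the result is still s2idle after adding the option, the GRUB configuration was most likely not regenerated (update-grub on Ubuntu) before rebooting.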

Hi @Matt_Hartley, has this guide been finished in the meantime?

1 Like

No, not at this time. We have a two person Linux cx support team and we’re pretty deep into other tasks for the foreseeable future.

3 Likes