[TRACKING] Linux battery life tuning

Great summary! It may be good to add a warning that after resuming from s2idle suspend, power consumption is much higher because only power states down to C3 are used (see Issues with power usage - #5 by Brad_J). This looks like a straight-up bug (not sure whether it is in the kernel or the BIOS). After a deep suspend, things are back to normal. But resuming from deep suspend takes rather long (also a bug?).

Maybe useful for others: my measurements with powertop (i7, 32GB, BIOS 3.07; Debian/KDE, X11, pipewire, konsole only, radios off (flight mode), HDMI extension card removed):

  • screen at 5%: 2.0 W, CPU 95% in C10
  • screen at 50%: 3.1 W
  • screen at 100%: 4.3 W
  • screen at 5%, insert HDMI: 2.9 W
    • after autosuspend HDMI: 2.4 W
  • screen at 5%, HDMI removed, radios on: 2.2 W (connected or not makes little difference)
  • after s2idle suspend, radios off: 4.4 W; EDIT: this is solved with nvme.noacpi=1 - see Linux battery life tuning - #156 by technical27
  • after deep suspend: 2.0 W

Overall, idle power of ~2W (presumably, window manager turns screen off; ~2.4W at 100%) seems very reasonable! But concrete things/bug fixes that would make it generally happen are:

  • Ensure regular state after s2idle resume (and, it sounds like, make s2idle better!); EDIT: regular state can be ensured - see Linux battery life tuning - #156 by technical27
  • Ensure HDMI dongle uses less power (beyond what is gained from suspending it)
  • Somehow make resume from deep suspend faster?
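For anyone wanting to check which of these cases applies to them, here is a minimal sketch that reports the active suspend mode and whether the nvme.noacpi=1 workaround is on the kernel command line. It assumes the usual /sys/power/mem_sleep and /proc/cmdline files; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the entry marked with [brackets] in a sysfs "choice" string,
# e.g. "s2idle [deep]" -> "deep".
active_choice() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Succeed if the given kernel command line contains nvme.noacpi=1.
has_nvme_noacpi() {
    case " $1 " in
        *" nvme.noacpi=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the current state, skipping files that are not readable here.
if [ -r /sys/power/mem_sleep ]; then
    echo "suspend mode: $(active_choice "$(cat /sys/power/mem_sleep)")"
fi
if [ -r /proc/cmdline ]; then
    if has_nvme_noacpi "$(cat /proc/cmdline)"; then
        echo "nvme.noacpi=1 is set"
    else
        echo "nvme.noacpi=1 is not set"
    fi
fi
```

If the mode is s2idle and the parameter is missing, that matches the high-power-after-resume case measured above.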

p.s. For reference, my old X1 Yoga 2nd gen also has an idle power consumption of ~2W.

Just reproduced this and I was a bit baffled. I haven’t previously come across a description of this bug and never noticed it myself. It adds ~2.5W of idle power burn on mine. Has anyone done research on whether this is a hardware/firmware or kernel bug?

@lightrush, this post from further up in this thread may help: Linux battery life tuning - #156 by technical27

@lbkNhubert confirmed, it does resolve the problem. I’ll add that to the Ubuntu post-install formula even though most people using it probably have deep sleep as that’s the default. I actually use s2idle with suspend-then-hibernate so I’m affected. Thanks for the sleuth work @technical27 !

Edit: Aaand done. Whoever’s using this will automagically get the workaround applied.

Confirmed here as well: nvme.noacpi=1 reduces s2idle power draw. We’re checking why that is.

@TJ1 - the information from Linux battery life tuning - #156 by technical27 would seem worth adding to your suspend section on top! Thanks so much, @technical27!

Hi all,

I noticed a rather high power drain by the USB-A extension card.

My system: i7-1165G7, 64GB RAM at 3200 MHz, Samsung 980 PRO 1TB. Arch Linux (kernel 5.17.3).

My settings: TLP default settings, no extension cards plugged, radios off, backlight 1%

With this I get ~2 W idle power usage (according to powertop). However, as soon as I plug in a USB-A extension card, the power consumption increases by ~300-400 mW for each USB-A card (tried with two cards). USB-C extension cards did not affect the power consumption.

Did anyone else experience this?

Regards,

Robert

@robert_b - I thought USB-A was better than HDMI, but I just checked and confirmed that the USB-A dongle takes about 300 mW. Without my HDMI and USB-A cards, with radios off and the display at 5%, I got about 1.7 W idle (after waiting quite a while).

It’s been mentioned quite a few times in this thread, and elsewhere in this forum.

Here’s the best explanation I’ve seen as to why: Switchable USB-A and HDMI adapters - #9 by Luke_Mahowald

BIOS 3.08 is supposedly going to address battery drain during shutdown and possibly hibernate and standby. I wonder if they can address the drain of keeping usb-a’s ready to accept a device using the same methods.

The good news about the Framework is that you have the option of removing the cards you don’t need to eke out a bit more autonomy, unlike other laptops…

Does the power draw scale linearly with number of Type-A adapters? I don’t remember if I tried that experiment and I have been operating off the assumption that the power draw is from the controllers.
Native Type-A ports are usually connected to the PCH XHCI and won’t have these power draw issues which are TCSS related.

If I’m interpreting my results correctly, it scales linearly with the number of sides of the machine the USB-A’s are in. I get the same change to current_now with one or two USB-As on the same side, but double the change if I put one on each side.

There’s a lot of noise in the data, though, so don’t take my theory as gospel…
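For anyone wanting to repeat this kind of measurement, here is a rough sketch that turns the battery's current_now and voltage_now readings into watts. It assumes the battery shows up as /sys/class/power_supply/BAT* and reports microamps and microvolts, which is the usual sysfs convention; battery_watts is a made-up helper name.

```shell
#!/bin/sh
# Convert current (microamps) and voltage (microvolts) into watts.
# W = (uA / 1e6) * (uV / 1e6) = uA * uV / 1e12
battery_watts() {
    awk -v ua="$1" -v uv="$2" 'BEGIN { printf "%.2f\n", ua * uv / 1e12 }'
}

# Print the instantaneous draw for every battery that exposes both values.
for bat in /sys/class/power_supply/BAT*; do
    if [ -r "$bat/current_now" ] && [ -r "$bat/voltage_now" ]; then
        echo "$bat: $(battery_watts "$(cat "$bat/current_now")" "$(cat "$bat/voltage_now")") W"
    fi
done
```

Given the noise mentioned above, a single reading says little; running this repeatedly on battery and comparing averages with and without a card is probably the fairer test.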

Hi all.

I’ve been using this add-on in LibreWolf/Firefox for a week now and have the impression that CPU usage is lower. It’s not Framework or Linux specific.

As I understand it, it mutes unused tabs, as Edge and Safari do, to improve battery life.

Has anyone else seen this effect? I have not tested or verified it with powertop or similar.

Simon

Just ran my first serious test of all my Framework/Manjaro battery tuning: getting work done while flying from Boston to Belize, with one stopover and a total of 8 hours of flight time.

Conditions:

  • Laptop started at its 90% charge limit.
  • No expansion card installed except one USB-C.
  • Wi-Fi and Bluetooth turned off.
  • Powertop runs auto-tune automatically; GNOME power control set to “Power Saver”.
  • Display most of the way down (which is fine in an airline cabin)

The system showed about 4.7-5W consumption most of the time.

I was basically reviewing a 200 page PDF document, so I wasn’t working the system very hard. I hibernated the system between flights.

After 4 hours of work time (1.5 hrs first flight, 2.5 hrs second flight), the laptop showed 68% charge.

Granted, I would expect a much shorter life if I was working the system harder and running wi-fi, but for offline work in ideal conditions, it looks like 8-10 hours is attainable. Surprised me a bit.


UPDATE
After a major network munge, I reinstalled and updated my Manjaro/GNOME installation. With minimal manual tuning other than “powertop --auto-tune” (no longer using TLP), I’m now seeing even better battery life. With the power manager set to Power Saver and the backlight at minimal, I’m running 3.5-4.5W on Firefox with 2 tabs, cloud services, and two separate mail clients. Balanced shows 4.5-5.5W. Kernel is 5.15.32-1.

Definitely shouldn’t have trouble making it through my flight to the Grenada Chocolate Festival next week ;- )

I’ve loved watching the information grow in these and other threads to the point where we can get the Framework to sip battery at idle. One thing I’ve been wondering for some time is how we can take advantage of Intel’s configurable TDP to limit battery usage under load. I found this thread that walked us through systemd units for setting limits on wattage through the powercap infrastructure.

Has anyone else played with this to see how it might affect battery usage/performance under load? I’m still saving up for my personal Framework laptop; since I’d like to keep my marriage intact, I can’t really put my wife’s Framework through the wringer until maybe this weekend. :sweat_smile:

While not that, this might be a poor person’s substitute for a TDP limit - a straight up perf limit. :smiley: You’d lose the ability to go fast for brief periods of time but it should be similar for prolonged operation.

Thanks for sharing this! I spent just a little bit of time today playing with some system settings. I’m sad to say that I don’t have a complete answer to my original question of how well the powercap framework works, since I was a little perplexed by what turned out to be the tlp defaults. Those of you familiar with intel_pstate, CPUFreq, and powercap will already know everything that I will report here, but novice Linux enthusiasts with a mere 10 years of usage like me may learn something :sweat_smile:

TLDR: I activated the defaults in tlp without realizing that these defaults create strict frequency limits for each threaded process. The good news is, I seemed to get powercap working on my machine. Because of some technicalities, I’m unable to provide a meaningful report on the improvements it provides upon standard tlp.

My stress test and some reading

The Phenomenon

I started out by installing stress-ng and trying to get baseline power statistics with powertop under both a single-core and a multicore stress test. To my surprise, I found that the usage did not go above ~12 W under the 6-thread test, and not above ~10 W on the single-core test. I found this baffling since I did not set a percentage limit on max performance, and I’m using the i5-1135G7, which has a TDP of 28 W.

I dug into my settings and confirmed that I did not set the max performance limiting. Yet, when I ran tlp-stat, I was informed that a 50% performance limit was being set.

# reported from tlp-stat
/sys/devices/system/cpu/intel_pstate/min_perf_pct      =   9 [%]
/sys/devices/system/cpu/intel_pstate/max_perf_pct      =  50 [%]
/sys/devices/system/cpu/intel_pstate/no_turbo          =   0
/sys/devices/system/cpu/intel_pstate/turbo_pct         =  47 [%]
/sys/devices/system/cpu/intel_pstate/num_pstates       =  39

# i.e. the second line is equivalently this config:
CPU_MAX_PERF_ON_BAT=50

This seemed to explain why the frequency chart on powertop was not reporting above 1.8GHz for any core throughout either a single or multicore stress test.
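A quick way to cross-check what powertop shows is to read the same values from sysfs. This is only a sketch: it assumes the intel_pstate driver is active and that each core exposes cpufreq/scaling_cur_freq in kHz; max_cur_mhz is a made-up helper name.

```shell
#!/bin/sh
# Print the highest scaling_cur_freq found under a cpufreq-style tree, in MHz.
# The optional argument overrides the sysfs root (handy for testing).
max_cur_mhz() {
    root="${1:-/sys/devices/system/cpu}"
    cat "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq 2>/dev/null |
        awk 'BEGIN { max = 0 } $1 > max { max = $1 } END { printf "%d\n", max / 1000 }'
}

# Show the configured cap (if the intel_pstate file exists) and the fastest core.
pstate=/sys/devices/system/cpu/intel_pstate/max_perf_pct
if [ -r "$pstate" ]; then
    echo "max_perf_pct: $(cat "$pstate") %"
fi
echo "fastest core right now: $(max_cur_mhz) MHz"
```

Running this in a loop (for example with watch) during a stress test should show whether any core ever exceeds the cap.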

Intel P-states

At this point, I wanted to know a bit more about what this intel_pstate thing is doing and how it differs from powercap. Here’s what I found:

…the CPUFreq core uses frequencies for identifying operating performance points of CPUs and frequencies are involved in the user space interface exposed by it, so intel_pstate maps its internal representation of P-states to frequencies too…

So pstate is a fancy version of capping CPU frequency—got it! Further:

Since the hardware P-state selection interface used by intel_pstate is available at the logical CPU level, the driver always works with individual CPUs. Consequently, if intel_pstate is in use, every CPUFreq policy object corresponds to one logical CPU and CPUFreq policies are effectively equivalent to CPUs.

Integrating this quote with what we saw previously, here is what I can surmise.

Tlp by default implements pstate-based CPU frequency limiting. Since this limiting is applied to logical CPUs, this ensures that every thread running on the Framework is running at a capped frequency. Consequently, features like turbo boost and even short term boosts in processing are disabled under battery.

Of course, this does provide power advantages, but at the same time it would be nice if I could let one or two threads go above their pstate limit if it still meant having a relatively low power consumption. This means pesky user applications which use way too many CPU cycles (cough Zoom cough) could continue their decadent reign without impacting the user experience.

Researching Powercap

I found this page on the Linux kernel powercap implementation:

[In an example given on the page, t]here is one control type called intel-rapl which contains two power zones, intel-rapl:0 and intel-rapl:1, representing CPU packages. Each of these power zones contains two subzones, intel-rapl:j:0 and intel-rapl:j:1 (j = 0, 1), representing the “core” and the “uncore” parts of the given CPU package, respectively. All of the zones and subzones contain energy monitoring attributes (energy_uj, max_energy_range_uj) and constraint attributes (constraint_*) allowing controls to be applied (the constraints in the ‘package’ power zones apply to the whole CPU packages and the subzone constraints only apply to the respective parts of the given package individually).

From this quote, we may infer that constraints in the powercap framework are applied to the entirety of the CPU. Further, these constraints are a little more dynamic, since they apply to running averages over time rather than instantaneous limits like pstates:

Depending on different power zones, the Intel RAPL technology allows one or multiple constraints like short term, long term and peak power, with different time windows to be applied to each power zone.

So it seems like the application of powercap would allow me to set a similar limit on power usage, e.g. the 12w that I noticed on multicore workloads, without limiting performance on single-core workloads. E.g. no stuttering when launching a big app, container or VM, which might scale up and down quickly without triggering a powercap constraint. In terms of my benchmarks, I would expect this to allow the single-core stress test to go beyond its 1.8GHz limit under tlp while still limiting the multicore stress test.

What I tried

Without messing with the power-prioritizing governor, I tried simply removing the limit on pstate by adding:

CPU_MAX_PERF_ON_BAT=100

Then, I set a TDP target with the powercap-set command:

sudo systemctl stop thermald
sudo powercap-set -p intel-rapl --zone=0 --constraint=0 -l 14000000 -s 10000000
# I had to add this to enable the "core" components of the CPU package
sudo powercap-set -p intel-rapl --zone=0:0 -e 1

I’m essentially using what was reported in the other thread on powercap, linked in my original post. I liked the idea of pinning to a 14 W target as the 10-second running average, since this was close to what I got under the default tlp setting of limiting the pstate, and @jbch settled on it after their own testing. It seems like powercap goes all the way down to 12 W based on the Intel specs for the i5-1135G7, although I haven’t tried putting this even lower. I verified that the system accepted these settings with sudo powercap-info -p intel-rapl.
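The same settings can also be inspected without powercap-utils by reading the powercap sysfs files directly. The sketch below assumes the package zone is /sys/class/powercap/intel-rapl:0 and that constraint 0 is the long-term limit; reading energy_uj usually needs root on recent kernels. The helper names are made up, and counter wraparound is ignored.

```shell
#!/bin/sh
# Convert microwatts to watts.
uw_to_w() {
    awk -v uw="$1" 'BEGIN { printf "%.1f\n", uw / 1e6 }'
}

# Average power in watts from two energy_uj samples taken $3 seconds apart.
# Note: this ignores wraparound of the energy counter (max_energy_range_uj).
rapl_avg_watts() {
    awk -v e0="$1" -v e1="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (e1 - e0) / 1e6 / s }'
}

zone=/sys/class/powercap/intel-rapl:0   # assumed package zone
if [ -r "$zone/constraint_0_power_limit_uw" ]; then
    echo "constraint 0 limit: $(uw_to_w "$(cat "$zone/constraint_0_power_limit_uw")") W"
fi
if [ -r "$zone/energy_uj" ]; then
    e0=$(cat "$zone/energy_uj"); sleep 1; e1=$(cat "$zone/energy_uj")
    echo "package power: $(rapl_avg_watts "$e0" "$e1" 1) W"
fi
```

With the limit above applied, the first line should report 14.0 W, and the measured package power under load should settle near that value.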

I also played a bit with the energy policy while using the powercap setting:

CPU_ENERGY_PERF_POLICY_ON_BAT=balance_performance

Interestingly, it did seem to have an impact on the metrics for the benchmark, even though the power draw was still limited by powercap. I.e., it seems like powercap might be “smarter” than this energy policy. More tests might be needed to confirm this, though.

What I haven’t done

I haven’t created the systemd scripts to make this setting stick, nor have I verified my conjectures about single-threaded workloads. I’d have to figure out how to get the stress test to “pin” to a specified core, and that’s just a bridge too far for one evening. For the moment, I’m going to give the laptop back to my wife without messing up her battery life, which is already completely amazing to her as it is!
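For the pinning part, one option that should work is taskset from util-linux, which restricts a command to a given CPU. A minimal sketch, assuming taskset and stress-ng are both installed; pinned_stress and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Run a one-worker stress-ng CPU test pinned to a single logical CPU.
# Usage: pinned_stress <cpu-index> <seconds>
# With DRY_RUN=1 the command is printed instead of executed.
pinned_stress() {
    core="$1"; secs="$2"
    set -- taskset -c "$core" stress-ng --cpu 1 --timeout "${secs}s"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `pinned_stress 0 30` would load logical CPU 0 for 30 seconds, which should be enough to watch its frequency in powertop while the powercap limit is active.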

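You might have noticed turbo_pct = 47% in the tlp-stat output. As far as I can tell from the kernel documentation, this is a read-only value describing what share of the P-state range lies in the turbo range, not a limit set by tlp, so there may be nothing to uncap there. It would still be interesting to see whether any other settings could be combined with powercap to gain more performance on battery.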

Conclusion

I hope this helps someone out there who is tinkering with their system, or who is trying to figure out how to use Intel’s “TDP-down” configuration in Linux. Let me know if my understanding is wrong, since I may have missed something in my sleuthing! I welcome any thoughts about how to measure CPU frequency under stress testing.

I thought that ~5 hours was OK, but now I realize that I should expect something like 10 hours.

I did all that was posted here and in other posts on this forum and I cannot get below C3 on idle. The lowest power I managed was ~9W.

I installed powertop and applied its optimizations, and installed tlp with PCIE_ASPM_ON_BAT=powersupersave.
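One thing worth checking in a situation like this is whether the ASPM policy actually took effect. A small sketch, assuming the kernel exposes /sys/module/pcie_aspm/parameters/policy with the active choice in brackets; active_aspm_policy is a made-up helper name.

```shell
#!/bin/sh
# Extract the bracketed (active) entry from a policy string such as
# "default performance powersave [powersupersave]".
active_aspm_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    echo "active ASPM policy: $(active_aspm_policy "$(cat "$policy_file")")"
fi
```

If this does not print powersupersave while on battery, the tlp setting is probably not being applied.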

OS: KDE neon User - 5.24 x86_64
Kernel: 5.13.0-40-generic

Blacklisting unused SSD expansion cards to save ~1.4W and hit C10

After following all the guides around here, my power consumption was around 5.1 W at idle and I only hit C9, never C10. I found out that my storage expansion card draws around 1.4 W and prevented the system from reaching C10. I only use my SSD card to boot Windows, so I decided to blacklist it in my Arch install. Here is what I did.

This will make your storage card unavailable in your current Linux install until you undo this.

1. Find the vendor and product IDs for the card.

~ ❯❯❯ lsusb
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 27c6:609c Shenzhen Goodix Technology Co.,Ltd. Goodix USB2.0 MISC
Bus 003 Device 003: ID 8087:0032 Intel Corp. AX210 Bluetooth
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 002: ID 13fe:6500 Kingston Technology Company Inc. USB DISK 3.2
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

We are looking for the ID “13fe:6500”. (Maybe it’s the same for all cards, but please double check.)

2. Create a udev rule to blacklist the device

Change idVendor to the first part of “13fe:6500” and idProduct to the second part.

echo 'SUBSYSTEM=="usb", ATTRS{idVendor}=="13fe", ATTRS{idProduct}=="6500", ATTR{authorized}="0"' | sudo tee /etc/udev/rules.d/01-expansion-ssd.rules
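If your card reports a different ID, the rule line above can be generated rather than edited by hand. This is just a sketch with made-up helper names; it assumes the standard lsusb line layout, where the ID is the sixth field.

```shell
#!/bin/sh
# Read lsusb output on stdin and print the vendor:product ID of the
# first line containing the given text.
usb_id_from_lsusb() {
    awk -v pat="$1" 'index($0, pat) { print $6; exit }'
}

# Print a udev rule that de-authorizes the device with the given
# vendor:product ID (for example 13fe:6500).
udev_block_rule() {
    vendor="${1%%:*}"
    product="${1##*:}"
    printf 'SUBSYSTEM=="usb", ATTRS{idVendor}=="%s", ATTRS{idProduct}=="%s", ATTR{authorized}="0"\n' \
        "$vendor" "$product"
}
```

Usage would look like `udev_block_rule "$(lsusb | usb_id_from_lsusb 'USB DISK')" | sudo tee /etc/udev/rules.d/01-expansion-ssd.rules`, after double-checking that the matched ID really is the storage card.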

3. Reboot

4. Verifying

Run sudo fdisk -l. The drive should not be listed anymore.

Check powertop. You should now see power usage drop by around 1.4 W, and C10 should be reached.
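Another way to verify, as a sketch: look up the device in sysfs and read its authorized flag, which should be 0 once the rule has been applied. This assumes the device still appears under /sys/bus/usb/devices with idVendor and idProduct files; usb_authorized is a made-up helper name.

```shell
#!/bin/sh
# Print the "authorized" flag of the first USB device matching the given
# vendor and product IDs. The optional third argument overrides the sysfs
# root. Returns 1 if no matching device is found.
usb_authorized() {
    root="${3:-/sys/bus/usb/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/idVendor" ] && [ -r "$dev/idProduct" ] || continue
        if [ "$(cat "$dev/idVendor")" = "$1" ] && [ "$(cat "$dev/idProduct")" = "$2" ]; then
            cat "$dev/authorized"
            return 0
        fi
    done
    return 1
}
```

For the card in this post, `usb_authorized 13fe 6500` should print 0 after the reboot.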

Undo

Just remove the udev file and reboot.

sudo rm -f /etc/udev/rules.d/01-expansion-ssd.rules

Thank you for this! I usually keep my expansion cards removed, but it would sometimes be nice to have them installed (e.g. when I’m traveling and just might need USB-A). I worry about losing them if I keep them stored outside the laptop.