After about 2 months of active testing, this is a compendium based on my results and the insights of others here. Thanks everybody!
Recommended Steps
a) run PPD as advised by FW and set it as needed
b) Install & run a 6.5 or newer kernel; alternatively, enable amd_pstate via a kernel command-line parameter
c) Install the fix from @Mario_Limonciello to reconnect PPD and amd-pstate (or courtesy of @efindus for the AUR). No longer needed as of PPD 0.20.
d) Turn off the FW13’s physical switches for the camera & mic unless in use
e) Enable autosuspend of HDMI & DP expansion cards if those are present. No longer needed, as this has been updated in udev (currently seeing v240.11-0ubuntu3.12 on Ubuntu 22.04.4)
f) As desired, set abm level to decrease display power draw
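To check step b), here is a minimal sketch that reports which cpufreq driver is in use. The SYSFS_ROOT variable is my own addition (hypothetical) so the functions can be tried against a fake tree; the scaling_driver file is the standard cpufreq sysfs attribute.

```shell
#!/bin/sh
# Root of sysfs. Overridable via SYSFS_ROOT (an assumption of this sketch,
# not a standard variable) so the functions can be pointed at a test tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the cpufreq scaling driver for cpu0, or "unknown" if unreadable.
scaling_driver() {
    f="$SYSFS_ROOT/devices/system/cpu/cpu0/cpufreq/scaling_driver"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

# Succeed if an amd-pstate variant (e.g. amd-pstate-epp) drives the CPU.
amd_pstate_active() {
    case "$(scaling_driver)" in
        amd-pstate*|amd_pstate*) return 0 ;;
        *) return 1 ;;
    esac
}
```

If amd_pstate_active fails on an older kernel, adding amd_pstate=active (or amd_pstate=passive) to the kernel command line is the usual route. The current PPD profile can be read with powerprofilesctl get.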
Optional
Fedora already defaults to PipeWire. For distributions that don’t, it may be worthwhile to migrate to it.
Likely Results
My observations were a reduction in avg draw on battery from 7W+ to < 5W for my std daily workloads. On my FW13 7640u with the 55Wh battery, this equates to an increase in runtime from ~8 hrs to over 11 hrs.
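The runtime figures follow from dividing battery capacity by average draw. A small sketch of that arithmetic (the function name is mine, and it assumes the full nominal capacity is usable, which a worn battery won't deliver):

```shell
#!/bin/sh
# Estimated runtime in hours = capacity (Wh) / average draw (W).
# Usage: runtime_hours <capacity_wh> <avg_draw_w>
runtime_hours() {
    LC_ALL=C awk -v wh="$1" -v w="$2" 'BEGIN { printf "%.1f\n", wh / w }'
}

# runtime_hours 55 7   prints 7.9
# runtime_hours 55 5   prints 11.0
```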
Caveats
As noted elsewhere, video playback has a higher draw than it should. My testing showed less draw with hardware acceleration. Others report higher draw with HW accel. As driver & firmware optimisations continue, we should all expect this to reach parity with other OSes.
The Phoenix/Zen 4 architecture still has upstream optimisations in process or trickling down to the latest kernel builds, which again leaves the opportunity for longer runtimes as these become part of distributions’ standard kernels.
The MediaTek mt7921 wifi driver also awaits optimisation as large downloads can draw impressively high power.
Not Needed
Initially, using powertop to calibrate and auto-tune seemed advisable. However, after testing the above settings with and without auto-tune, there seems to be little advantage.
Likewise, my initial testing showed TLP reduced draw more than PPD. However, with the combination of settings above, this difference dissipated. YMMV.
I have a question regarding bluetooth: when bluetooth is up but not connected, according to powertop it draws around 3 W and is awake 100% of the time. Did you investigate this part?
Yes, my default is to turn off bluetooth when not in use. I take powertop’s readings with a grain of salt. To understand what something is really using, I’ll typically enable it for a while (my test target has been 10% battery usage to smooth out fluctuations), then disable it for the same period, then calculate and compare avg power draw. I’ve found the delta to be more accurate than the W usage reported in powertop.
If your use case includes keeping bluetooth enabled, you may wish to set the IdleTimeout to lower energy consumption when not in use.
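The on/off comparison described above boils down to two averages and a difference. A rough sketch, with function names of my own choosing and assuming you note the battery percentage used and the elapsed time yourself:

```shell
#!/bin/sh
# Average draw in W over a test window.
# Usage: avg_draw_w <percent_used> <capacity_wh> <elapsed_seconds>
avg_draw_w() {
    LC_ALL=C awk -v p="$1" -v c="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", (p / 100 * c) / (s / 3600) }'
}

# Difference in W between the "enabled" and "disabled" runs.
# Usage: draw_delta_w <avg_w_enabled> <avg_w_disabled>
draw_delta_w() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a - b }'
}
```

For example, 10% of a 55Wh battery used in 60 minutes is avg_draw_w 10 55 3600, and the same 10% lasting 66 minutes is avg_draw_w 10 55 3960. The delta between the two is what the feature under test costs.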
Hardware acceleration does help with higher resolutions and I do recommend keeping it on, as software decoding only outperforms it below 1080p and even there not by much. The problem is just that the hardware decoder at this point uses way more power than it should (or does on Windows), so video playback still makes quite a big difference.
Something else to consider, although its benefits are not widely agreed upon, is disabling individual CPU cores if you don’t need them for your workflow.
I didn’t take proper measurements so far, but it really seems to have a positive impact on battery life at least on my system.
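This can be done by echoing to a file for each CPU (as root):
echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online
Where X is the number of the CPU (numbering starts at 0, but cpu0 usually cannot be taken offline, so in practice start at 1). Echo 1 to the same file to bring the CPU back online.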
As doing that for each of them is cumbersome I made a CLI tool that helps with scaling to a given number of cores, if anyone is interested.
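For anyone who just wants a loop, here is a minimal sketch (not the tool mentioned above). CPU_ROOT is a hypothetical override I added so it can be tried on a fake directory; on a real system it needs root.

```shell
#!/bin/sh
# Directory holding the cpuN entries. CPU_ROOT is an assumption of this
# sketch so the function can be pointed at a test tree.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Keep the first <count> logical CPUs online and take the rest offline.
# Usage: scale_cpus <count>
scale_cpus() {
    want="$1"
    case "$want" in
        ''|*[!0-9]*) echo "usage: scale_cpus <count>" >&2; return 2 ;;
    esac
    for dir in "$CPU_ROOT"/cpu[0-9]*; do
        n="${dir##*cpu}"
        # CPUs without a writable "online" file (typically cpu0) stay up.
        [ -w "$dir/online" ] || continue
        if [ "$n" -lt "$want" ]; then
            echo 1 > "$dir/online"
        else
            echo 0 > "$dir/online"
        fi
    done
}
```

Note these are logical CPUs, so with SMT two of them can share one physical core. topology/thread_siblings_list under each cpuN directory shows which ones belong together.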
It really depends on the workload if this will have a good battery impact for the system or not. The APU generally performs most efficiently with all cores active. This is because of common IP for all cores that need to be lit up for any cores to be active.
For example, if you have a multi-threaded task that nominally takes 30s when equally split across 4 cores and you take 2 cores offline, that same task will logically take ~60s.
Your outcome will be:
The task taking twice as long.
The common blocks being active longer.
The individual cores being inactive longer.
My hypothesis is that when running a multi-threaded workload you’ll not be any better off, but if you’re running single-threaded workloads across a bunch of cores you will be.
I think to really prove this out what you’ll want to do is come up with a few repeatable workloads that are representative that you can script and benchmark.
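As a starting point for that kind of scripted benchmark, a rough sketch of a timing harness. The function name is mine, it only has one-second resolution, and it measures wall-clock time only, so power draw still has to be recorded separately.

```shell
#!/bin/sh
# Run a command several times and print the wall-clock seconds of each run.
# Usage: run_timed <runs> <command> [args...]
run_timed() {
    runs="$1"
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start)) s"
        i=$((i + 1))
    done
}
```

Something like run_timed 5 ./my-workload.sh (a placeholder name) could then be repeated with different numbers of CPUs online and the results compared.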
Also, if you haven’t already applied it, you should consider applying the amd-pstate preferred core patch series. This will make sure that the scheduler is biased to the most efficient cores.
Have you tested PCI autosuspend (which is one of the things TLP and powertop’s auto-tune do by default), to see how much it saves by leaving that on?
This will depend on what distro you are using. Some distros will have it installed as standard; with others you will be able to install it via that distro’s package manager.
The best guidance is to search for how to install PPD on your relevant distro. Also, if you are using PPD, check that you have uninstalled TLP and/or any other power management tools.
I’m curious: do people set their ppd mode to power-saver on battery or leave it on balanced? I just noticed that it doesn’t switch by default (at least, not for me since I’m not using a DE).
I have a script that switches to power saver when I disconnect from the power supply, plus some wrappers around my build scripts that set performance on build, then drop back to power saver if on battery. I also leave libvirtd and associated bridges disabled by default, enabling them when needed, mostly during functional testing of multiple VMs, then disabling them (though I think it doesn’t help once they’ve been activated?). All of this is probably, more-likely, most definitely overkill, but it’s what I do.
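For those without a DE, a switching script along those lines might look roughly like this. It is a sketch, not the script described above: PSU_ROOT is a hypothetical override, and it assumes the charger shows up as a power supply of type Mains, which may not hold for every USB-C setup.

```shell
#!/bin/sh
# Where the kernel exposes power supplies. PSU_ROOT is an assumption of
# this sketch so the functions can be tried against a fake tree.
PSU_ROOT="${PSU_ROOT:-/sys/class/power_supply}"

# Succeed if any supply of type "Mains" reports online = 1.
on_ac_power() {
    for dir in "$PSU_ROOT"/*; do
        [ -r "$dir/type" ] || continue
        if [ "$(cat "$dir/type")" = "Mains" ] &&
           [ "$(cat "$dir/online" 2>/dev/null)" = "1" ]; then
            return 0
        fi
    done
    return 1
}

# Print the PPD profile to use: balanced on AC, power-saver on battery.
pick_profile() {
    if on_ac_power; then
        echo "balanced"
    else
        echo "power-saver"
    fi
}

# Apply the chosen profile through power-profiles-daemon.
apply_profile() {
    powerprofilesctl set "$(pick_profile)"
}
```

Calling apply_profile whenever the power source changes (for example from a udev rule or a timer) would give the automatic switch.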