Did anybody create a diff of the TLP configuration?
I think I have part of the reason why this is happening. I looked at this on Fedora 37, and it seems the GPIO interrupt is pinned to one core and the I2C interrupts are pinned to a separate core. This means that every time the touchpad fires an interrupt, both a high-performance and an efficiency core cluster have to wake up to service the touchpad! These are probably not even in the same cluster, so lots of cache evictions etc. might be happening as both core clusters power on and off and their caches are cleared and refilled.
What core type are TP interrupts handled by?
The second idea was to investigate what core the interrupts are handled by:
First you can look at the Cores on the system to map them to E or P cores. The E cores will have lower frequencies when looked at using:
lscpu --all --extended
The second thing was to find out what cores the interrupts were handled by:
cat /proc/interrupts
watch -n 1 "cat /proc/interrupts | grep designware"
watch -n 1 "cat /proc/interrupts | grep PIXA"
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19
43: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 43694516 0 0 0 0 IR-IO-APIC 43-fasteoi i2c_designware.2, idma64.2
135: 0 0 1711105 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 intel-gpio 3 PIXA3854:00
Then move the touchpad: for each of the two interrupts you will see which core's counter is increasing.
However the PIXA GPIO interrupt is pinned to a high performance core, but the i2c-designware interrupt is pinned to an efficiency core.
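To read the busiest CPU off without eyeballing twenty columns, a small awk filter can do the counting. This is only a sketch: the function name busiest_cpu is made up for illustration, and it assumes the /proc/interrupts layout shown above (a header row with one name per CPU, then an IRQ label followed by one counter per CPU).

```shell
# busiest_cpu PATTERN: read /proc/interrupts-style text on stdin and print,
# for each line matching PATTERN, the IRQ label and the CPU with the highest count.
# (Hypothetical helper name; not part of any existing tool.)
busiest_cpu() {
    awk -v pat="$1" '
        NR == 1 { ncpu = NF; next }          # header row: one field per CPU
        $0 ~ pat {
            best = 2
            for (i = 3; i <= ncpu + 1; i++)  # fields 2..ncpu+1 hold the counters
                if ($i + 0 > $best + 0) best = i
            irq = $1; sub(":", "", irq)
            printf "IRQ %s -> CPU%d (%d interrupts)\n", irq, best - 2, $best
        }'
}

# Usage on a live system:
#   busiest_cpu 'designware|PIXA' < /proc/interrupts
```

Run against the sample output above, this should report CPU15 for IRQ 43 and CPU2 for IRQ 135.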
Let's pin them both to the same core, CPU2:
# echo 00004 > /proc/irq/43/smp_affinity
# echo 00004 > /proc/irq/135/smp_affinity
# The write for IRQ 135 (PIXA) DOESN'T WORK, I/O error?
Power consumption when using the touchpad seems to drop from about 7W to 5W! (Baseline is 4W)
I would like to pin the PIXA intel-gpio interrupt to an efficiency core, but I have not figured out how to do this.
Wow, that's an awesome find!
I can replicate this on EndeavourOS (Arch), Zorin (Ubuntu), and a Fedora live USB environment. My dstat readouts closely resemble those posted by others above. EndeavourOS is somewhat worse, and Fedora is somewhat less terrible. On Endeavour, a USB mouse throws about half as many interrupts as the trackpad; I didn't think to test this on the other OSes.
I seem to be seeing the same thing - the PIXA interrupt is going HAM on my first (P) core.
Which identifier does one pass to SMP affinity from the following? I understand that the E cores are the second block, but what is the identifier? It's currently set to 00c0 for both, and the smp_affinity_list is set to 6-7.
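On the mask question: smp_affinity is a hexadecimal bitmask in which bit N stands for CPU N, while smp_affinity_list shows the same set as a CPU list. So 00c0 (binary 1100 0000) is CPUs 6 and 7, the same thing as 6-7, and the 00004 used earlier selects CPU2. A rough sketch of the conversion in shell, assuming fewer than 32 CPUs so that the mask is a single hex word without commas; the function names are made up for illustration:

```shell
# cpu_to_mask N: hex smp_affinity mask that selects only CPU N.
cpu_to_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# mask_to_cpus HEX: list the CPU numbers whose bits are set in the mask.
mask_to_cpus() {
    mask=$(( 0x$1 )); cpu=0; out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then out="$out${out:+,}$cpu"; fi
        mask=$(( mask >> 1 )); cpu=$(( cpu + 1 ))
    done
    echo "$out"
}

cpu_to_mask 2       # 4
mask_to_cpus 00c0   # 6,7
```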
I tried to investigate this. I really don't know some of the things written in this thread. I tried to find a clue…
- Official document: SMP IRQ affinity - The Linux Kernel documentation
- [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control: It seems this patch is to add the sysfs interface (/sys/fs/ ?) to control the SMP affinity. I am not sure if the patch is merged or not.
- Found the 2 scripts.
For anyone testing this, here's a minimally nicer command:
watch -n 1 "cat /proc/interrupts | grep 'CPU\|designware\|PIXA'"
Here's a simple systemd service, touchpad-smp-affinity.service, that pins the affinity of the designware interrupt to CPU2:
[Unit]
Description=Pin touchpad interrupts to CPU2
Documentation=https://community.frame.work/t/tracking-touchpad-interrupts-battery-usage-issues-idma64-2/13630
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/sh -c 'echo 2-2 > /proc/irq/$(grep designware.2 /proc/interrupts | cut -d ":" -f1 | xargs)/smp_affinity_list'
ExecStop=/usr/bin/sh -c 'echo "0-$(nproc --all --ignore=1)" > /proc/irq/$(grep designware.2 /proc/interrupts | cut -d ":" -f1 | xargs)/smp_affinity_list'
[Install]
WantedBy=basic.target
Test this by running
watch -n 1 "grep 'CPU\|designware\|PIXA' /proc/interrupts"
and checking whether the numbers increase in the same column when using the touchpad.
I don't think this is the final solution. As I understand from the thread, the number of generated interrupts is still too high. But this workaround is apparently better than nothing.
The large number of interrupts is probably generated by the i2c controller on the SoC, as each physical interrupt requires the CPU to transfer multiple bytes. A more optimized Linux kernel driver may be able to fix this.
Is there anyone here who's willing to report this to the kernel folks?
Thanks everybody for the large amount of research that has gone into this.
From reading I understand that pinning interrupts to a certain core is a workaround that some people are testing with, but the process raises a few questions for me. I hope someone can shed some light here:
- Why only pin the designware.2 interrupts? On my 11th Gen I see designware.{0,1,2} interrupts listed in /proc/interrupts. What are the other 2 for?
- Why pin to CPU 2? When I execute lscpu --all --extended multiple times over a range of a few minutes, I see different cores with a high frequency, including CPU2. According to @Kieran_Levin the high frequency is how to recognize a P core, but if E vs P core is not constant, the pinning ends up being hit-and-miss, no? Is there a way to dedicate a CPU as E/P when on battery?
- Why not pin the PIXA interrupts too? Just to be explicit that both interrupts are actually handled by the same CPU.
For some reason, it is not possible to pin the PIXA interrupt, so the only thing we can try is to pin the designware interrupt and assign it to the same core as the PIXA.
True, you can't pin this interrupt at runtime, but using the kernel param irqaffinity one can tell the kernel which CPUs to use for hardware interrupts.
Using irqaffinity=0,1 I was able to move the PIXA interrupt from CPU 7 to CPU 1.
I haven't fully tested the impact on power consumption and/or performance (when charging). There's also the question of how reasonable it is to set irqaffinity to a single CPU core.
Pondering on how to resolve my questions above, the plan seems to look like this:
1. Pin hardware IRQs to 2 CPUs (or 1, this needs more research) using the kernel's irqaffinity
2. Pin designware to the same CPU as PIXA if more than 1 core was listed in irqaffinity, otherwise skip this step
3. Ensure the CPU used for the touchpad interrupts is an E core when on battery.
Step 3 is the one I was struggling with on my machine (11th Gen Framework), until I stumbled over
https://www.makeuseof.com/intel-cpus-explained-what-are-e-cores-and-p-cores/ :
E/P cores are a feature introduced with Intel's 12th Gen chips, which explains why the CPU frequency behaviour @Kieran_Levin described did not match my observations.
Also found this during research: according to "x86 64 - How to detect E-cores and P-cores in Linux alder lake system? - Stack Overflow", E cores can be identified (if SMT is enabled in BIOS) by looking at the CORE column for unique core numbers; the CPUs whose core number appears only once are the E cores in the system.
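A minimal sketch of that check, assuming a 12th Gen or later hybrid CPU with SMT enabled: it reads CPU,CORE pairs in the format printed by lscpu -p=CPU,CORE and lists the CPUs whose core number occurs only once. The function name is hypothetical, and on a chip with SMT disabled every CPU would be reported, so the result only means "E core" under the stated assumption.

```shell
# single_thread_cpus: read "cpu,core" lines (as printed by `lscpu -p=CPU,CORE`)
# on stdin and print the CPUs whose core id has no SMT sibling.
# (Hypothetical helper name, for illustration only.)
single_thread_cpus() {
    awk -F, '
        /^#/ { next }                       # skip the lscpu comment header
        { count[$2]++; cpus[NR] = $1; core[NR] = $2; n = NR }
        END {
            for (i = 1; i <= n; i++)
                if ((i in cpus) && count[core[i]] == 1) print cpus[i]
        }'
}

# Usage on a live system:
#   lscpu -p=CPU,CORE | single_thread_cpus
```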
@dfh there is another i2c bus connected to the EC for i2c-hid events that are for some keyboard hotkeys and the ALS.
The other one should be the DDR interface to get the module information.
Both of these are pretty low frequency updates so have diminishing returns, and DDR will not have any data transactions happening on it after boot.
I don't know what I'm doing and am in way over my head (hoping someone with more knowledge can chime in, or tell me if I'm totally wrong), but I think I may have found why there are so many designware interrupts.
Looking at what is happening in the interrupt handler (in linux/drivers/i2c/busses/i2c-designware-master.c, in the function i2c_dw_isr, with a focus on i2c_dw_read), the system reads 38 bytes from i2c. However, it does so at a rate of 2 bytes/interrupt (rx_valid, which dictates the number of bytes to read, is 2). Tap/click events seem to be sent more reasonably, with 38 bytes over 3 interrupts (rx_valid is between 7-17). So while there may only be ~140 interrupts/sec for the touchpad, whatever related transfer the designware i2c is doing causes a lot more. Presumably, data transferred for touchpad movements could be buffered more (like it is for clicks) so that more is read at once.
rx_valid is read from the register at offset 0x78 (macro DW_IC_RXFLR in i2c-designware-core.h). I'd guess that something like this would be controlled at the firmware level rather than in the kernel/driver, and is something the HW is capable of, considering that click inputs seem to be handled with higher rx_valid values.
Also, according to dmesg times, sequential reads within the same interrupt take around 5 microseconds (including time for printks, and aside from the first 3 reads for larger rx_valid values, which are closer to 20 microseconds for some reason), while the time to read between interrupts is around 50-100 microseconds (usually around 50). Besides reducing the number of context switches, this should mean much less time spent on interrupts if there are more reads/IRQ.
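A back-of-the-envelope check of these numbers, as a sketch only: it assumes that each of the ~140 touchpad reports per second triggers one 38-byte read, which is an interpretation of the description above and not a measurement. At 2 bytes per interrupt that is 19 interrupts per report, or roughly 2660 designware interrupts per second, which is in the same range as the ~2.8k interrupts/sec mentioned later in this thread. The 13 bytes/interrupt in the second call is an illustrative value picked from the 7-17 range seen for clicks.

```shell
# irqs_per_sec REPORTS_PER_SEC BYTES_PER_REPORT BYTES_PER_IRQ
# Interrupts needed per second if every report is drained BYTES_PER_IRQ at a time.
# (Hypothetical helper; all inputs are estimates from the post above.)
irqs_per_sec() {
    per_report=$(( ($2 + $3 - 1) / $3 ))   # ceiling division
    echo $(( $1 * per_report ))
}

irqs_per_sec 140 38 2    # movement as observed, 2 bytes per interrupt: 2660
irqs_per_sec 140 38 13   # if movement were buffered like clicks (3 IRQs per report): 420
```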
Some other findings:
I can't write to smp_affinity_list for PIXA either, but it's set to 0-15 (or 10 sometimes?) for me, and after a resume (from sleep or hibernate) the interrupts are handled by cpu0 instead of cpu2.
The high number of interrupts when using the touchpad also occurs on Windows. You can measure this with Performance Monitor or run this in PowerShell: typeperf "\Processor Information(_Total)\Interrupts/sec" "\Processor Information(_Total)\% Interrupt Time". Interrupt time seems to stay pretty low though (usually around 0.12-0.25%).
From perf top, a lot of % kernel time (more than i2c_dw_read) is spent in intel_gpio_irq (in pinctrl-intel.c, with most of the time spent specifically inside the spin-locked section of intel_gpio_irq_handler. The related device is INTC1055:00). I'm not sure if that's normal (when the system is near idle). I also noticed that a lot of the reads determine that the pins aren't enabled. This may be a kernel driver optimization, but if there was some way to tell whether pins are disabled before spending the time reading them, that should help (I think it can be configured in the driver since Tiger Lake?). This probably doesn't help Framework much though, since it's in the kernel/drivers.
Assuming I'm not just completely wrong and this helps at all, I can fix up some of my notes and post them on GitHub or something.
Can't figure out how to edit my previous comment, but I forgot to mention the pretty important part: the kernel only ever reads data from I2C when the device sends an interrupt saying that it can't store anything else (as in, DW_IC_INTR_RX_FULL is set). Thus the kernel has to read the data, and there shouldn't be any optimizations that can be done (unless I'm misunderstanding and the kernel influences when the hardware triggers RX_FULL).
I stumbled over this Reddit post that talks about blacklisting the kernel modules i2c-designware-pci or intel_lpss_pci.
I wonder if anyone has given this a try, possibly in combination with the PS/2 Mouse emulation BIOS option?
In the Reddit thread, the reported touchpad is elan, so the difference in hardware could yield different results?!
Disabling i2c-designware-pci or intel_lpss_pci both seem to do the same thing: they force the touchpad to fall back to PS/2 emulation (or not work at all if PS/2 emulation is disabled in the BIOS). This does drop (presumably just handled) interrupts from ~2.8k to 400-600 handled by i8042, but it also means scrolling and right click don't work.
Side note: blacklisting i2c_designware with a modprobe config doesn't work for me. Adding the kernel option initcall_blacklist=dw_i2c_init_driver does work, but then idma64 gets spammed with around 5-8k interrupts per second when using the touchpad (this doesn't happen if i2c_designware_pci is a module and you can rmmod it; I haven't tested a modprobe blacklist with it built as a module). If you go the PS/2 emulation route and have this issue, you'll want to disable idma64 somehow (this may have to be delayed if it is required for startup. Not sure).
I would not recommend falling back to PS/2 emulation, as this disables a lot of basic functionality: right click does not work, and there is no multi-touch support.
We have had some discussions with Intel around this issue, and they requested that we track this upstream. Going to track this here: 218169 - [Framework Laptop] High power consumption when i2c hid touchpad is in use
Thanks for this.