[TRACKING] Touchpad interrupts & battery usage issues (idma64.2)

Did anybody create a diff of the TLP configuration?

I think I have found part of the reason why this is happening. I looked at this on Fedora 37, and it seems the GPIO interrupt is pinned to one core while the i2c interrupts are pinned to a separate core. This means every time the touchpad fires an interrupt, both a high performance core and an efficiency core have to wake up to service the touchpad! And these are probably not even in the same cluster, so lots of cache evictions etc. might be happening as both core clusters power on and off and caches are cleared and filled.

What core type are TP interrupts handled by?

The second idea was to investigate what core the interrupts are handled by:

First, you can look at the cores on the system to map them to E or P cores. The E cores will show lower frequencies in the output of:

lscpu --all --extended

The second thing was to find out what cores the interrupts were handled by:

cat /proc/interrupts

watch -n 1 "cat /proc/interrupts | grep designware"  
watch -n 1 "cat /proc/interrupts | grep PIXA"
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      CPU12      CPU13      CPU14      CPU15      CPU16      CPU17      CPU18      CPU19      
  43:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   43694516          0          0          0          0  IR-IO-APIC   43-fasteoi   i2c_designware.2, idma64.2
 135:          0          0    1711105          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  intel-gpio    3  PIXA3854:00

Then move the touchpad, and you will see which core the interrupt count increases on for each of the two lines.
In my case the PIXA GPIO interrupt is pinned to a high performance core, but the i2c-designware interrupt is pinned to an efficiency core.
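To avoid counting columns by eye, the same check can be scripted. This is only a sketch: busiest_cpu is a made-up helper name, and it assumes the usual /proc/interrupts layout (one header row of CPU names, then one row per IRQ with one count per CPU).

```shell
#!/bin/sh
# busiest_cpu PATTERN: read /proc/interrupts-style text on stdin and print,
# for every line matching PATTERN, the IRQ number and the CPU with the
# highest interrupt count. (Hypothetical helper, not an existing tool.)
busiest_cpu() {
    awk -v pat="$1" '
        NR == 1 { ncpu = NF; next }      # header row: one field per CPU
        $0 ~ pat {
            irq = $1
            sub(/:$/, "", irq)           # "43:" -> "43"
            best = 0; max = -1
            # fields 2 .. ncpu+1 hold the per-CPU counters
            for (i = 2; i <= ncpu + 1; i++)
                if ($i + 0 > max) { max = $i + 0; best = i - 2 }
            printf "IRQ %s -> CPU%d (%d interrupts)\n", irq, best, max
        }'
}

# Only run against the live system if the file is available.
if [ -r /proc/interrupts ]; then
    busiest_cpu 'designware|PIXA' < /proc/interrupts
fi
```

With the counters shown above, this should report IRQ 43 on CPU15 and IRQ 135 on CPU2.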

Let’s pin them both to the same core, CPU2:

# echo 00004 > /proc/irq/43/smp_affinity
# echo 00004 > /proc/irq/135/smp_affinity # DOESNT WORK, IO error?

Power consumption when using the touchpad seems to drop from about 7W to 5W! (Baseline is 4W)

I would like to pin the PIXA intel-gpio interrupt to an efficiency core, but I have not figured out how to do this.
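For reference, the value written to smp_affinity is a hexadecimal bitmask in which bit N selects CPU N, which is why 00004 means CPU2 above. A minimal sketch of the conversion follows; cpu_to_mask is a made-up helper name, and it assumes fewer than 32 CPUs so the mask fits in a single group.

```shell
#!/bin/sh
# cpu_to_mask N: print the hex bitmask that selects only CPU N.
# (Hypothetical helper; assumes N < 32, which covers the 20-CPU system above.)
cpu_to_mask() {
    printf '%x\n' $((1 << $1))
}

cpu_to_mask 2     # mask for CPU2
cpu_to_mask 15    # mask for CPU15, where the designware IRQ was counted
```

The result can then be written with something like echo "$(cpu_to_mask 2)" > /proc/irq/43/smp_affinity (as root).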


Wow, that’s an awesome find!

I can replicate this on EndeavourOS (Arch), Zorin (Ubuntu), and a Fedora live USB environment. My dstat readouts closely resemble those posted by others above. EndeavourOS is somewhat worse, and Fedora is somewhat less terrible. On EndeavourOS, a USB mouse throws about half as many interrupts as the trackpad; I didn’t think to test this on the other OSes.

I seem to be seeing the same thing - the PIXA interrupt is going HAM on my first (P) core.

Which identifier does one pass to smp_affinity from the following? I understand that the E cores are the second block, but what is the identifier? It’s currently set to 00c0 for both, and smp_affinity_list is set to 6-7.
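As far as I understand the kernel interface, smp_affinity holds a hexadecimal CPU bitmask and smp_affinity_list holds the same set written as CPU numbers, so 00c0 and 6-7 describe the same two CPUs. A small sketch that decodes a mask (mask_to_cpus is a made-up helper name; it assumes a single mask group, i.e. fewer than 32 CPUs):

```shell
#!/bin/sh
# mask_to_cpus HEXMASK: print the CPU numbers whose bits are set in HEXMASK.
# (Hypothetical helper; assumes one mask group without commas.)
mask_to_cpus() {
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        # the lowest bit tells us whether the current CPU is in the set
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out$cpu "
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "${out% }"    # strip the trailing space
}

mask_to_cpus 00c0     # the value from the question above
```

So to pin to a single CPU, either write that CPU number to smp_affinity_list or the matching one-bit mask to smp_affinity.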

I tried to investigate this. I don’t really understand some of the things written in this thread. I tried to find a clue…


For anyone testing this, here’s a minimally nicer command:

watch -n 1 "cat /proc/interrupts | grep 'CPU\|designware\|PIXA'"

Here’s a simple systemd service touchpad-smp-affinity.service that pins the affinity of the designware interrupt to CPU2:

[Unit]
Description=Pin touchpad interrupts to CPU2
Documentation=https://community.frame.work/t/tracking-touchpad-interrupts-battery-usage-issues-idma64-2/13630

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/sh -c 'echo 2-2 > /proc/irq/$(grep designware.2 /proc/interrupts | cut -d ":" -f1 | xargs)/smp_affinity_list'
ExecStop=/usr/bin/sh -c 'echo "0-$(nproc --all --ignore=1)" > /proc/irq/$(grep designware.2 /proc/interrupts | cut -d ":" -f1 | xargs)/smp_affinity_list'

[Install]
WantedBy=basic.target

Test this by running

watch -n 1 "grep 'CPU\|designware\|PIXA' /proc/interrupts"

and checking that the numbers increase in the same column while using the touchpad.

I don’t think this is the final solution. As I understand from the thread, the number of generated interrupts is still too high. But this workaround is apparently better than nothing.

The large number of interrupts generated is probably from the i2c controller on the SoC, as each physical interrupt requires the CPU to transfer multiple bytes. A more optimized Linux kernel driver may be able to fix this.

Is there anyone here who’s willing to report this to the kernel folks?


Thanks everybody for the large amount of research that has gone into this.

From reading, I understand that pinning interrupts to a certain core is a workaround that some people are testing, but the process raises a few questions for me. I hope someone can shed some light here.

  1. Why only pinning the designware.2 interrupts? On my 11th Gen I see designware.{0,1,2} interrupts listed in /proc/interrupts.
    What are the other 2 for?

  2. Why pinning to CPU 2?
    When I execute lscpu --all --extended multiple times over a few minutes, I see different cores with a high frequency, including CPU2. According to @Kieran_Levin the high frequency is how to recognize a P core, but if the E vs. P assignment is not constant, the pinning ends up being hit and miss, no?
    Is there a way to dedicate a CPU as E/P when on battery?

  3. Why not pinning the PIXA interrupts too? Just to be explicit that both interrupts are actually handled by the same CPU.

For some reason, it is not possible to pin the PIXA interrupt, so the only thing we can try is to pin the designware interrupt and assign it to the same core as the PIXA interrupt.

True, you can’t pin this interrupt at runtime, but using the kernel parameter irqaffinity one can tell the kernel which CPUs to use for hardware interrupts.

Using irqaffinity=0,1 I was able to move the PIXA interrupt from CPU 7 to CPU 1.
I haven’t fully tested the impact on power consumption and/or performance (when charging). There’s also the question of how reasonable it is to set irqaffinity to a single CPU core.
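To confirm after a reboot that the parameter actually reached the kernel, the running command line can be checked. A minimal sketch (get_irqaffinity is a made-up helper name; how the parameter gets added depends on your bootloader and is not shown here):

```shell
#!/bin/sh
# get_irqaffinity: read a kernel command line on stdin and print the value
# of irqaffinity=..., or nothing if the parameter is absent.
# (Hypothetical helper, not an existing tool.)
get_irqaffinity() {
    # one parameter per line, then keep only the irqaffinity value
    tr ' ' '\n' | sed -n 's/^irqaffinity=//p'
}

if [ -r /proc/cmdline ]; then
    get_irqaffinity < /proc/cmdline
fi
```

With irqaffinity=0,1 on the command line this prints 0,1; the per-IRQ result can then be verified with the watch command from earlier in the thread.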

Pondering how to resolve my questions above, the plan seems to look like this:

  1. Pin hardware IRQs to 2 (or 1 - this needs more research) CPUs using the kernel’s irqaffinity parameter
  2. Pin the designware interrupt to the same CPU as PIXA if more than 1 core was listed in irqaffinity, otherwise skip this step
  3. Ensure the CPU used for the touchpad interrupts is an E core when on battery.

Step 3 is the one I was struggling with on my machine (11th Gen Framework), until I stumbled over
Intel CPUs Explained: What Are E-Cores and P-Cores? :

E/P cores have been a feature since Intel’s 12th Gen chips, which explains why the CPU frequency behaviour @Kieran_Levin explained did not match my observations.

Also found this during research: according to x86 64 - How to detect E-cores and P-cores in Linux alder lake system? - Stack Overflow, E cores can be identified (if SMT is enabled in the BIOS) by looking at the CORE column for core numbers that appear only once; those represent the E cores in the system.
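That CORE column check can be scripted. This is a sketch under the same assumptions as the Stack Overflow answer (hybrid 12th Gen or newer CPU, SMT enabled); list_unshared_cores is a made-up helper name. On a CPU without E cores, such as my 11th Gen, it should print an empty line, and with SMT disabled it would list every CPU.

```shell
#!/bin/sh
# list_unshared_cores: read "CPU,CORE" pairs on stdin (the format produced
# by `lscpu --parse=CPU,CORE`) and print the CPUs whose core ID is not
# shared with any other CPU. (Hypothetical helper, not an existing tool.)
list_unshared_cores() {
    awk -F, '
        /^#/ || NF < 2 { next }          # skip lscpu comment header and blanks
        { count[$2]++; cpus[NR] = $1; core[NR] = $2; n = NR }
        END {
            out = ""
            for (i = 1; i <= n; i++)
                if ((i in cpus) && count[core[i]] == 1)
                    out = out (out == "" ? "" : " ") cpus[i]
            print out
        }'
}

if command -v lscpu >/dev/null 2>&1; then
    lscpu --parse=CPU,CORE | list_unshared_cores
fi
```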


@dfh there is another i2c bus connected to the EC for i2c-hid events that are for some keyboard hotkeys and the ALS.
The other one should be the DDR interface to get the module information.

Both of these see pretty low-frequency updates, so pinning them has diminishing returns, and DDR will not have any data transactions happening on it after boot.


I don’t know what I’m doing and am in way over my head (hoping someone with more knowledge can chime in, or tell me if I’m totally wrong), but I think I may have found why there are so many designware interrupts.

Looking at what is happening in the interrupt handler (in linux/drivers/i2c/busses/i2c-designware-master.c, in the function i2c_dw_isr with a focus on i2c_dw_read), the system reads 38 bytes from i2c. However, it does so at a rate of 2 bytes/interrupt (rx_valid, which dictates the number of bytes to read, is 2). Tap/click events seem to be sent more reasonably, with 38 bytes over 3 interrupts (rx_valid is between 7 and 17). So while there may only be ~140 interrupts/sec for the touchpad, whatever related transfer the designware i2c is doing causes a lot more. Presumably, data transferred from touchpad movements could be buffered more (like it is for clicks) so that more is read at once.

rx_valid is read from the register at offset 0x78 (macro DW_IC_RXFLR in i2c-designware-core.h). I’d guess that something like this would be controlled on a firmware level rather than by the kernel/driver, and it is something that the HW is capable of, considering that click inputs seem to be handled with higher rx_valid values.

Also, according to dmesg times, sequential reads within the same interrupt are around 5 microseconds (including time for printks and aside from the first 3 reads for larger rx_valid values which are closer to 20 microseconds for some reason), while the time to read between interrupts is around 50-100 microseconds (usually around 50). Besides reducing the number of context switches, this should mean much less time spent on interrupts if there are more reads/IRQ.
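A quick back-of-the-envelope check of those numbers, as a sketch only. It assumes (my assumption, not something I measured) that each of the ~140 touchpad interrupts/sec corresponds to one 38-byte report; interrupts_per_report is a made-up helper name.

```shell
#!/bin/sh
# interrupts_per_report REPORT_BYTES BYTES_PER_IRQ:
# number of i2c interrupts needed to drain one report, rounded up.
# (Hypothetical helper for illustration.)
interrupts_per_report() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

per_move=$(interrupts_per_report 38 2)    # movement: rx_valid = 2
per_click=$(interrupts_per_report 38 13)  # click: rx_valid somewhere in 7-17

echo "interrupts per movement report: $per_move"
echo "interrupts per click report:    $per_click"
# Assumed report rate of 140/sec, taken from the touchpad interrupt rate.
echo "i2c interrupts/sec while moving: $((per_move * 140))"
```

Under that assumption a movement report costs 19 interrupts instead of 3, which would put the designware IRQ at roughly 2660 interrupts/sec while the finger is moving.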

Some other findings:

I can’t write to smp_affinity_list for PIXA either, but it’s set to 0-15 (or 10 sometimes?) for me, and after a resume (from sleep or hibernate), the interrupts are handled by cpu0 instead of cpu2.

The high number of interrupts when using the touchpad is also present on Windows. You can measure this with Performance Monitor or run this in PowerShell: typeperf "\Processor Information(_Total)\Interrupts/sec" "\Processor Information(_Total)\% Interrupt Time". Interrupt time seems to stay pretty low though (usually around 0.12-0.25%).

From perf top, a lot of % kernel time (more than i2c_dw_read) is spent in intel_gpio_irq (in pinctrl-intel.c), with most of that time spent specifically within the spin lock in intel_gpio_irq_handler. The related device is INTC1055:00. I’m not sure if that’s normal (when the system is near idle). I also noticed that a lot of the reads determine that the pins aren’t enabled. This may be a possible kernel driver optimization: if there was some way to tell whether pins are disabled before spending the time reading them, that should help (I think it can be configured in the driver since Tiger Lake?). This probably doesn’t help Framework much though, since it’s in the kernel drivers.

Assuming I’m not just completely wrong and this helps at all, I can fix up some of my notes and post them on github or something.


Can’t figure out how to edit my previous comment, but I forgot to mention the pretty important part: the kernel only ever reads data from I2C when the device sends an interrupt saying that it can’t store anything else (as in DW_IC_INTR_RX_FULL is set). Thus the kernel has to read the data, and there shouldn’t be any optimizations that can be done (unless I’m misunderstanding and the kernel influences when the hardware triggers RX_FULL).

I stumbled over this reddit post that talks about blacklisting kernel modules i2c-designware-pci or intel_lpss_pci.

I wonder if anyone has given this a try, possibly in combination with the PS/2 Mouse emulation BIOS option?

In the reddit thread, the reported touchpad is elan, so the difference in hardware could yield different results?!


Disabling i2c-designware-pci or intel_lpss_pci both seem to do the same thing: they force the touchpad to fall back to PS/2 emulation (or to not work at all if PS/2 emulation is disabled in the BIOS). This does drop (presumably just handled) interrupts from ~2.8k to 400-600, handled by i8042, but it also means scrolling and right click don’t work.

Side note: blacklisting i2c_designware with a modprobe config doesn’t work for me, but adding the kernel option initcall_blacklist=dw_i2c_init_driver does work. However, idma64 then gets spammed with around 5-8k interrupts per second when using the touchpad (this doesn’t happen if i2c_designware_pci is a module and you can rmmod it; I haven’t tested a modprobe blacklist with it built as a module). If you go the PS/2 emulation route and have this issue, you’ll want to disable idma64 somehow (this may have to be delayed if it is required for startup; not sure).

I would not recommend falling back to PS/2 emulation, as this disables a lot of basic functionality: right click does not work, and there is no multi-touch support.


@Kieran_Levin
Thanks for elaborating on the impact.

We have had some discussions with Intel around this issue, and they requested that we track this upstream. We are going to track it here: 218169 – [Framework Laptop] High power consumption when i2c hid touchpad is in use


Thanks for this.