If you reproduce what I believe is thermal throttling again, please capture the /sys/kernel/debug/gpio file and then capture it again when it’s not reproducing.
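A rough sketch of the two captures, assuming debugfs is mounted at /sys/kernel/debug and the commands run as root. The helper name `snapshot_gpio` and the output file names are placeholders I made up, nothing standard.

```shell
# Save a labelled snapshot of the GPIO state so two captures can be diffed.
# snapshot_gpio is a hypothetical helper; the optional second argument only
# exists so the source path can be overridden.
snapshot_gpio() {
    label=$1
    src=${2:-/sys/kernel/debug/gpio}
    cat "$src" > "gpio-$label.txt"
}

# Usage (as root):
#   snapshot_gpio throttled    # while the problem is happening
#   snapshot_gpio normal       # once it has cleared
#   diff -u gpio-normal.txt gpio-throttled.txt
```

Any line that differs between the two files is a candidate for the throttling indicator, but it still needs checking against the hardware design.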
I believe there should be a GPIO that indicates thermal throttling is in use, but it would need cross-referencing against the Framework hardware design to confirm.
If the GPIO values are the same in both captures, there are some MSRs worth capturing to work out whether it’s a kernel bug: basically the CPPC ones used by amd-pstate.
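For the MSR side, here is a minimal sketch of decoding the CPPC request register. It assumes the platform exposes CPPC through MSRs, that `rdmsr` from msr-tools is installed, and that MSR_AMD_CPPC_REQ is 0xc00102b3 with four 8-bit fields, which is my reading of the kernel's msr-index.h; please double check before relying on it. `decode_cppc_req` is a made-up helper name.

```shell
# Decode the low 32 bits of MSR_AMD_CPPC_REQ into its four 8-bit fields.
# Assumed layout: bits 7:0 max_perf, 15:8 min_perf, 23:16 des_perf,
# 31:24 energy performance preference (epp).
decode_cppc_req() {
    hex=${1#0x}            # rdmsr prints hex without a 0x prefix; accept both
    val=$((0x$hex))
    printf 'max_perf=%d min_perf=%d des_perf=%d epp=%d\n' \
        $(( val & 0xff )) \
        $(( (val >> 8) & 0xff )) \
        $(( (val >> 16) & 0xff )) \
        $(( (val >> 24) & 0xff ))
}

# Usage (as root, with the msr module loaded via "modprobe msr"):
#   decode_cppc_req "$(rdmsr -p 0 0xc00102b3)"
```

Comparing the decoded values while throttled and while not should show whether the kernel is asking for low performance or the hardware is ignoring the request.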
@Steiner
I guess the real question is: what is “gpe10”? I.e., what is it connected to?
I have seen this sort of behavior on other, non-Framework laptops. For example, on another laptop gpe13 was being triggered, and it was traced back to faulty SSD firmware. The workaround was the same as what you have done: disable gpe13.
But it would be nice to understand which device is linked to gpe10 on the FW16.
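One way to start tracing it is to look for the GPE's handler method in the decompiled DSDT. This sketch assumes acpica-tools (`acpidump`, `iasl`) is installed and relies on the ACPI convention that GPE handlers are named `_Lxx` or `_Exx`, with xx the hex GPE number. `find_gpe_methods` is a hypothetical helper name.

```shell
# List lines in a decompiled DSDT that mention the handler for a given GPE.
# -w keeps _L10 from matching longer names.
find_gpe_methods() {
    dsl=$1
    gpe=${2:-10}
    grep -n -w -E "_[LE]$gpe" "$dsl"
}

# Usage:
#   sudo acpidump -b          # writes dsdt.dat (among others) to the cwd
#   iasl -d dsdt.dat          # produces dsdt.dsl
#   find_gpe_methods dsdt.dsl 10
```

The body of the method it finds usually notifies one or more devices, which is the hint about what gpe10 is wired to. If nothing matches, the GPE may be handled elsewhere (for example by the embedded controller driver).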
It looks like I was having a similar issue. With the laptop unplugged from a ThinkPad TB4 dock, the GPE10 count would stay flat; with it plugged in, the count would rise rapidly, and the output would switch between the line below and the same line with EN missing:
❯ cat /sys/firmware/acpi/interrupts/gpe10
2200 EN enabled unmasked
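To put a number on "rising rapidly", something like this can sample the counter twice and print the difference. It is only a sketch; `gpe_count` and `gpe_delta` are names I picked for illustration.

```shell
# Print only the interrupt count (first field) from a gpeXX file.
gpe_count() {
    awk '{ print $1; exit }' "${1:-/sys/firmware/acpi/interrupts/gpe10}"
}

# Print how much the count grew over an interval (default 5 seconds).
gpe_delta() {
    file=${1:-/sys/firmware/acpi/interrupts/gpe10}
    secs=${2:-5}
    before=$(gpe_count "$file")
    sleep "$secs"
    after=$(gpe_count "$file")
    echo $(( after - before ))
}

# Usage:
#   gpe_delta                                          # gpe10 over 5 seconds
#   gpe_delta /sys/firmware/acpi/interrupts/gpe10 30   # over 30 seconds
```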
One of the cores would stay pegged between ~50% and ~100% usage. Updating the BIOS to 3.03 and the TB dock’s firmware seemingly solved it for the most part.
Initially, after rebooting everything following the firmware upgrades, the gpe10 count was still climbing, albeit much more slowly, and EN was always present when I checked. Unplugging the power cable and plugging it back in stopped the count from climbing altogether. (The dock can be kinda funky about whether it will charge the FW16 at times, hence me resorting to using the FW16 charger on another port.)
So far things are smoother than before when using a dock. I’m currently on Tumbleweed + PPD + kernel 6.8.7.
Edit: After putting the system to sleep and waking it up enough times, it looks like the GPE10 interrupts are rising again, only now with CPU usage sitting around 15-20%.
Very true. I’d rather not disable hardware stuff willy-nilly personally.
It’s definitely related to something about the TB4 dock and whatever state things are in after resuming from sleep. When it’s happening, unplugging the dock brings CPU usage back down and stops the interrupt avalanche. Reconnecting the dock brings it back.
I just want to chime in with a report of the same.
When docked to a Lenovo TB4 dock, gpe10 goes wild until I disable it. I can then immediately re-enable it and everything is fine until some sort of ACPI/PM event occurs, such as when dock-connected displays go to sleep and then wake back up.
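For anyone wanting to try the same disable and re-enable cycle, a minimal sketch is below. It uses the sysfs interface from earlier in the thread and needs root; `set_gpe` is a made-up wrapper name, and the optional file argument is only there so a different GPE can be targeted.

```shell
# Write a control word ("disable" or "enable") to a GPE's sysfs file.
set_gpe() {
    action=$1
    file=${2:-/sys/firmware/acpi/interrupts/gpe10}
    case $action in
        disable|enable) echo "$action" > "$file" ;;
        *) echo "usage: set_gpe disable|enable [file]" >&2; return 1 ;;
    esac
}

# Usage (as root):
#   set_gpe disable
#   set_gpe enable
```

Keep in mind that a disabled GPE means whatever device raises it stops being serviced, so this is a workaround rather than a fix.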
Strangely, I can easily trigger this 100% consistently, while docked, by restarting the libvirtd systemd service. This seems to be related to libvirt querying the kernel for PM capabilities.
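To confirm a trigger like this, the counter can be read before and after running the suspected command. A small sketch, with `gpe_delta_around` being a hypothetical helper name:

```shell
# Run a command and print how much a GPE counter grew while it ran.
# First argument is the gpeXX file, the rest is the command to run.
gpe_delta_around() {
    file=$1
    shift
    before=$(awk '{ print $1; exit }' "$file")
    "$@"
    after=$(awk '{ print $1; exit }' "$file")
    echo $(( after - before ))
}

# Usage (as root, while docked); the trailing sleep gives the storm time to show:
#   gpe_delta_around /sys/firmware/acpi/interrupts/gpe10 \
#       sh -c 'systemctl restart libvirtd; sleep 5'
```

Any output the command itself prints will appear before the number, so quiet commands work best here.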