[TRACKING] Kworker stuck at near 100% CPU usage with Ubuntu 22.04

Delighted to hear about the update! :slight_smile:

After over a year of using a broken laptop, this update has fixed it all. It’s been ~24 hours and all remains good after using USB devices, the camera, and an Ethernet expansion card. What a nightmare it’s been, and the whole time Framework was in denial; this bit, really, is grating. Thank you for the fix, it’s appreciated, but trust in your users would also have been appreciated. Plus many users reported this issue, not just me.

The last remaining issue is the floppy screen that falls back; the hinges are far too weak. Even a video of that didn’t convince Framework to send me new ones.

Well, spoke too soon: I’ve just plugged a USB-3 mouse into the USB-3 expansion bay, and now I get:

PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND  
716474 root      20   0       0      0      0 R  86.0   0.0   0:39.06 kworker+ 

… a kworker at 86% CPU usage! Oh, no. Back to modprobe -r xhci-pci.

Looking into dmesg, I see an overcurrent condition soon after plugging the USB mouse (mind you, this mouse works just fine in my desktop computer):

[64539.926576] usb 3-1: new full-speed USB device number 6 using xhci_hcd
[64540.058754] usb 3-1: device descriptor read/64, error -71
[64540.390954] usb 3-1: New USB device found, idVendor=0738, idProduct=1705, bcdDevice= 2.22
[64540.390963] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[64540.390967] usb 3-1: Product: Mad Catz R.A.T.5 Mouse
[64540.390969] usb 3-1: Manufacturer: Mad Catz
[64540.442787] usbcore: registered new interface driver usbhid
[64540.442794] usbhid: USB HID core driver
[64540.450345] input: Mad Catz Mad Catz R.A.T.5 Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-1/3-1:1.0/0003:0738:1705.0004/input/input27
[64540.506965] saitek 0003:0738:1705.0004: input,hidraw2: USB HID v1.11 Mouse [Mad Catz Mad Catz R.A.T.5 Mouse] on usb-0000:00:14.0-1/input0
[64542.534730] usb 3-1: USB disconnect, device number 6
[64542.866645] usb 3-1: new full-speed USB device number 7 using xhci_hcd
[64543.066580] usb 3-1: device descriptor read/64, error -71
[64543.434700] usb 3-1: New USB device found, idVendor=0738, idProduct=1705, bcdDevice= 2.22
[64543.434709] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[64543.434712] usb 3-1: Product: Mad Catz R.A.T.5 Mouse
[64543.434715] usb 3-1: Manufacturer: Mad Catz
[64543.438768] input: Mad Catz Mad Catz R.A.T.5 Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-1/3-1:1.0/0003:0738:1705.0005/input/input28
[64543.439377] saitek 0003:0738:1705.0005: input,hidraw2: USB HID v1.11 Mouse [Mad Catz Mad Catz R.A.T.5 Mouse] on usb-0000:00:14.0-1/input0
[64552.602726] usb 3-1: USB disconnect, device number 7
[64553.210611] usb 3-1: new full-speed USB device number 8 using xhci_hcd
[64553.433441] usb 3-1: New USB device found, idVendor=0738, idProduct=1705, bcdDevice= 2.22
[64553.433449] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[64553.433452] usb 3-1: Product: Mad Catz R.A.T.5 Mouse
[64553.433455] usb 3-1: Manufacturer: Mad Catz
[64553.437012] input: Mad Catz Mad Catz R.A.T.5 Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-1/3-1:1.0/0003:0738:1705.0006/input/input29
[64553.494866] saitek 0003:0738:1705.0006: input,hidraw2: USB HID v1.11 Mouse [Mad Catz Mad Catz R.A.T.5 Mouse] on usb-0000:00:14.0-1/input0
[64565.334709] usb usb3-port1: over-current condition
[64565.334716] usb 3-1: USB disconnect, device number 8
[64565.750798] usb usb3-port3: over-current condition

I tried unplugging the USB-3 mouse, but the kworker CPU usage remains pegged.
Then I tried modprobe -r xhci-pci, kworker goes away; then modprobe xhci-pci: kworker comes back, even though the USB-3 mouse isn’t plugged.
Then I tried modprobe -r xhci-pci, then unplugging the USB-3 expansion bay, then modprobe xhci-pci: the kworker comes back up.

Once the kworker goes awry, it stays bad no matter what. Likely only a reboot resets it.
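
For anyone comparing logs, here is a minimal sketch (not part of the original report) of two helpers: one shows the busiest kworker thread, the other counts over-current messages per port in kernel log text. The function names `busiest_kworker` and `overcurrent_ports` are made up for illustration, and `ps --sort` assumes the procps version of `ps` shipped by most Linux distributions.

```shell
# Hypothetical helper: show PID, %CPU and name of the busiest kworker thread.
# Note: ps reports CPU time averaged over the thread's lifetime, not a live reading.
busiest_kworker() {
    ps -eo pid,pcpu,comm --sort=-pcpu | awk '$3 ~ /^kworker/ { print; exit }'
}

# Hypothetical helper: count "over-current condition" messages per port.
# Reads kernel log text on stdin, e.g. the output of dmesg.
overcurrent_ports() {
    sed -n 's/.*usb \(usb[0-9]*-port[0-9]*\): over-current condition.*/\1/p' |
        sort | uniq -c
}
```

Something like `sudo dmesg | overcurrent_ports` should then list each affected port with its message count; for the log excerpt above that would be usb3-port1 and usb3-port3, once each.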


This just happened to me after a reboot, and now it won’t go away unless I unload the xhci-pci kernel module. I hadn’t updated, but it was the first time I played around with the battery disconnect feature in the BIOS. It happened after I rebooted and reconnected the battery. I saw one other person mention this affecting it, but there wasn’t any discussion of it.
I guess for now I’ll just have to live without the camera. I’m using Arch on the 11th gen i5, if that’s needed.

Some questions as we still have not been able to reproduce this on our end.

I’m running the 3.10 BIOS and the latest Arch kernel, 6.4.1. I guess I’ll start by updating the BIOS and trying some different kernels. I may also check whether the Ubuntu kernel works, if I get the time to run through the install.

I was about to switch my kernel, but I checked to make sure that it was still happening, and it has somehow fixed itself after a few restarts. If it happens again, I’ll be back and see if I can help at all, but at the moment it seems to be some weird intermittent issue that is hard to reproduce.


Bumping this. I’m getting it with my new AMD 7640U.

Happened at least twice so far. I think it happens when resuming. Happened this evening when waking my laptop connected to power. Mouse cursor was moving maybe 1 frame every few seconds.

Saw kworker at the top of top which switched between a few tasks such as kworker/u32:3-mt76. Didn’t record the others.

Linux 6.1.0-1027-oem x86_64

Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy

I’d appreciate any advice on what to capture next time it happens. Keep in mind it took me a solid three minutes to get the mouse centered on the reboot button, so hoping for some concise terminal commands.
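
One possible approach, sketched under assumptions rather than tested on this hardware: keep a small shell function around that writes everything to a file in one go, so only a single short command has to be typed while the system is crawling. The name `capture_kworker_state` and the output file name are made up; the ACPI counter path is the one quoted elsewhere in this thread.

```shell
# Hypothetical helper: dump a quick snapshot of system state to a file.
# dmesg may need root, depending on the kernel.dmesg_restrict setting.
capture_kworker_state() {
    out="${1:-$HOME/kworker-capture-$(date +%Y%m%d-%H%M%S).txt}"
    {
        echo "== kernel =="
        uname -a
        echo "== busiest processes =="
        ps -eo pid,pcpu,comm --sort=-pcpu | head -n 15
        echo "== non-zero ACPI GPE counters =="
        grep . /sys/firmware/acpi/interrupts/gpe* 2>/dev/null | awk '$2 > 0'
        echo "== last 200 kernel messages =="
        dmesg | tail -n 200
    } > "$out" 2>&1
    # Print the path so the file is easy to find after a reboot.
    echo "$out"
}
```

With the function defined in `~/.bashrc`, typing `capture_kworker_state` in an open terminal would leave a text file in the home directory that can be attached here afterwards.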

This is happening for me as well on a 13" AMD 7840U on Ubuntu 22.04. It had been working fine for a few hours, and then as I was typing the whole system suddenly came to a grinding halt. top showed the same kworker issue as above.

Also happening on 13" AMD 7840U on NixOS unstable

Linux nixos 6.6.7-zen1 #1-NixOS ZEN SMP PREEMPT_DYNAMIC Tue Jan 1 00:00:00 UTC 1980 x86_64 GNU/Linux

At the moment I only have 4 USB Type-C expansion bays inserted, with the following devices plugged in:

EDIT:

After reloading the module it seems that kworker is not spiking CPU usage.

Will update in case I notice some new details.

Also happening on 13" AMD 7640U on Fedora Workstation.

I have two USB C in the rear, a USB A in the left front and HDMI in the right front. I am able to trigger the kworker pegging the CPU at 100% when connected to an external HDMI monitor and removing the HDMI cable with the lid closed.

I was able to ssh in to attempt to kill the kworker process but it just restarts again. The only way I know how to recover the system is to reboot.

I have run a perf sample but I am not sure how to interpret the data. Here is an excerpt:

Samples: 44K of event 'cycles:P', Event count (approx.): 22458763534
  Children      Self  Command          Shared Object                                   Symbol
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] ret_from_fork_asm                                                          ◆
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] ret_from_fork                                                              ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] kthread                                                                    ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] worker_thread                                                              ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] process_one_work                                                           ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] drm_fb_helper_damage_work                                                  ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] drm_fbdev_generic_helper_fb_dirty                                          ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] drm_atomic_helper_dirtyfb                                                  ▒
+   95.17%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] drm_atomic_commit                                                          ▒
+   95.16%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] drm_atomic_helper_commit                                                   ▒
+   95.16%     0.00%  kworker/9:4+eve  [kernel.kallsyms]                               [k] commit_tail                                                                ▒
+   95.16%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] amdgpu_dm_atomic_commit_tail                                               ▒
+   95.16%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dc_update_planes_and_stream                                                ▒
+   95.15%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] commit_planes_for_stream                                                   ▒
+   95.13%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dc_dmub_srv_cmd_run_list                                                   ▒
+   94.99%     0.52%  kworker/9:4+eve  [amdgpu]                                        [k] dmub_srv_wait_for_idle                                                     ▒
+   93.08%    12.96%  kworker/9:4+eve  [kernel.kallsyms]                               [k] delay_halt_mwaitx                                                          ▒
+   86.66%    80.05%  kworker/9:4+eve  [kernel.kallsyms]                               [k] delay_halt                                                                 ▒
+   63.43%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dcn20_program_front_end_for_ctx                                            ▒
+   47.56%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dcn20_program_pipe                                                         ▒
+   31.72%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dmub_abm_set_level                                                         ▒
+   31.70%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dcn20_blank_pixel_data                                                     ▒
+   31.70%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dmub_abm_set_pipe                                                          ▒
+   15.86%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dc_dmub_update_dirty_rect                                                  ▒
+   15.85%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dcn20_prepare_bandwidth                                                    ▒
+   15.85%     0.00%  kworker/9:4+eve  [amdgpu]                                        [k] dcn314_update_clocks                                                       ▒
+    9.08%     0.28%  kworker/9:4+eve  [amdgpu]                                        [k] dm_read_reg_func                                                           ▒
+    6.71%     0.24%  kworker/9:4+eve  [amdgpu]                                        [k] amdgpu_cgs_read_register                                                   ▒
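
A rough way to read such a report: the "Children" column is cumulative over everything a function calls, while "Self" is the time spent in the function itself, so sorting by "Self" shows where the CPU is actually burning. Below is a minimal sketch of that filter. `hot_symbols` is a made-up name, and it assumes the column layout shown above with command names that contain no spaces.

```shell
# Hypothetical helper: list symbols by "Self" percentage from perf report text.
# Reads perf report output on stdin; $1 is the minimum Self percentage (default 1).
hot_symbols() {
    awk -v min="${1:-1}" '
        {
            # Skip the "+" marker that the interactive view puts in front.
            i = ($1 == "+") ? 2 : 1
            if ($(i) ~ /%$/ && $(i + 1) ~ /%$/) {
                self = $(i + 1)
                sub(/%$/, "", self)
                if (self + 0 >= min + 0)
                    print self, $(i + 3), $(i + 5)
            }
        }' | sort -rn
}
```

Used as `perf report --stdio | hot_symbols 1`. In the excerpt above, the only entries over 1% Self are delay_halt (80.05%) and delay_halt_mwaitx (12.96%). Both are kernel delay routines, apparently reached through dmub_srv_wait_for_idle in amdgpu, which would be consistent with the driver busy-waiting on the display firmware rather than doing useful work.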

Happening on my device too: 13" 11th gen i5-1135G7 on BIOS version 3.17.

Unloading the xhci_pci module works, but the process appears again as soon as I load the module again.
Disconnecting the battery via BIOS for a few seconds fixes it temporarily, but it usually appears again after 1–2 days.

Hello! Any news about this issue? I have the same problem on a freshly installed Kubuntu 22.04.3 LTS on an ASUS laptop with an AMD Ryzen 9 7845HX and an RTX 4060 GPU. Disabling the xhci_pci module works, but unfortunately in my case the keyboard becomes completely inactive when I do this. Is there a way to blacklist this module without disabling the keyboard?

What fixed the issue for me was to open up the case and unplug/replug the small motherboard pill battery. Hasn’t happened since upon connecting USB devices, which was a major trigger.

Other issues persist, like the floppy hinge of the screen (Framework dismissed this as pretty much me imagining it), and the mouse appears frozen at more or less every other time the laptop wakes up from deep sleep.

I have the early gen 11 laptop from 2021. I hope these issues aren’t affecting later ones, because frankly it’s both underwhelming (regarding support and their attitude to customers with actual problems) and embarrassing for a manufacturer.

I have the same issue with kworker high CPU usage on Framework 13 AMD.

/sys/firmware/acpi/interrupts/gpe10:   98548     STS enabled     unmasked

Disabling gpe10 helps

echo "disable" > /sys/firmware/acpi/interrupts/gpe10
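
Before disabling anything, it may help to confirm which GPE is actually firing. A minimal sketch, assuming the counter is the first field of each file under /sys/firmware/acpi/interrupts (as in the gpe10 line above); `top_gpes` is a made-up name.

```shell
# Hypothetical helper: print the GPEs with the highest interrupt counts.
# $1 is the directory to scan, $2 the number of entries to show (default 5).
top_gpes() {
    dir="${1:-/sys/firmware/acpi/interrupts}"
    for f in "$dir"/gpe[0-9A-F][0-9A-F]; do
        [ -r "$f" ] || continue
        # The first whitespace-separated field is the interrupt count.
        read -r count _ < "$f" || true
        echo "$count ${f##*/}"
    done | sort -rn | head -n "${2:-5}"
}
```

Running `top_gpes` twice a few seconds apart should show which counter keeps climbing. Writing to the sysfs file needs root, so from a normal shell the disable command would be `echo disable | sudo tee /sys/firmware/acpi/interrupts/gpe10`, and the setting does not persist across reboots.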

I just had this happen, out of nowhere one core was just pegged by a kworker thread and there was like 12W of battery draw (Laptop was just sitting there with 2 terminals open).

Disabling that interrupt got battery draw back to the 4ish W it was before.

Wonder what’s going on there. Is that a BIOS bug?

Same problem on Framework 16 with the Ryzen 9: same interrupt, same “fix”. See “GPE10 interrupt on framework 16 causing 100% load in a single core - #2 by Mario_Limonciello”.

Happens for me out of the blue on my 11th gen i7, as long as Bluetooth is turned off. If I turn on Bluetooth, the issue goes away …

Easily reproduced on the command line:

sudo hciconfig hci0 down # one core spikes to 100%
sudo hciconfig hci0 up # spike goes away

here is an s-tui screenshot of the machine, you can see core5 spiking and then not spiking, which is where I ran hciconfig as above.

This is happening to me too, out of nowhere today, on my 11th Gen Intel i5-1135G7.
Unloading xhci_pci fixes the problem. As soon as I load it back, the kworker thread hogs the CPU and the fans go crazy.
I have an Ethernet expansion card, but it being connected or not doesn’t make a difference. Actually, disconnecting every expansion card doesn’t make a difference.
Rebooting doesn’t change anything. I haven’t tried removing the BIOS battery though.
Every time I load xhci_pci back, I get the same over-current condition messages in dmesg:

[  957.752818] xhci_hcd 0000:00:0d.0: xHCI Host Controller
[  957.752825] xhci_hcd 0000:00:0d.0: new USB bus registered, assigned bus number 1
[  957.753888] xhci_hcd 0000:00:0d.0: hcc params 0x20007fc1 hci version 0x120 quirks 0x0000000200009810
[  957.754681] xhci_hcd 0000:00:0d.0: xHCI Host Controller
[  957.754685] xhci_hcd 0000:00:0d.0: new USB bus registered, assigned bus number 2
[  957.754687] xhci_hcd 0000:00:0d.0: Host supports USB 3.1 Enhanced SuperSpeed
[  957.754743] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.09
[  957.754745] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  957.754747] usb usb1: Product: xHCI Host Controller
[  957.754747] usb usb1: Manufacturer: Linux 6.9.2-arch1-1 xhci-hcd
[  957.754748] usb usb1: SerialNumber: 0000:00:0d.0
[  957.754827] hub 1-0:1.0: USB hub found
[  957.754837] hub 1-0:1.0: 1 port detected
[  957.754936] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.09
[  957.754937] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  957.754938] usb usb2: Product: xHCI Host Controller
[  957.754939] usb usb2: Manufacturer: Linux 6.9.2-arch1-1 xhci-hcd
[  957.754940] usb usb2: SerialNumber: 0000:00:0d.0
[  957.754995] hub 2-0:1.0: USB hub found
[  957.755005] hub 2-0:1.0: 4 ports detected
[  957.756456] xhci_hcd 0000:00:14.0: xHCI Host Controller
[  957.756462] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
[  957.757607] xhci_hcd 0000:00:14.0: hcc params 0x20007fc1 hci version 0x120 quirks 0x0000000200009810
[  957.758685] xhci_hcd 0000:00:14.0: xHCI Host Controller
[  957.758689] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4
[  957.758692] xhci_hcd 0000:00:14.0: Host supports USB 3.1 Enhanced SuperSpeed
[  957.758760] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.09
[  957.758762] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  957.758763] usb usb3: Product: xHCI Host Controller
[  957.758764] usb usb3: Manufacturer: Linux 6.9.2-arch1-1 xhci-hcd
[  957.758765] usb usb3: SerialNumber: 0000:00:14.0
[  957.758873] hub 3-0:1.0: USB hub found
[  957.758912] hub 3-0:1.0: 12 ports detected
[  957.761470] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.09
[  957.761472] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  957.761473] usb usb4: Product: xHCI Host Controller
[  957.761474] usb usb4: Manufacturer: Linux 6.9.2-arch1-1 xhci-hcd
[  957.761475] usb usb4: SerialNumber: 0000:00:14.0
[  957.761550] hub 4-0:1.0: USB hub found
[  957.761581] hub 4-0:1.0: 4 ports detected
[  958.027690] usb usb3-port1: over-current condition
[  958.161187] usb usb3-port3: over-current condition
[  958.294345] usb usb3-port4: over-current condition
[  958.427766] usb usb3-port6: over-current condition
[  958.551046] usb 3-9: new full-speed USB device number 2 using xhci_hcd
[  958.693477] usb 3-9: New USB device found, idVendor=27c6, idProduct=609c, bcdDevice= 1.00
[  958.693493] usb 3-9: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  958.693497] usb 3-9: Product: Goodix USB2.0 MISC
[  958.693500] usb 3-9: Manufacturer: Goodix Technology Co., Ltd.
[  958.693502] usb 3-9: SerialNumber: UID85FA42A7_XXXX_MOC_B0
[  958.820974] usb 3-10: new full-speed USB device number 3 using xhci_hcd
[  958.964175] usb 3-10: New USB device found, idVendor=8087, idProduct=0032, bcdDevice= 0.00
[  958.964185] usb 3-10: New USB device strings: Mfr=0, Product=0, SerialNumber=0

Briefly removing the BIOS battery is the only fix that ever worked for me. It hasn’t happened again since.