[TRACKING] [FW 13 AMD 7840U] Cores stuck at low frequency and system lag

I just tested a few more times after switching back to power-profiles-daemon-0.13-2.fc39.x86_64 so that I can control the setting manually, and I don’t think the problem is related to the epp setting.

base state - platform_profile: balanced, scaling_governor: powersave, epp: power - all cores can reach 4000+MHz. Kernel is 6.6.8-200.fc39.x86_64

  • Test 1: reboot, still on AC power; all cores can reach 4000+MHz. epp starts out set to performance, so I set it back to power.
  • Test 2: reboot, on battery; cores 8-11 stuck at 1666MHz. Changing epp or platform_profile has no effect. Setting scaling_governor to performance lets all cores reach 4000+MHz. After setting scaling_governor back to powersave, the cores continue working normally.
  • Test 3: reboot, on battery (epp was performance at time of shutdown) - all cores end up stuck at 544MHz. Toggle scaling_governor from powersave to performance and back to unlock core frequencies.

So far, the only thing I’ve found that makes a difference is whether I’m on battery during boot (and I don’t think I have tested even this enough to prove it). At least epp does not seem to make a difference.
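
For anyone repeating these tests, the state could be recorded before and after each reboot with something like the sketch below. It assumes the standard cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/`) and the ACPI `platform_profile` file. The function names and the optional root argument are made up for illustration and are not part of any tool mentioned in this thread.

```sh
# Print the first line of a file, or "n/a" if it is missing or unreadable.
read_or_na() (
    if [ -r "$1" ] && { IFS= read -r line < "$1" || [ -n "$line" ]; }; then
        printf '%s\n' "$line"
    else
        printf 'n/a\n'
    fi
)

# Print platform_profile plus governor, EPP and current frequency per CPU.
# $1 is a hypothetical root override (default /sys), useful only for testing.
dump_power_state() (
    root="${1:-/sys}"
    printf 'platform_profile: %s\n' "$(read_or_na "$root/firmware/acpi/platform_profile")"
    for dir in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu="${dir%/cpufreq}"
        printf '%s governor=%s epp=%s cur_khz=%s\n' "${cpu##*/}" \
            "$(read_or_na "$dir/scaling_governor")" \
            "$(read_or_na "$dir/energy_performance_preference")" \
            "$(read_or_na "$dir/scaling_cur_freq")"
    done
)
```

Running `dump_power_state` with no arguments reads from `/sys`; files that do not exist on a given machine show up as `n/a`.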

Note: to toggle the scaling_governor: sudo cpupower frequency-set -g performance && sudo cpupower frequency-set -g powersave

I’ll try to test some more this week.

@fw13amd were you able to determine if slow clock speeds are the cause of the lag you experience? I didn’t mean to hijack your thread…

Ciao @bikefrivolously ,

no worries, any contribution is welcome :slight_smile:

On my end I tested once again yesterday, with a starting condition of TW running 6.6.6 + PPD patch by (super) Mario, a combo that proved its worth for several days prior.

Therefore I tried to take the distro update again (to 6.6.7 plus various other firmware packages), and it seemed all right for my whole working day, but I guess it’s just because I work on AC power. As soon as I took it off the adapter, the touchpad became sluggish again. So that update on my system is a no-go, irrespective of whether I run Mario’s PPD.

I kept an eye on KDE’s System Monitor, and couldn’t observe any cores being stuck (some had a certain tendency to rest more on the 400MHz/base freq, but eventually even those spiked up to higher values).

For this reason, I think that my problem has a different root cause than the one you reported about frequencies being stuck.

And at this point I dunno if my problem is distro-related (although I noticed the very same symptoms on F39 running 6.6.8, so I suspect that even if my problem is NOT caused by cores being stuck, there is still something fishy in 6.6.7 and/or 6.6.8 that wasn’t there in 6.6.6).

But I think that we are dealing with two distinct problems here :frowning:

Update: today I saw that 6.6.9 was out for TW, so I gave it another try.

I’m happy to say that the touchpad is back to normal (need to test a bit more, but hasn’t shown any stuttering on battery so far…). So I think that the problem came from some package of my distro and seems to be fixed now.

I’m also keeping Mario’s PPD patch and haven’t noticed any issues with it so far.

As of right now I’m on kernel 6.6.9 in Fedora 39 Gnome and haven’t experienced any touchpad or system instability whatsoever.

21 hours later and we’re still good. Let us know if that helps you out on TW or Fedora.

Tagging as solved since it’s now a non-issue for me.

@bikefrivolously it’s indeed better for you to open a dedicated thread at this point, if you’re still experiencing that issue.

I read something similar from a Windows user whose cores were stuck at 513MHz IIRC (until a reboot), running the latest BIOS v03.03. You should find it here:

Thread’s a bit old tho (November).

Wish you luck! :slight_smile:

So, in a somewhat unexpected plot twist, I got the problem again after getting some more distro updates.

Eventually I figured out the culprit: Insync, the app I use to sync my Google Drive on Linux, was broken from a certain openSUSE TW snapshot onwards, probably because of some kind of dependency problem.

To work around that problem, I installed a different version of this package that plays nice with the latest dependencies from more recent snapshots, and that version seems to eat up quite some CPU, to the point where the cursor becomes laggy. It has nothing to do with the kernel tho, that was just a wrong deduction on my side.

I’ve raised this concern in the insync forums, fingers crossed.

EDIT: it’s about kworkers eating up a lot of CPU or I/O while Insync syncs my home partition, which is a LUKS2-encrypted btrfs.

The app itself might be abusing I/O, but it may also be linked to btrfs not playing nice with encryption.

Thread here:

I am observing the exact same behavior on openSUSE TW with the current kernel as of today (6.6.11-1-default). My system had become almost unusable. Thanks for the workaround of toggling the scaling governor! This unlocks the system for me.

I haven’t been paying enough attention, but occasionally the laptop becomes very “laggy”. This is tricky to describe: the mouse goes super unresponsive and window animations won’t animate. Today I was on a web-based video conference call and the audio buzzed as it fell apart.

I think it’s the issue described above. I’m running Fedora 39 with the Linux 6.7.7-200.fc39.x86_64 kernel, no nonsense, as vanilla as it gets.

Reading the posts above, I can’t quite work out how to tell if the CPU has decided to go into a power-saving state or cap the frequency at 500MHz. Is there a command?
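
One way to check, as a rough sketch: the kernel exposes each core’s current frequency (in kHz) in `scaling_cur_freq` under the cpufreq sysfs directory, so a small function can list the cores at or below a chosen limit. The function name, the 1700 MHz default and the root argument are illustrative choices, not an existing tool. Idle cores also report low frequencies, so the result only means something if this is run while the machine is busy.

```sh
# List CPUs whose current frequency is at or below a limit.
# $1 = limit in MHz (default 1700, an arbitrary illustrative value)
# $2 = hypothetical sysfs root override (default /sys), useful only for testing
slow_cores() (
    limit_khz=$(( ${1:-1700} * 1000 ))
    root="${2:-/sys}"
    for f in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        read -r khz < "$f" || continue
        # Skip anything that is not a plain number.
        case "$khz" in ''|*[!0-9]*) continue ;; esac
        if [ "$khz" -le "$limit_khz" ]; then
            cpu="${f%/cpufreq/scaling_cur_freq}"
            printf '%s %d MHz\n' "${cpu##*/}" "$(( khz / 1000 ))"
        fi
    done
)
```

Calling `slow_cores` with no arguments uses the defaults. Without any script, `watch -n1 'grep MHz /proc/cpuinfo'` gives a similar live view.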

Given the feedback above, people are relating it to going into and coming out of sleep, but for me it just happens whilst I am using the laptop.

Any advice would be great. I’m tempted to go to Fedora 40 now to get 6.8 and hope that magics up a solution, but that seems a little too much… Ideally I’d like to diagnose it first.

Update:

Now that I’ve looked at s-tui I can see it’s pretty amazing. I will try this next time it goes “laggy” and report back:

This is going to be our best bet to spot where it’s happening.

I appreciate this is a Reddit post, but it seems people with the Steam Deck, which I understand to be similar hardware, have also experienced the CPU locking to 400MHz:

https://www.reddit.com/r/SteamDeck/comments/v0gwiq/cpu_throttling_to_400mhz_and_not_resolving/?rdt=47784

Update on my end, no issues so far, but I thought the above might be interesting.

CPU throttling is triggered by the EC; it cannot be equated across vendors or even across models within a vendor.

OK, so the lagging started again, and I got these screenshots from s-tui suggesting it’s not the MHz cap.

The top graphs were going red, but not all the time… any advice?


See if the GPE counters are going up excessively while it happens (/sys/firmware/acpi/interrupts). If so, it would suggest that a sensor the EC is reading is high, and that while it’s high the EC is asserting the SCI so frequently that it slows the machine down.
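
A rough sketch of how that check could be scripted. It assumes the counters live in `/sys/firmware/acpi/interrupts` on current kernels, one file per GPE (`gpe00`, `gpe01`, … plus `gpe_all`), with the count as the first field of each file. The function names and the directory argument are hypothetical, chosen only for this example.

```sh
# Print "<name> <count>" for every GPE counter with a non-zero count.
# $1 is an optional directory override (illustrative, used for testing).
gpe_counts() (
    dir="${1:-/sys/firmware/acpi/interrupts}"
    for f in "$dir"/gpe*; do
        [ -r "$f" ] || continue
        read -r count _rest < "$f" || continue
        # Skip anything whose first field is not a plain number.
        case "$count" in ''|*[!0-9]*) continue ;; esac
        if [ "$count" -gt 0 ]; then
            printf '%s %s\n' "${f##*/}" "$count"
        fi
    done
)

# Compare two snapshots from gpe_counts and print the counters that grew.
# $1 = earlier snapshot, $2 = later snapshot
gpe_increase() (
    printf '%s\n' "$2" | while read -r name after; do
        [ -n "$name" ] || continue
        prev=$(printf '%s\n' "$1" | awk -v n="$name" '$1 == n { print $2 }')
        delta=$(( $after - ${prev:-0} ))
        if [ "$delta" -gt 0 ]; then
            printf '%s +%d\n' "$name" "$delta"
        fi
    done
)
```

During a laggy spell this could be used as `before=$(gpe_counts); sleep 10; gpe_increase "$before" "$(gpe_counts)"`. A counter that jumps by thousands in a few seconds would be the one to report.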

And by chance does this happen with a dock recently connected or disconnected? There is another thread that has found a correlation to excessive GPE with a dock connect/disconnect.

I do not use a dock.

Now I am making a mental note of when it occurs. It could be soon after plugging in a USB-C power supply, of which I use two in different locations: a UGREEN 65W USB C Charger Plug 2-Port GaN Type C and an ASUS USB C that came with an ASUS ZenBook.

The other result of the lagging is that after the reboot, the screen brightness is set to its lowest. This could be because auto brightness does not take effect during the “laggy” period and the low setting is just a cumulative result of it not being able to be set, or it could be something else.

Anything else you can think of I can attempt to run during a “laggy” phase to help diagnose?

You can add the patches for the cros-ec driver and then use the EC tool to get an EC console log when this occurs and share that with Framework support.

Can you direct me to where I can find instructions for that?

Also, this morning Fedora 39 upgraded to Linux 6.7.9-200.fc39.x86_64; I was on 6.7.7 before when I experienced the lagging. So I have experienced the lagging on both kernel versions at least.

If you haven’t already, I suggest installing BIOS 3.03b, which has an updated EC. It’s conceivable that this is a similar manifestation of the DPC violation in Windows, which was fixed in 3.03b.

This is the GPE that other people have complained about. Maybe it’s this?