This is the GPE that other people have complained about; maybe it’s this?
This thread has the patches and tool I talked about for getting EC console log.
OK, given I don’t have the mental capacity to handle the EC tool scenario, and I only want to change one thing at a time in the spirit of the scientific method, I decided to try BIOS 3.03b. I will report back should the lagginess emerge.
Still using beta BIOS 3.03b, with Linux 6.7.9-200.fc39.x86_64 on Fedora 39 and GNOME 45.5, I got the lagginess again.
I had just rebooted after a “Software Update” and was simply web browsing using Firefox, the laptop was charging from about 70%.
Here are the s-tui screenshots, taken a minute apart while the lagginess is happening.
I plan to move to the Fedora 40 Beta once that is released. It’s pretty soon, but I haven’t got my head around the Fedora schedule, as it looks like the Beta date has been moved a few times.
For reference, this is what s-tui looks like when it’s running normally:
Ciao everyone,
as the OP of this thread I think it should be renamed as it’s now being used to discuss issues of stuck cores which, as it later turned out, were different than my root cause (which I have identified here in a 3rd party app: [TRACKING] [FW 13 AMD 7840U] Did kernels >=6.6.7 make your system/touchpad/mouse laggy? - #18 by fw13amd)
It would be a pity to lose the history that built up over time, so I guess that keeping the discussion going on here is the best course of action and also avoids wasting precious MOD time to move messages to another thread.
—
@Matt_Hartley I would just kindly ask, if possible, to copy and paste this message at the beginning of my OP, for new visitors/Googlers to see, and rename the thread to something like [FW 13 AMD 7840U] Cores stuck at low frequency and system lag. Thanks!
Checking in.
It’s been a week and I upgraded to Fedora 40 Beta with the 6.8 kernel (Linux 6.8.1-300.fc40.x86_64) on Friday 22nd (a bit early, but honestly Fedora Betas are more stable than most production OSes).
No sign of the “lagging”. As I think about it, I wonder if it’s related to suspend. The only occasion I see suspend happen without explicitly triggering it is after a reboot when I haven’t logged in… I’ve disabled suspend on lid close.
So potentially the issue is caused by a post-suspend state after I’ve absent-mindedly logged in… not the greatest scientific data, apologies.
First instance of the “laggyness” since Fedora 40 beta installed.
It happened after leaving the machine on overnight (which I don’t normally do) and apparently going to XKCD’s “Machine” in Firefox.
Here is a screenshot I managed to get in the lag:
Linux 6.8.4-300.fc40.x86_64
Immediately after installing the 3.05 BIOS I had the “lagging”.
I’m going to do a fresh install of Fedora 40 once that is out of the beta.
My concern is that the recommended tweaks for things like sg_display and the Mesa drivers introduce too many variables for a fair, clean, reproducible test.
I try not to get too gnarly with my Linux configs and tweaks, and I do like to track them. The only other slightly off-piste thing I do is configure the laptop not to sleep when closing the lid, which is controlled in /etc/systemd/logind.conf. I can’t imagine that has any impact, but given the tapestry of libraries and configs I can’t be sure.
For completeness:
Regarding the CPU getting stuck at 0.54GHz I do believe it’s related to ppd. I have switched to auto-cpufreq and haven’t seen it re-occur.
This behavior was sort of fixed for me for a while, but it’s back now (or at least a similar behavior). Frequency is not hard capped this time, but will stay below 1 GHz generally, with short spikes to up to 1.7 GHz. Power as measured by ryzenadj does not exceed 11.7 W. I’m using ryzenadj to monitor only, I’m not actually tweaking anything. Needless to say, performance is badly capped in this state.
It’s difficult for me to pinpoint which change brought this behavior back. I am on OpenSUSE Tumbleweed, so I get regular kernel updates. I also moved to PPD not so long ago and upgraded the FW. PPD definitely had a beneficial effect on battery life and in general works well. The system only seems to get stuck in this low-power state after suspend-resume cycles.
To get the system unstuck, I tried changing the power profile, stopping or re-starting the PPD service, setting scaling_governor (this used to fix it for me) or energy_performance_preference manually, plugging or un-plugging power supply, suspending and resuming (both on AC and on battery), to no avail.
This is an AMD FW13 (7840U) on kernel 6.8.8, power-profiles-daemon 0.21, FW 3.05.
Any hints on what to try to get the system unstuck? I know a reboot will fix it, but that’s not an acceptable solution while working. Hibernating and waking the system back up works to fix the frequency scaling, but then Wifi is broken…
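For anyone wanting to capture the state of the knobs mentioned above (scaling_governor, energy_performance_preference, current frequency) while the system is stuck, here is a minimal sketch. It assumes the standard cpufreq sysfs layout under /sys/devices/system/cpu; the function names read_cpufreq and looks_stuck and the 1 GHz threshold are illustrative choices of mine, not something from this thread. Idle cores report low frequencies anyway, so a single sample only means something if it is taken while the machine is under load.

```python
from pathlib import Path

# Attributes taken from the standard cpufreq sysfs interface. Not every
# driver exposes all of them, so missing files are recorded as None.
ATTRS = (
    "scaling_cur_freq",               # current frequency in kHz
    "scaling_max_freq",               # current upper limit in kHz
    "scaling_governor",
    "energy_performance_preference",
)

def read_cpufreq(root="/sys/devices/system/cpu"):
    """Return {cpu_name: {attribute: value}} for every CPU with a cpufreq dir."""
    state = {}
    for cpufreq_dir in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        entry = {}
        for attr in ATTRS:
            try:
                entry[attr] = (cpufreq_dir / attr).read_text().strip()
            except OSError:
                entry[attr] = None  # attribute missing or unreadable
        state[cpufreq_dir.parent.name] = entry
    return state

def looks_stuck(state, threshold_khz=1_000_000):
    """True if every CPU with a readable frequency is below threshold_khz.

    The threshold is a hypothetical default; pick whatever matches the
    behavior you are seeing (for example just above 544 MHz).
    """
    freqs = [int(e["scaling_cur_freq"]) for e in state.values()
             if e["scaling_cur_freq"]]
    return bool(freqs) and all(f < threshold_khz for f in freqs)
```

Calling read_cpufreq() once in the good state and once in the stuck state (while something CPU-heavy is running) and saving both results gives a side-by-side record of what the governor and EPP were set to in each case.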
I’m using ryzenadj to monitor only, I’m not actually tweaking anything. Needless to say, performance is badly capped in this state.
Can you specifically try comparing /sys/kernel/debug/amd_pmf/current_power_limits in a failure vs non-failure? That would help confirm if there is an EC bug with a power limiter.
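One way to make that comparison easy is to save a copy of the file in each state and diff the copies afterwards. The sketch below does only that; it makes no assumption about the format of the file. The helper names save_snapshot and diff_snapshots and the output file naming are hypothetical. Reading debugfs normally requires root, and kernel lockdown may block it.

```python
import difflib
from pathlib import Path

# Path as given in this thread; debugfs must be mounted and readable.
PMF_LIMITS = Path("/sys/kernel/debug/amd_pmf/current_power_limits")

def save_snapshot(label, source=PMF_LIMITS, out_dir="."):
    """Copy the current contents of `source` to <out_dir>/power_limits_<label>.txt."""
    target = Path(out_dir) / f"power_limits_{label}.txt"
    target.write_text(Path(source).read_text())
    return target

def diff_snapshots(good_path, bad_path):
    """Return unified-diff lines between a 'good' and a 'bad' snapshot.

    An empty list means the two snapshots are identical.
    """
    good = Path(good_path).read_text().splitlines()
    bad = Path(bad_path).read_text().splitlines()
    return list(difflib.unified_diff(
        good, bad,
        fromfile=str(good_path), tofile=str(bad_path),
        lineterm="",
    ))
```

Typical use would be save_snapshot("good") right after a clean boot, save_snapshot("bad") the next time the system is stuck, and then printing the lines returned by diff_snapshots for the two files.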
It’s difficult for me to pinpoint which change brought this behavior back.
I think a likely cause is the EC triggering thermal throttling; but I don’t have a good way to prove that by reading any registers or so. Maybe if you can monitor the EC debug log in one tab you can check the last messages it emits?
To get the system unstuck, I tried changing the power profile, stopping or re-starting the PPD service, setting scaling_governor (this used to fix it for me) or energy_performance_preference manually, plugging or un-plugging power supply, suspending and resuming (both on AC and on battery), to no avail.
None of those things working really does make me suspect thermal throttling by the EC as well…
By chance - did you happen to unplug the power adapter while in suspend when this issue happened? I’m aware of a bug report in kernel bugzilla with another manufacturer that has a bug with this. It LOOKS like a thermal event sequencing problem with that manufacturer, but if you can confirm the same thing is happening on your Framework 13 that would be a really interesting data point.
Any hints on what to try to get the system unstuck?
If it’s the same thing as that other manufacturer and caused by power adapter changes while in suspend, plug in and then unplug the power adapter after you’ve resumed. See if that brings it back to normal.
Thanks for your quick reply! My system is unstuck again for now (hibernate / resume / unload wifi kernel module / reinsert kernel module), but I’ll check these as soon as I get a chance.
I don’t remember for sure, but this might be the case. I’ll try that this weekend, to see if I can reproduce this.
A note to future me, I did a total vanilla Fedora 40 install on Sat 4th May, and currently no driver, kernel or config shenanigans…
So far, I haven’t been able to reproduce the behavior.
I’m pretty sure the power limits shown by ryzenadj were not lowered when the system was stuck. The power consumption stayed way below the limits I could see (I was running watch -n 1 ryzenadj in a terminal while the system was stuck; I didn’t take a screenshot though).
OK, so after 4 or 5 days or so in and out of sleep on a completely vanilla, fresh Fedora 40 install I got the “lagging” again.
So I really need to get this diagnosed as I feel it’s either “just” me, or this is being reported with different symptoms elsewhere.
Next time I’ll be sure to get an s-tui screenshot.
Is there anything else I can do, any fresh ideas?
FWIW I don’t think it’s just you. I think it’s probably your combination of devices/chargers triggering a bug somewhere. My educated guess from these kinds of bugs is that it’s most likely in the EC or PD controller.
Note down EXACT order of events and what devices caused it. Did you have it plugged in before suspend, did you unplug during suspend? How did you wake it? Did you have a dock connected, is it tied to that?
If you can reproduce it at will with a sequence of events and devices, then it’s more likely Framework support can too, and then they can capture debug information to fix it!
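To help with noting down the exact order of events, a small background logger can record when a power supply goes online or offline. This is only a sketch: it assumes the adapter is exposed through an `online` attribute under /sys/class/power_supply (supply names vary by machine), and the function names and log file name are made up for illustration. Polling does not run while the machine is suspended, so a change made during suspend will show up as the first entry after resume.

```python
import time
from datetime import datetime
from pathlib import Path

def read_online(root="/sys/class/power_supply"):
    """Return {supply_name: online_flag} for supplies exposing `online`."""
    state = {}
    for online in sorted(Path(root).glob("*/online")):
        try:
            state[online.parent.name] = online.read_text().strip()
        except OSError:
            continue  # supply disappeared or is unreadable; skip it
    return state

def changes(previous, current):
    """List (name, old, new) for every supply whose flag differs."""
    names = sorted(set(previous) | set(current))
    return [(n, previous.get(n), current.get(n))
            for n in names if previous.get(n) != current.get(n)]

def watch(log_path="power_events.log", interval=2.0,
          root="/sys/class/power_supply"):
    """Poll until interrupted, appending one timestamped line per change."""
    previous = read_online(root)
    while True:
        time.sleep(interval)
        current = read_online(root)
        for name, old, new in changes(previous, current):
            stamp = datetime.now().isoformat(timespec="seconds")
            with open(log_path, "a") as log:
                log.write(f"{stamp} {name}: {old} -> {new}\n")
        previous = current
```

Running watch() in a terminal and lining its log up against the suspend/resume entries in the kernel log should give the kind of timeline that support can act on.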
Me again, been a while. Thanks Mario, I wish I could identify a logical series of steps to recreate it; it’s maddening.
Most of the time I am plugged into a UGREEN 65W charger, but sometimes roam, and the issue still occurs (only in Gnome?, see below)
Curious series of events recently.
Been having a play with Cosmic DE on Fedora 40, the same install as above. Cosmic DE installed via the COPR, nothing too dramatic. (Still using gdm)
Flipping back to Gnome and the stuttering/lag started again (almost immediately), with, again, nothing terribly exciting happening.
Once it happened in Gnome for the 3rd time, I managed to switch back to Cosmic DE and the stuttering/lag kept happening (including in gdm), telling me it’s a fundamental system-level funkiness that’s being triggered.
The stuttering/lag has yet to begin on Cosmic DE, I’ll be keeping to Cosmic DE for a while (as I am liking the whole proposition).
Will keep me (and you) posted.
Hi, so I have had the 544 MHz CPU lock just this afternoon and I cannot reproduce it now, with Framework 13, AMD Ryzen 7840U. No dock, just the charger.
BIOS 3.05
Arch Linux, kernel 6.10.3-arch1-2, Sway WM
Charger: Vention FEDW0-EU (65W)
My memory is not very good here, but the timeline supported by kernel logs and charging data from upower is this:
- Aug 09 13:46:02 system boot
- total of 14 other suspend entry/exit in between
- Aug 12 16:38:56 wake from suspend (lid opened)
- Aug 12 16:39:03 system detected a charging cable, which shows as an error: ucsi_acpi USBC000:00: GET_CABLE_PROPERTY failed (-5)
- Aug 12 17:58:56 Lid closed → sleep
- (I think) I pulled the cable out, but it is also possible that it was out shortly before sleep. Last charging entry is 17:58:46.
- Aug 12 18:02:41 Lid opened → resume
- laptop is slower and laggier; cpupower frequency-info shows the CPU frequency stuck at 544 MHz. Switching PPD to power_saver or balanced does not help. Playing a HW-accelerated video in Firefox works fine, though.
- Aug 12 20:46:47 Lid closed → sleep
- Aug 12 20:55:01 Lid opened – laptop is still slow
- Aug 12 21:28:01 Connected charging cable → unlocked the CPU frequency.
The problem is that it only happens very, very rarely now (I cannot remember the last occurrence), and I could not collect any EC logs this time, as my system had security=lockdown set.
I suspect it happened from unplugging the cable during sleep. There seems to be something that went wrong there. Try specifically to reproduce it using that action and the real Framework charger if you can.
I will try with FW charger but it will have to wait until next week. Also I am unable to replicate this right now, I have tried at least 8 times already, with different timing.
It is true that I primarily use a Dell TB4 dock at home, which normally charges my laptop, and normally I have the FW charger with me, but right now I cannot get back to it.