CPU downclocks to 400MHz seemingly at random after some uptime

I’ve owned my Framework 13 for about two months now, and I’ve been experiencing this issue for nearly as long as I’ve had it. For context, I have the Ryzen 5 7640U model with 32GB of Corsair Vengeance 4800MHz CL40 DDR5 memory, and I’m currently running Fedora Workstation 41, fully up to date.

Occasionally, after some uptime, and seemingly unrelated to any increase in temperature or load, the CPU will downclock to 400MHz. This seems to happen regardless of OS install, installed applications, or anything of the like. The drop in performance renders the laptop nearly unusable, and it takes great effort just to run a few commands for troubleshooting.

This first presented about 6 days after I got the memory I needed to set the laptop up. I initially put it down to a failing NVMe drive, since I had cannibalised an ADATA 1TB drive and was fully aware ADATA has horrendous failure rates. I didn’t troubleshoot it further; I simply replaced the drive with a 512GB drive that I had already used extensively in my previous laptop and knew was nowhere near its TBW rating.

It took about two weeks for the issue to pop up again, but when it did, it did so consistently within about an hour and a half or less of uptime, and this was when I decided to troubleshoot more. My first hunch was thermal throttling, so I tried to check the temps, which I initially failed to do through Linux [but only because I can’t read; I’m fairly confident I’ve actually got them just fine, see attachments], and the power draw for the components [if that was possible at all]. What I DID find was that running

$ cat /proc/cpuinfo | grep "MHz"

after it entered its unstable state gave me a shocking revelation: all but four of the CPU cores had downclocked to exactly 400.000MHz. Of those four, one sat at 500-ish MHz, one at about 4000MHz, and the rest at 1500-1600MHz. This result has been consistent every single time, across three installs of Fedora and one install of Debian.
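Eyeballing a dozen "cpu MHz" lines while the machine is crawling is painful, so a one-line summary may be easier to read. The following is only a sketch: count_slow_cores is a made-up helper name, the 500MHz default threshold is an assumption based on the 400-500MHz readings described above, and it relies on the kernel exposing "cpu MHz" lines in /proc/cpuinfo (true on x86, not on every architecture).

```shell
#!/bin/sh
# Hypothetical helper: summarise the "cpu MHz" lines of a cpuinfo-style file.
# Usage: count_slow_cores [FILE] [THRESHOLD_MHZ]
#   FILE defaults to /proc/cpuinfo, THRESHOLD_MHZ defaults to 500 (assumed).
count_slow_cores() {
    file="${1:-/proc/cpuinfo}"
    threshold="${2:-500}"
    # Split each line on ":" so the second field is the frequency value.
    awk -F: -v t="$threshold" '
        /^cpu MHz/ { total++; if ($2 + 0 <= t) slow++ }
        END { printf "%d of %d logical CPUs at or below %d MHz\n", slow, total, t }
    ' "$file"
}

# Example call (commented out so sourcing this file has no side effects):
# count_slow_cores /proc/cpuinfo 500
```

Running count_slow_cores with no arguments while the laptop is in its slow state should make it obvious at a glance whether every logical CPU is pinned low or only some of them.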

It seems to be more random now, sometimes taking several hours of uptime to kick in [it just did it at about 4 hours]. It also takes breaks for… however long, sometimes days, where it does nothing. However, I can’t tell whether that’s the issue being ‘temporarily fixed’ by something I tried, or whether it has just decided not to reproduce.

I have contacted Framework support, and I do have steps to continue troubleshooting going forward, but I figured some community input as to why it might be downclocking couldn’t hurt. As far as I can tell, it’s not thermals, it’s not the Linux kernel, it’s not my applications, it’s not my drivers, it’s not a specific uptime… I’m lost, honestly. For all I know this might just be an honest board or chip defect, and one that, though I appreciate their thoroughness, probably isn’t going to be solved by sending them pictures of my board for a visual check.

I also plan on troubleshooting the RAM, removing it one stick at a time and, if possible, replacing it with one or two other DDR5 SODIMMs if I can find any. I’ll try to report back on that when I can test it. I might test Windows too, to see if it somehow has an impact. [On that note, if anyone can recommend any decent software for monitoring this stuff on Windows, it would be greatly appreciated; any memory I have of that kind of diagnostic work on Windows has long since been shot dead by Linux.]

lm_sensors output Pt. 1:

lm_sensors output Pt. 2:

cat /proc/cpuinfo output:

[Scuse the crappy photos; I was not about to try and get screenshots of this to upload while THIS was happening]

Edit: I only just noticed that in this one ALL of the cores were downclocked to 400-500MHz. I haphazardly took it while I was busy typing the original post on a different machine, not realising the output was… so much worse than the other times I’ve checked it, holy shit.

This happened to me on a 12th gen Intel FW13.

Check if BDPROCHOT is triggering:

It’s possible that there’s a hardware problem if /usr/bin/sensors shows low temperature while that hot processor flag is set.

On my motherboard, BDPROCHOT couldn’t be disabled, so I had to have a new board mailed to me after talking to support about it.

What I can confirm is that trying to run that command

sudo wrmsr 0x1FC 2

gives the error “wrmsr: CPU 0 cannot set MSR 0x000001fc to 0x0000000000000002”. I assume that also means it cannot be disabled [or that doing so is a completely different process or instruction on an AMD chip].

As it stands I’m running the stress test and monitoring both the CPU frequency and temp with watch sensors and watch grep '^[c]pu MHz' /proc/cpuinfo. Despite no downclocking yet, the CPU does read 99.8°C. I’ll report back if the performance tanks again, and whether the temp changes at the same time.

What I’ve discovered is that whatever was reading 99.8°C is probably not the CPU. There are two different CPU temp readings, and the 99.8°C one jumps to that value immediately and drops back just as immediately when the stress test stops, with zero windup or cooldown.
More likely, the stress test I tried put the CPU at only about 52°C consistently, with very brief peaks of 53°C, which, if this is a thermal issue, is most assuredly not enough to reproduce it.

I ran the test twice for about ten minutes, the second time being when I realised that the reading that peaked at 53°C was likely the ACTUAL CPU temp sensor. Neither run reproduced the issue. So my plan is to just run watch grep '^[c]pu MHz' /proc/cpuinfo and watch sensors again while the laptop is in casual use, and hopefully find a way to write both outputs to a single text file so I can get a better idea of what happens at what time.
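One possible way to get both readings into a single timestamped file is a small logging loop instead of two watch windows. This is a rough sketch rather than a tested diagnostic tool: log_snapshot and log_forever are made-up function names, the log path in the usage note is a placeholder, and it assumes the sensors command from lm_sensors is installed (it is skipped if not).

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped snapshot of the per-CPU
# frequencies (and temperatures, when `sensors` is available) to a log file.
# Usage: log_snapshot LOGFILE [CPUINFO_FILE]
log_snapshot() {
    logfile="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    {
        # Header line so each sample can be matched to a wall-clock time.
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        grep '^cpu MHz' "$cpuinfo" || echo 'no "cpu MHz" lines found'
        # Only call sensors if lm_sensors is actually installed.
        if command -v sensors >/dev/null 2>&1; then
            sensors || echo 'sensors failed'
        fi
    } >> "$logfile" 2>&1
}

# Hypothetical helper: take a snapshot every INTERVAL seconds until Ctrl+C.
# Usage: log_forever LOGFILE [INTERVAL_SECONDS]   (interval defaults to 5)
log_forever() {
    while :; do
        log_snapshot "$1"
        sleep "${2:-5}"
    done
}
```

After sourcing the file, something like log_forever "$HOME/cpu-throttle.log" 5 could be left running in a terminal during casual use. Once the slowdown hits, the "===" timestamp lines in the log show when the frequencies dropped and what the temperatures were doing just before.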

Oh, I had this too at least one time.
I thought the problem was triggered because the laptop was in standby (standby is sometimes really strange with the AMD FW13 and Linux xD)

After I rebooted it everything went back to normal, but reading your posts… I’ll keep an eye on that problem ^^