[RESPONDED] Excessive CPU thermal(?) throttling on 12th-gen

Hi @kelnos,

Your ticket is with the engineering escalations team, and I have added a note indicating that you are on the 3.08 BIOS. What games are you trying to play, and with which distro/kernel?

Thank you for the update, @Matt_Hartley.

My usual game reproducers are Left 4 Dead 2 and Stellaris. Sometimes I can go an hour or so with L4D2 without seeing it occur, but with Stellaris it usually repros within 20 minutes or so. I can repro faster if the vents on the bottom of the laptop are partially blocked by my lap, though it still happens soon enough if it’s on a hard surface like a table.

These are not particularly “strenuous” games; they play just fine on my older 2018 Dell XPS 13.

I’m currently running Debian testing (trixie), kernel 6.6.13. This also occurs on Debian stable (bookworm), any kernel in the 6.2, 6.3, 6.4, 6.5, and 6.6 series, and likely older.

During previous support interactions, I’ve also reproduced this using the Ubuntu 22.04 LTS image linked from Framework’s Linux page, by creating a tmpfs volume and compiling the Linux kernel (with make -j20) over and over.

The OS doesn’t seem to matter once it’s in this state: I can reboot the laptop and enter BIOS settings or the GRUB prompt, and if it’s still throttled I can see sluggish cursor movement and screen redraw.

I’ve disabled turbo boost (both in the BIOS and in sysfs on Linux, and in tlp’s config). I’ve also experimented with throttled, limiting TDP before doing some “softer” throttling, but that hasn’t helped. I’ve also run through support’s suggestions to try removing and shuffling the RAM around, as well as removing the NVMe drive (running the OS off a USB stick) and Wifi card… plus a lot of other stuff they’d asked me to do.
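For anyone wanting to repeat the sysfs part of this, here is a minimal Python sketch. The post doesn't say which sysfs file was used, so this assumes the `intel_pstate` driver's `no_turbo` switch; the helper name `set_turbo_disabled` is made up for illustration. It needs root, and the setting is lost on reboot.

```python
from pathlib import Path

# Assumption: the intel_pstate driver is in use. Its no_turbo file takes
# "1" to disable turbo boost and "0" to re-enable it.
NO_TURBO = "/sys/devices/system/cpu/intel_pstate/no_turbo"


def set_turbo_disabled(disabled: bool, path: str = NO_TURBO) -> bool:
    """Write the turbo switch and return True if turbo reads back as disabled.

    Hypothetical helper for illustration; must run as root on real hardware.
    """
    switch = Path(path)
    switch.write_text("1\n" if disabled else "0\n")
    # Read the file back so the caller sees what was actually stored.
    return switch.read_text().strip() == "1"
```

Calling `set_turbo_disabled(True)` as root would switch turbo off until the next reboot. The `path` parameter is only there so the function can be tried against an ordinary file first.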

Appreciate the update. We’ll need to see what the escalation team is able to sort from this. In instances where reproduction is tricky, this becomes harder.

On the 16th, we emailed you with thoughts from our engineering team. I’ve also added your latest feedback about the games and the tmpfs volume to the ticket.

Yep, got the email, unfortunately while I was out of the country. Just got back yesterday and will try the new troubleshooting idea.

You were asking in another thread what thermal paste I used to solve this issue for me. It’s this Noctua thermal paste from Amazon. I took my time and was very deliberate about removing the old thermal paste. I even used a little isopropyl alcohol on a Q-tip to completely remove it. Then I was pretty generous with applying the new paste.

Other than that, all I did was to use compressed air to blow out the fans. Hope this helps.

I haven’t had the opportunity to try new thermal paste, but I’ve been playing with a possible workaround, at least for gaming: restricting the GPU’s max clock frequency. This seems to be more or less working, though at the expense of worse performance in games, and the need to reduce quality settings in the games themselves.

On Linux I set the values in both /sys/class/drm/card0/gt_max_freq_mhz and /sys/class/drm/card0/gt_boost_freq_mhz to something lower. Looks like they default to 1450. I started by dropping them down to 650, and have been inching them back up, 100MHz at a time, with so far good results up to 950MHz.
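In case it helps anyone trying the same thing, here is a rough Python sketch of that step. It assumes the two sysfs files named above exist on your machine and that you run it as root; the helper name `set_gpu_max_freq` and its `card_dir` parameter are made up for illustration. The values are not persistent, so they go back to the defaults after a reboot.

```python
from pathlib import Path

# The two files named in the post; both get capped to the same value.
FREQ_FILES = ("gt_max_freq_mhz", "gt_boost_freq_mhz")


def set_gpu_max_freq(mhz: int, card_dir: str = "/sys/class/drm/card0") -> dict:
    """Write `mhz` to both frequency caps and return what each file reads back.

    Hypothetical helper for illustration; must run as root on real hardware.
    """
    if mhz <= 0:
        raise ValueError("frequency must be a positive number of MHz")
    readback = {}
    for name in FREQ_FILES:
        path = Path(card_dir) / name
        path.write_text(f"{mhz}\n")
        # Read the value back so the caller sees what was actually stored.
        readback[name] = int(path.read_text().strip())
    return readback
```

For example, `set_gpu_max_freq(950)` would apply the 950MHz cap. Returning the read-back values rather than the requested one is deliberate, so you can tell whether the stored value matches what you asked for.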

My admittedly qualitative test is to stretch out on my couch, put a blanket over myself, the laptop on top of my lap with the blanket under it, and start playing Civilization VI. At the default of 1450MHz, the throttling kicks in after 10 minutes or so. At 950MHz and below I was able to play for several hours with no issues.

I’m still testing (next stop: 1050MHz), so hopefully I still have room to run it faster here. This is a decent workaround: I’m happier to be able to play games at all, vs. having to save and quit every so often when the laptop decides to misbehave (especially a problem with online multiplayer games). But it’s still pretty lame that I have to live with worse perf and lower graphics quality for this to be usable.

And this isn’t a complete workaround. As I’ve mentioned, I can trigger this just by running Linux kernel compiles at full-tilt, even when the GPU is more or less idle.

Update: 1050MHz was too high. I did manage to play for about 3 hours before it throttled and got stuck. Will try bumping down to 1000MHz and see if that’s stable. Otherwise it might be 950.

Update2: 1000MHz was no good either; this time it throttled after an hour and a half or so. Back to 950.