I noticed that when playing games, my CPU (or from what I understand a single Core) rockets to 100°C while only being at about 35% load. Other cores have about as much load but stay just shy of 70°C.
I used psensor for this. htop reports roughly the same numbers, but I don’t have a screenshot of that. I am assuming that temp1-temp8 (which are also reported by acpi -t) are the cores and Tctl (which is also reported by sensors under k10temp) is the average or something. I am basically confused about why only one core is so dramatically hot, and all that while not a single core is at full load.
So my questions are: Is this normal? Is this bad? And if it is bad, what can I do about it? (Aside from the obvious don’t do this then™ )
My understanding of this topic is quite limited. From what I could gather from other sources, this may be a bad sensor? There isn’t much more I could find. I would like to hear your facts and opinions on this.
In this screenshot it shows the temps of the various sensors over time. The drop off was caused by me closing the game.
I know that this is technically not a Framework-specific thing, yet I only experience it on the Framework machine and don’t know who or where to ask. Unsurprisingly, I did not find any information on this in the FAQ or the forum. Also:
Ok so judging by the Tctl, that would mean that it’s the entire processor running at 100°C, right? But then, why is the load not 100%? Is that normal? I mean I find it weird that the CPU would run hot “without working”…
And also what are the temp sensors measuring then? Is it device specific? Can I read about that somewhere?
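One way to see where each reading comes from is to walk the kernel’s hwmon tree directly, since tools like sensors and psensor read from the same place. The following is a minimal sketch, assuming a Linux system that exposes /sys/class/hwmon; the function name read_hwmon_temps is made up for illustration, and which drivers show up (typically k10temp for the AMD CPU sensor and acpitz for the ACPI thermal zones) depends on the machine.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return (driver, label, degrees_c) for every temperature input found.

    `root` is a parameter only so the function can be pointed at a fake
    directory tree; on a real system the default is what you want.
    """
    readings = []
    root_path = Path(root)
    if not root_path.is_dir():
        return readings
    for chip in sorted(root_path.iterdir()):
        name_file = chip / "name"
        if not name_file.is_file():
            continue
        driver = name_file.read_text().strip()
        for temp_input in sorted(chip.glob("temp*_input")):
            stem = temp_input.name[: -len("_input")]  # e.g. "temp1"
            label_file = chip / f"{stem}_label"
            # Not every driver provides a label; fall back to the bare name.
            label = label_file.read_text().strip() if label_file.is_file() else stem
            try:
                millidegrees = int(temp_input.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor present but not readable right now
            # hwmon reports temperatures in millidegrees Celsius.
            readings.append((driver, label, millidegrees / 1000.0))
    return readings


if __name__ == "__main__":
    for driver, label, temp_c in read_hwmon_temps():
        print(f"{driver:12s} {label:10s} {temp_c:6.1f} °C")
```

If the temp1-temp8 values show up under acpitz rather than k10temp, that would suggest they are ACPI thermal zones defined by the firmware and not per-core CPU readings.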
I only have numbers for CPU and dGPU benchmarks (no IGP), but running Cinebench R23 single core for 20 minutes had the average Tctl/Tdie temperature at 80C on my FW16. R20 was the same at 80C @ 20m, and R24 was 81C (slightly higher ambient). I would assume that CB is a heavier pure CPU load than the game you were running.
However, if you are using the IGP for gaming, there is now much more heat being generated across the CPU die and the cooler is dealing with both CPU and IGP heat. It’s entirely possible, if not likely, that you might hit 100C under such scenarios.
I would run a pure CPU single core stress test and see what your temps are. I would also try raising the back of the laptop up a couple of inches to improve air flow. If you are still hitting 100C Tctl with no IGP load, it’s possible there is an issue with the cooler, cooler mount, or fans. There is also some variance between individual CPUs due to factors like the vcore needed to hit a given clock speed, but I doubt that variance is more than a few degrees C.
If a single core is pegged with a single-threaded load, the core will boost as high as it can while it still has thermal headroom. So you can have a situation where that core will clock high, the other cores will be more or less idle, and the CPU as a whole gets very hot, even throttles.
There are various applications that will show this. Here’s the default system monitoring application for Linux Mint on my new 7900X. It’s idle now.
I bet on such a graph you’d find one core pegged at 100% and you’ll see the scheduler move the big load around from core to core. This will heat up the CPU, even if the CPU usage doesn’t appear too high.
This is why these CPUs have a peak (single-core boost) frequency rating which is higher than the frequency they can sustain on all cores at once.
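To check for one pegged thread hiding behind a modest average, per-core load can be computed from two snapshots of /proc/stat. This is a rough sketch, assuming Linux; the helper names parse_proc_stat and per_core_load are made up for illustration. As an example of the arithmetic, on a 16-thread CPU a single thread at 100% with everything else idle averages out to only about 6% total load.

```python
import time


def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_jiffies, total_jiffies) from /proc/stat contents."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-core lines ("cpu0", "cpu1", ...) and skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Columns: user nice system idle iowait irq softirq steal
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + values[4]  # idle + iowait
        total = sum(values)
        counters[fields[0]] = (total - idle, total)
    return counters


def per_core_load(before, after):
    """Percent busy per core between two parse_proc_stat() snapshots."""
    loads = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        loads[cpu] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return loads


if __name__ == "__main__":
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(1.0)
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    loads = per_core_load(first, second)
    for cpu in sorted(loads, key=lambda name: int(name[3:])):
        print(f"{cpu:6s} {loads[cpu]:5.1f} %")
    if loads:
        print(f"average {sum(loads.values()) / len(loads):5.1f} %")
        print(f"busiest {max(loads.values()):5.1f} %")
```

Because the scheduler moves the task around, a one-second window may show the load smeared over two or three cores; a shorter sleep makes the hopping easier to see.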
@Jovec I was using the dGPU. Running a single core/thread stress test with strain -t 1, strain -t 2 as well as stress (stress --cpu 1, which would move the task around) gave me 85°C at most. Raising the laptop probably did help, but only by a few degrees. Could still be variance. Then again, I only ran each test for about 5 minutes. I stopped once the temperature hadn’t risen for over 3 minutes.
From that I would conclude that the cooling system itself is probably fine?
@Fraoch Thanks for the insight! That was actually what I witnessed during the last stress test. One (switching) thread running at 100%, heating up the CPU. Yet, as I just mentioned the temperature didn’t exceed 85°C.
It seems most of my questions have been answered. And I learned some stuff to boot. To summarize (for myself, at least):
High temperature at low load is a normal thing for the CPU.
On AMD, Tctl is only showing the temperature of the hottest Core.
On AMD, the ACPI Temperatures are not the individual Cores.
My issue is probably actually related to software stuff, since limited hardware testing couldn’t reproduce the problem.
I will try to close/rename the post. But if I got something wrong or someone has something to add, please do not hesitate. Thanks again for all the info!
(I have the 13" AMD but anyway) The default fan control logic is based on a temp sensor on the motherboard near the CPU, but not the CPU temp itself. So if you have just one core pretty active, and the rest mostly idle, overall power usage won’t be very high, so overall heating of the rest of the laptop won’t be much, so the fan will stay off, while the CPU will get up to around 100°C then thermal throttle. So this is all by design, supposing most people want the fan to just stay off if possible (if the laptop won’t become uncomfortably hot for you).
But you may prefer for the fan to kick on when the CPU gets hot, even if the overall laptop isn’t hot yet, and that is possible - using ectool. There are a number of threads about this, I’m not sure if I can find the original, but here’s one of the more recent ones where I posted the values which worked for me: AMD FW13 fan behaviour and ramp-up times - #13 by pierce (and about ectool see: Exploring the Embedded Controller)
This also depends on the power supply, CPU model and cooling policy.
The AMD Ryzen 7040 Series has higher energy efficiency but lower thermal conductivity than its Intel counterparts, so at a given load the CPU die is hotter but the radiator is cooler. To mitigate the problem, AMD lets the CPU run at a constant 100C if the cooling is not enough, reducing the clock frequency to stay just below 100C. Intel CPUs, on the other hand, run mostly at 70C~85C, but if the temperature reaches 95C, PROCHOT is triggered and the clock speed drops to 400MHz and stays there until shutdown or restart.
On Intel laptops, you can read the temperatures of all the cores; btop shows them right beside the core utilization %. On AMD machines there’s no such thing, so it’s not possible to accurately measure how low the thermal conductivity is. Other cores might be as low as 60C when the loaded core is at 100C.
On the Framework Laptop, the fan control sensor is located somewhere between the keyboard and the CPU, so the cooling system is designed as a balance between ergonomics (low noise and a keyboard that is not too hot) and CPU performance (max cooling). Then again, if heat is conducted away slowly, ramping up the fan as soon as the CPU is loaded is of no use, because the radiator is still cool at that moment.