Just use tweezers to re-seat or move the pads gently. You’ll be fine. They don’t fall apart like some do.
If the pad is still completely intact (none came away with the heatsink) they’re fine.
How did you increase the power limit to ~38 W? I installed PTM on my 7840u and it still maxes out at 28W, even when increasing the max limit to 40W using Universal Tuning Utility I wasn’t able to push it beyond the max rated TDP of 28W.
ryzenadj lets me set it to 42 W, though due to some other unknown throttling (might be VRMs or skin temperature or something; none of the sensors I can see look problematic) it sometimes gradually decreases down to 30 W.
Without any adjustments, simply plugging into power gives me more than 28 so not sure what’s going on in your case.
Did you use HWINFO to read package power? AMD Adrenalin also reports 28W, wondering if AMD software is what’s limiting TDP.
Happy new year! First off, some good news
I have data that confirms there has been little to no drop in performance of my PTM7950 application after ~17 months of continuous use.
I finally got around to this, and swapped my Intel i7-1165G7 board back in to retest. The board with PTM7950 has been in use from ~July 2022 to early November 2023, so well over a year. Initially, I thought I had cleaned out all the dust (there was a lot), ran all the tests in my original post again, and was seeing temps over 5C hotter (10C at the extreme, at high power). I was puzzled, and thought my PTM7950 application had degraded. Then, I rechecked by partially taking off the fan cover, and lo and behold there was more dust. I cleaned that off, reran the tests, and temps were back to how they were originally!
The area in the green boxes [a] are easy to see and clean dust from. However, the area in the red box [b] is hard to get to and clean:
And I did not want to remove the entire heatsink for the risk of damaging my precious PTM7950 application. So to avoid that, I took out the entire mainboard, unscrewed these 2 screws (yellow boxes [s]) on the front side:
as well as these 3 screws (yellow boxes [s]) on the back side. Afterwards, the area in the blue box [bmc] is still attached to the heatsink via a bent metal clasp? which didn’t look like it could be easily undone without breaking:
I kept the rest of the heatsink assembly screwed in/attached to the mainboard.
Finally, with all those screws undone, the fan can be gently pushed out enough so that the bottom of that hard to reach area (red box [b]) is revealed:
If you zoom in, you can see that there’s dust blocking the fins, and yes, it doesn’t look like much. But that small amount was causing an increase of 5-10C! I had already removed a ton of dust from the green boxes, so my temps must’ve been even higher. I definitely noticed an increase in fan load/noise during regular usage (but let it be), and am confident it was the dust. Now I wonder if the increased fan load led to a decrease in battery life because:
After I cleaned that area (very easy to do with a cotton swab and some alcohol, or carefully with water) and reassembled, temps dropped back to how they originally were on day 1 of PTM7950. Yay!
For brevity, here’s the previous stress-ng test from July 2022 (room thermostat said 73F) compared to about 17 months later in December 2023 (room thermostat said 71F, FWIW):
With only the green boxes cleaned and not the red box (not fully cleaned):
CPU temps were ~82C, so ~6C hotter.
After the red box was cleaned so that the fans and heatsink are fully cleaned:
CPU temps went back down to ~77C, so ~1C hotter (probably within the margin of error).
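To make the comparison above explicit, here is a tiny sketch of the arithmetic. The ~76C baseline is an assumption: it is not quoted in the post, only implied by "~82C, so ~6C hotter" and "~77C, so ~1C hotter".

```python
# Temperatures (in C) quoted in the post. The baseline is inferred, not quoted.
BASELINE_C = 76  # assumption: implied by the "~6C hotter" and "~1C hotter" figures


def delta_vs_baseline(temp_c, baseline_c=BASELINE_C):
    """Return how many degrees hotter a run was than the baseline run."""
    return temp_c - baseline_c


runs = {
    "green boxes cleaned only": 82,
    "red box cleaned too": 77,
}

for label, temp in runs.items():
    print(f"{label}: ~{temp}C ({delta_vs_baseline(temp):+d}C vs baseline)")
```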
Regarding my AMD 7840U upgrade, I’ve been using the board since early November and have run it through a gamut of tasks. The fans rarely (almost never) spin up during lighter tasks like web browsing, video watching, and programming work, at least in Fedora. I tried gaming for a bit, and games ran surprisingly to shockingly well (especially with AMD FSR upscaling to a 3440x1440@60Hz monitor; it’s a gamechanger, pun intended. I consider 60FPS+ “playable”). I tried Halo Infinite multiplayer (big team battle), Ghostrunner 1 (played through the entire game) and 2, and Switch emulation. Anyways, less about gaming performance and more about temps: it looked like temps were capping at ~80C on the GPU and a bit less on the CPU. Fans were at max and fairly loud but not annoying, and the chassis got hot, but never uncomfortably hot to use. I’m also wary of a hot chassis degrading battery longevity. Gaming on just the built-in keyboard and an external mouse was totally fine!
Interestingly, if I set RADEON_POWER_PROFILE_ON_AC=high it will keep the GPU clocked at 2800MHz, and the GPU temperature will steadily rise to 104C in Halo Infinite, at which point the system will instantly shut off (likely because of some thermal limit safety feature).
This is to say that the stock 7840U thermals are, I’d say, very good already. However, I noticed that if I ran e.g. 3DMark Time Spy with the top input cover off, I’d get about 100 more points, so at least on my system, there’s room for improvement. I was only able to reach the same score (3150-3160) as here with the cover off.
^ I think it was! With my input cover on (so regular laptop use), I was only getting a score of ~3050. Not a big difference, but it shows there’s room for improvement from stock.
Now that I have a good idea of how that performs, I’ll be switching to PTM7950 (a huge thanks again to those that posted their results) — I’m confident that the temps will be even better and may allow some more performance out of the machine.
Unfortunately, I’ll have to wait before I make any modifications due to this issue I’m experiencing. I don’t want to add any confounding variables or warranty troubles for Framework support.
@VeryRoomy / @Adrian_Joachim wrt power limits, this is apparently where it’s set in the EC, if I’m understanding correctly. Referencing it here with a “here be dragons” and all the appropriate warnings about messing with that.
Really happy to see my benchmarks proved useful to someone. A few more notes to help replicate (and possibly surpass) my results: the specific laptop cooler I used was a Thermaltake Massive TM. My hypothesis is that it not only forces more airflow through the laptop but also directs airflow at the bottom vents and the heat pipes, which might be why I saw a performance boost with it. Also, I used Tiny11, which, compared to my bloated Windows 11 install, gave a massive performance upgrade. I also had the laptop on a 100W charger, which helped too. I might have to try the PTM7950 on my 7840U and rerun the benchmarks to see if it makes a difference.
@BoredFish thank you! I continually look at your thread as a reference and it’s saved me a ton of time
Interesting, I think that makes sense. Btw, my Fedora installation is “heavy” (it’s what I use for everything, work included) but the Windows installation I do my benchmarks on is stock Win 10 with only the essentials like HWinfo/Steam/3DMark/Cinebench, nothing much running in the background. I could not reach your Cinebench numbers (perhaps stock Win 10 vs tiny Win 11). I did also test through my monitor’s 90W power delivery and an Anker 747 that does up to 100W on a port. Didn’t notice a difference between those two, but maybe there would be with extra thermal room, and I think there likely would vs. a 60W charger.
Edit: whoops just kidding, you didn’t have Cinebench numbers haha
THATS the benchmark I’m forgetting, need to run that. I’ll test some more tonight, but I think the 100w charger gives some power headroom so that the battery can charge while the laptop is going full tilt
Quite late replying, sorry. Unfortunately, I lost the motivation to continue testing with it, especially since I didn’t do enough due diligence in testing beforehand making it hard to be sure my tests before were correct/accurate, and harder to replicate them, so I’ve pretty much given up on comparing.
No worries! I haven’t installed PTM7950 on my AMD 7840U yet but will at some point (might be a while) and plan on doing the due diligence when I do.
Is there anything I should check for? It would be nice to standardize as well. Off the top of my head, a combination of tests that would cover both Linux and Windows:
- Cinebench 2024 (temps and score)
- 3D Mark Time Spy, Night Raid, and Fire Strike (temps and score)
- Unigine Superposition Benchmark (temps and score)
- A max CPU temp test with stress-ng and Prime95
- A max GPU temp test with or Furmark
- Stress testing - ArchWiki
I might write a script for Linux, though maybe the easiest/superior option is just using the Phoronix Test Suite (Anyone have experience running that? It seems like it’s automated and reproducible).
Passmark Performance Test is another that would be good, as it’s on both Linux and Windows.
Would be worth considering Cinebench R23 as well as 2024?
I want to try Phoronix Test Suite but it’s out of date on Solus so need to see about getting it updated to the latest version.
Ooo I’ve never used Passmark Performance Test, will consider.
I haven’t tried Cinebench 2024 but was thinking it could replace Cinebench R23 completely. Though thinking through this a bit more, I also don’t like how long it takes to run the default number of passes.
I’m thinking of putting together a program that can measure the effect of changing thermal materials for this thread. I think the total test would just need:
- A temperature test measured by loading the CPU at 100% (Prime95?).
- A simple CPU benchmark (y-cruncher?) that runs from:
- Idle temps, so the benchmark would include turbo boost frequencies and temps. There would be a 5 minute waiting period between each run for the system to drop down to idle temps. Or maybe 2 minutes (would be interesting if perhaps PTM results in faster temp drops).
- Post turbo boost frequencies and temps, so load the system at 100% for a minute, and then run the benchmark.
- A simple GPU benchmark (Unigine Superposition or perhaps Stable Diffusion with tinygrad?)
Repeat runs 5 times for each and calculate the average and report the measurements.
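The "repeat 5 times with a cooldown and average" idea could look roughly like this. It is only a sketch under assumptions: `run_once` and `repeat_benchmark` are hypothetical names, the benchmark is treated as any callable that returns a number (a score or a wall-clock time), and the 5-minute cooldown is the waiting period suggested above.

```python
import statistics
import subprocess
import time


def run_once(cmd):
    """Run one benchmark command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def repeat_benchmark(run, repeats=5, cooldown_s=300, sleep=time.sleep):
    """Call run() `repeats` times, sleeping cooldown_s before every run after the first.

    `run` is any zero-argument callable returning a number.
    `sleep` is injectable so the logic can be tried without actually waiting.
    """
    results = []
    for i in range(repeats):
        if i > 0:
            sleep(cooldown_s)  # let the system drop back toward idle temps
        results.append(run())
    return {
        "runs": results,
        "mean": statistics.mean(results),
        "stdev": statistics.stdev(results) if len(results) > 1 else 0.0,
    }
```

For example, `repeat_benchmark(lambda: run_once(["stress-ng", "--cpu", "0", "--timeout", "60s"]))` would time five one-minute stress-ng runs with a five-minute pause between them. Reporting the standard deviation next to the mean makes it easier to tell a real change from run-to-run noise.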
Though this could evolve into its own simple opensource benchmarking program that includes:
- Scores
- Min/max temps and measured every n seconds (e.g. 10 seconds)
- Power usage for various tasks
- Average power usage over each test
- Power usage over all tests
- Power usage every n seconds (e.g. 10 seconds)
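Sampling every n seconds and reporting min/max/average might be sketched like this. Assumptions: Linux, and `/sys/class/thermal/thermal_zone0/temp` as the temperature source (the zone that corresponds to the CPU differs per machine, so that path is only a placeholder). The function names are hypothetical. The same `sample`/`summarize` pair would work for power readings if you supply a reader for them.

```python
import time


def read_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read one temperature in C from Linux sysfs (the file holds millidegrees C).

    Assumption: thermal_zone0 is the sensor you care about; check your own system.
    """
    with open(path) as f:
        return int(f.read().strip()) / 1000.0


def sample(read, duration_s, interval_s=10, sleep=time.sleep):
    """Call read() every interval_s seconds, covering duration_s seconds in total."""
    count = int(duration_s // interval_s) + 1  # include the sample at t=0
    samples = []
    for i in range(count):
        if i > 0:
            sleep(interval_s)
        samples.append(read())
    return samples


def summarize(samples):
    """Return min, max, and average of a list of readings."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": sum(samples) / len(samples),
    }
```

Something like `summarize(sample(read_temp_c, duration_s=600))` would then give the min/max/average over a ten-minute test.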
I’m tired of long benchmarks and a lack of standardization. I think this could be as simple and relatively fast as a Geekbench run while providing enough insight. It would also allow Framework users to run the same reproducible tests to check how their system performs and the effect a specific change has. I could probably write the entire thing in Elixir with little or no external dependencies and make it cross-platform with Burrito. Someone lmk if this would just be wasted effort, lol, and if anyone has any scripts, feel free to send 'em my way!
Especially in cooling testing they need to be somewhat long, otherwise you are measuring the wrong thing. If the test is too short you get huge variability from how much the heatsink is already saturated, how fast the fan is reacting and so on, and you end up measuring that instead of the thermal interface itself.
For higher power tests it also can’t be too long though, otherwise you start measuring VRM and skin temperature throttling, which is relatively unrelated to the thermal interface too.
Really good points…hm, I wonder if it’d be sufficient to start the test when all temps from all sensors are detected as “idling” (e.g. they fluctuate no more than +/- 2C and aren’t continually trending upwards or downwards).
And then after it’s started from idle temps, stop the test at the max end when, again, temps don’t fluctuate more than e.g. +/- 2C and aren’t continually trending upwards or downwards.
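A rough sketch of that "settled" check, usable for both the idle start point and the max end point. Assumptions: the input is a window of recent temperature samples in C, "not trending" is approximated by comparing the mean of the first half of the window against the mean of the second half, and the 1C trend limit and the name `is_steady` are made up for illustration (the +/- 2C band is from the post above).

```python
def is_steady(samples, tolerance_c=2.0, max_trend_c=1.0):
    """Return True if a window of temperature samples looks settled.

    Settled means: every sample is within +/- tolerance_c of the window mean,
    and the mean of the second half differs from the mean of the first half
    by no more than max_trend_c (i.e. no steady climb or fall).
    """
    if len(samples) < 4:
        return False  # too little data to judge a trend

    mean = sum(samples) / len(samples)
    within_band = all(abs(s - mean) <= tolerance_c for s in samples)

    half = len(samples) // 2
    first_mean = sum(samples[:half]) / half
    second_mean = sum(samples[-half:]) / half
    not_trending = abs(second_mean - first_mean) <= max_trend_c

    return within_band and not_trending
```

A slow, steady climb can stay inside the +/- 2C band for a while, which is why the trend check is separate from the band check.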
That way, we only test and measure from a “true” idle to a “true” max and also gather the data from between those two points. With that full data, we can then compare against another thermal material test run and manually look for things like
and other variables like a warm chassis.
I think this would also be useful to be able to record an initial/baseline reading, and then a user could rerun the test whenever to see how temps fare and if they’ve risen. So maybe the material’s performance has degraded, or perhaps it’s time to clean the fan/heatsink of dust.
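The baseline-and-recheck idea could be as small as the sketch below. Everything here is hypothetical: the JSON layout, the function names, and the 5C "needs attention" threshold (picked only because the dust described earlier in the thread cost roughly 5-10C).

```python
import json


def save_result(path, result):
    """Store a result dict such as {"max_temp_c": 77.0, "score": 3050} as JSON."""
    with open(path, "w") as f:
        json.dump(result, f)


def load_result(path):
    """Load a previously saved result dict."""
    with open(path) as f:
        return json.load(f)


def compare(baseline, current, temp_threshold_c=5.0):
    """Compare a new run against the baseline and flag a notable temperature rise."""
    temp_delta = current["max_temp_c"] - baseline["max_temp_c"]
    return {
        "temp_delta_c": temp_delta,
        "score_delta": current["score"] - baseline["score"],
        # A rise this large could mean dust buildup or a degraded thermal material.
        "needs_attention": temp_delta >= temp_threshold_c,
    }
```

A rerun would then be `compare(load_result("baseline.json"), new_result)`, where `baseline.json` is whatever file the first run was saved to.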
I heatsoaked with a known load to get a repeatable test point but I suppose this might work if you isolate other variables well enough.
You should also control fan speeds if you want to actually measure the thermal interface and not just the fan profile XD
Pretty sure that’s what I meant with skin temperature.
Recording a best case completely cooled down heatsink run definitely is a valid data point, it’s just a pain in the ass to get a bunch of them XD.
Oh yeah, I’ll need to figure out a way to keep fans at 100%. Lots of variables
Ah I meant in general vs. AMD’s official skin temperature sensor (which I thought was just a single sensor and might be inaccurate). Did some quick digging and it turns out from the Ryzen Mobile 4000 series, with AMD System Temperature Tracking (STT) V2 and STAPM:
Smart Temperature Tracing v2 (or STTv2) is designed to help a system boost for longer by knowing more about the thermal profile of the device.
By placing additional thermal probes inside the system, such as on hot controllers or discrete GPUs, the readings of these can be passed through the Infinity Fabric to an embedded management controller. Through learning how the system thermals interact when different elements are loaded, the controller can determine if the system still has headroom to stay in turbo for longer than the current methodology (AMD’s Skin Temperature Aware Power Management). This means that rather than having a small number of sensors getting a single number for the temperature of the system, AMD takes in many more values to evaluate a thermal profile of what areas of the system are affected at what point.
That’s like, a lot of extra variables to account for and maybe the cause of those oddities reported earlier, but I think it can just be treated like a black box where its effects can be deduced afterwards, so it won’t affect how to collect the data.
Yeah haha I feel that pain which is why I’m probably going to script/automate this and then dogfood that.
That bit is relatively easy, ectool lets you set the fanspeeds manually.
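For scripting it, something like the sketch below could pin the fans for the duration of a test and hand control back afterwards. Assumptions: `fanduty <percent>` and `autofanctrl` are the subcommand names as I understand them from the ChromiumOS ectool, so check `ectool help` on your build first, and ectool usually needs root. The Python function names are hypothetical.

```python
import subprocess


def fan_duty_cmd(percent, ectool="ectool"):
    """Build the command that sets a fixed fan duty cycle (0-100 percent)."""
    if not 0 <= percent <= 100:
        raise ValueError("fan duty must be between 0 and 100")
    return [ectool, "fanduty", str(percent)]


def auto_fan_cmd(ectool="ectool"):
    """Build the command that returns fan control to the EC's automatic curve."""
    return [ectool, "autofanctrl"]


def run_with_fixed_fan(workload, percent=100, run=subprocess.run):
    """Pin the fan, run workload(), then always restore automatic fan control."""
    run(fan_duty_cmd(percent), check=True)
    try:
        return workload()
    finally:
        # Restore automatic control even if the workload raises.
        run(auto_fan_cmd(), check=True)
```

The `try`/`finally` matters here: if a benchmark crashes, you do not want the fans left stuck at a fixed duty.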
Pretty sure this smart thermal tracking or whatever is what is gradually turning down the power limit during longer runs. I can’t really see any sensor going particularly high using lm-sensors, so not sure how I would bypass it. To be fair, at the minimum of 30W this puppy still has a ton of performance.
Ah ectool, thanks!
Hmm yeah, that’s my intuition as well. Though I have gotten the system to reliably thermal shutoff if I keep the GPU pegged at the max clock of 2700MHz IIRC and a sensor rises to 104C while also using the CPU (e.g. in a game). Do any of these from ryzenadj work?
-a, --stapm-limit=<u32> Sustained Power Limit - STAPM LIMIT (mW)
-e, --stapm-time=<u32> STAPM constant time (s)
--skin-temp-limit=<u32> Skin Temperature Power Limit (mW)
and perhaps the other options? I tried only briefly a while ago and I didn’t notice a difference, but might’ve been doing something wrong.
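Since the two power limits are given in mW, it is easy to be off by a factor of 1000. Here is a small sketch that builds the command line from watts, using only the three options quoted above. The function name `ryzenadj_cmd` is hypothetical, and it only builds the argument list; it does not run anything or guarantee the limits will stick.

```python
def ryzenadj_cmd(stapm_limit_w=None, stapm_time_s=None, skin_temp_limit_w=None,
                 binary="ryzenadj"):
    """Build a ryzenadj argument list, converting watts to the mW the tool expects."""
    cmd = [binary]
    if stapm_limit_w is not None:
        cmd.append(f"--stapm-limit={round(stapm_limit_w * 1000)}")  # W -> mW
    if stapm_time_s is not None:
        cmd.append(f"--stapm-time={int(stapm_time_s)}")  # already in seconds
    if skin_temp_limit_w is not None:
        cmd.append(f"--skin-temp-limit={round(skin_temp_limit_w * 1000)}")  # W -> mW
    return cmd


# 42 W, the limit mentioned earlier in the thread
print(" ".join(ryzenadj_cmd(stapm_limit_w=42, skin_temp_limit_w=42)))
```

The resulting list can be passed to `subprocess.run(..., check=True)`; ryzenadj generally needs root to apply anything.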
Also found this: I Successfully Disabled STAPM and Increased the Power Limit on my Matebook D!
Only skimmed it for now, but interesting. I think I saw these values set somewhere in the lotus-zephyr branch for the EC, so I think it might be possible to change it, or ideally Framework would allow disabling STAPM completely (at users’ own risk).
Also semi-related lol
A bit off-topic though, maybe we should branch this into a separate thread?
All of them work, but the mystery throttling will slowly ramp STAPM down to 30 after a while.
You can go 45W for quite a bit before that though.