That is very interesting and problematic. USB-C is indeed passthrough while HDMI uses USB-PD and DP Alt Mode. It seems likely this will need a mainboard swap, but I will let @Kieran_Levin provide thoughts on this as well.
I am also encountering this problem where it drops to ~400 MHz. I am running dual 4Ks connected with the DP expansion cards on the left side of the laptop. On the right side I have the USB-C for charging + USB-A for a hub which has my keyboard, mouse, camera and microphone.
I have the machine loaded up with 32 GB RAM and am using VSCode + MS Edge for web app development. All of my heavy lifting is done on a remote Docker container so I don’t think I am over-taxing the machine. I am basically using VSCode as an editor frontend and then of course the browser. No compilation is done on the machine.
This normally runs okay but when I do a Zoom video + screen share on one of the 4Ks, the machine regularly goes to 100C and then throttles down to 400 MHz. These Zoom screenshare sessions are typically 8 hours a day as pair programming is the normal way of business for the place I work.
Am I driving the machine too hard or is there a legitimate thermal issue? I did see talk about a different thermal paste. Would that remedy this issue? Is there some other thing that could improve this?
I really love the laptop and the concept here, but it’s real difficult when the CPU clocks way down and the whole system drags for a min or two.
Thanks!
Hi @Jordan yes, that one sounds like it could be the thermal paste issue. We’re getting in a batch of thermal paste syringes with the new paste that we can send out. Send an email over to the support team with your system serial and shipping address, and we’ll mail you one.
@Adam_Hupp since you are only experiencing this on one port, this does sound like a potential hardware issue. Two additional things to try:
- Unplug the charger and power off the system, wait for 35 seconds, and then power up the system again.
- Try flipping your charger cable to see if this is only an issue with one side of the connector. You could also do the same with your HDMI card by plugging in the Type-C card in this port and then plugging the HDMI card upside down.
@nrp I did contact support and they put in an order for me for new thermal paste. However, it has been a week or so and it hasn’t shipped out, so I ended up buying some GELID GC-EXTREME thermal compound and made sure to coat the CPU/die completely. It definitely lowered the temperatures, maybe by 15-20°C on average, but it is still overheating and now even throttling to 199 MHz. This happens very regularly throughout the day when I am using it.
What can I do? It makes my entire workflow screech to a halt while it cools off.
Forgive me if I am repeating something that was already suggested.
Most of the threads I have seen referring to running at these reduced clock speeds are caused by an issue where the foil cover for one of the expansion bays is contacting something it shouldn’t and causing a short that shouldn’t exist, and the throttling down of the system is a self-protection method.
If it is indeed that issue, please see this forum post in the “An Expansion Card bay isn’t working” section: Known issues on early Framework Laptops
The fix is described here: https://knowledgebase.frame.work/en_us/one-port-on-my-laptop-does-not-function-correctly-r1mqMnTet
Hi @Jordan sorry about the delay. We are just getting the thermal paste through our inventory process for shipping. With the updated thermal paste you applied, are you seeing temperatures hit 100C? If not, it’s possible the throttling is being caused by something intermittent that is separate from the original thermal paste issue or thermals generally.
@nrp It is reaching 95°C. I looked at the “An Expansion Card bay on my laptop does not function correctly” guide (frame.work) and opened up the machine. 3 of the 4 seem to have this problem:
The one that seems okay:
The incorrectly installed ones:
The big problem now is that when I tried to move/adjust the last sticker, it ripped in half:
What can I do with that? Should I try to fix the other ones that are wrong/not yet ripped?
Can it function without these?
Thanks
@Jordan We don’t suggest adjusting the EMI stickers unless the port has a functional problem such as expansion cards not recognized or not charging. You can continue to use your system without the sticker and there should not be any difference in performance. If you want to, you can just place some tape over the top of the sticker to hold it together and reapply it.
@Adam_Hupp
Since you have a consistent repro case, could you try downloading and installing the Intel Power Gadget
and then running the following command:
"C:\Program Files\Intel\Power Gadget 3.6\PowerLog3.0.exe" -resolution 100 -duration 600 -file cpulog.csv
This will log your CPU speed, temperatures, and throttling for about 10 minutes.
If you can start this as soon as the system boots, and capture a log of the system performing normally and then transitioning to the low frequency state it would be helpful for us.
If you can share the csv with us we can take a look and see if we can find the source of your throttling.
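For anyone who wants to skim their own log before sharing it, here is a rough Python sketch that scans the rows of a PowerLog CSV for stretches where the clock sits below a threshold. The column names `Elapsed Time (sec)` and `CPU Frequency_0(MHz)` are assumptions about the export's header row and may differ between Power Gadget versions, so check them against your own file. The 500 MHz threshold is only an illustrative pick based on the ~400 MHz figures reported in this thread, and the function names are made up for this example.

```python
import csv

# Assumed header names; verify against the first row of your own cpulog.csv.
TIME_COL = "Elapsed Time (sec)"
FREQ_COL = "CPU Frequency_0(MHz)"


def find_low_clock_spans(rows, threshold_mhz=500.0,
                         time_col=TIME_COL, freq_col=FREQ_COL):
    """Return (start_sec, end_sec) pairs where the clock stayed below threshold_mhz."""
    spans = []
    start = None
    last_t = None
    for row in rows:
        try:
            t = float(row[time_col])
            f = float(row[freq_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip summary lines or malformed rows
        if f < threshold_mhz:
            if start is None:
                start = t  # a low-clock stretch begins here
            last_t = t
        elif start is not None:
            spans.append((start, last_t))  # stretch ended on the previous sample
            start = None
    if start is not None:
        spans.append((start, last_t))  # log ended while still throttled
    return spans


def load_rows(path):
    """Read the CSV into dicts, trimming whitespace around header names."""
    with open(path, newline="", encoding="utf-8-sig") as fh:
        return [{(k or "").strip(): v for k, v in row.items()}
                for row in csv.DictReader(fh)]
```

Calling `find_low_clock_spans(load_rows("cpulog.csv"))` would then list the time ranges worth looking at more closely.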
@Kieran_Levin it was reproducible for a day or two, then stopped. Possibly because I’m not using the pass-through power on my janky USB hub anymore. I installed that tool and will grab a log if it starts happening again.
So, I’ve noticed a slight cpu clock issue with mine. Occasionally, it’ll decide to run at .39 GHz while under load for 15-20 seconds, before slowly getting back up to normal speeds. I’m fairly sure it isn’t temp related as the cpu has run for extensive periods at 100% usage around 3.5 GHz and not had any issues, but, (as my most recent example) I was running around 25-30% utilization when it decided to clock down to .39 GHz for the aforementioned 15-20 seconds before speeding up again.
Is there any way to find out what’s happening (like running some sort of terminal while I’m using the laptop to see if it throws an error code)? Additionally, and this could be the cause, I was running it with an eGPU at the time. (Should I follow the advice from this previous post?)
Hi @FaultedBeing if you can follow the directions I wrote above using the power gadget it would be a good place to start.
Darn, it happened when I had the power gadget running, but not when I had it logging anything. I’ll try to get the log next time, but I do have a screenshot of the power gadget’s graphs that covers the period of it dropping to .39 GHz and just when it started to recover.
edit: perhaps it is temperature related? I’m currently idling at low 90s with 40% usage?
@FaultedBeing You using an eGPU? (Just curious since you have nVidia card, I don’t think it has anything to do with our issue though)
@DannyT Yes I am, a 2060 Super. It’s worked pretty well so far (with a little tweaking and disregarding this occasional issue, if it is caused by this).
I’d also like to add to my findings, I just ran a benchmark to add some stress to the cpu and see if I could replicate it while logging, and it didn’t drop down to .39 GHz. Although, would there be any reason why the cpu would run hotter when connected to an eGPU rather than not?
I managed to start the log while the computer was at .39 GHz. Actually, it was stuck at .2 GHz this time and a restart was required to fix it.
As I was typing this it happened again, but it fixed itself from .39 GHz.
Here is a Google Drive link to a folder with the Intel Power Gadget logs, both started after the issue began (sorry, I couldn’t capture it beforehand).
My clock has gotten stuck at 200MHz a few times on Ubuntu 21.04. A reboot has fixed the issue each time, but powertop shows all cores at 199 or 200MHz. I don’t have many other details yet since I just got the laptop today.
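On Linux, one way to cross-check what powertop shows is to read the per-core cpufreq files in sysfs directly. Below is a minimal Python sketch, assuming the standard cpufreq layout where each `cpuN/cpufreq/scaling_cur_freq` file holds the current frequency in kHz. The function names, the `root` parameter, and the 400 MHz ceiling are hypothetical choices for illustration, not anything from Framework or the kernel docs.

```python
from pathlib import Path


def read_core_freqs_mhz(root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its current frequency in MHz.

    `root` is a parameter only so the function can be pointed at a test directory.
    """
    freqs = {}
    for freq_file in Path(root).glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        try:
            khz = int(freq_file.read_text().strip())
        except (OSError, ValueError):
            continue  # core offline or file unreadable
        freqs[freq_file.parent.parent.name] = khz / 1000.0
    return freqs


def all_cores_stuck(freqs, ceiling_mhz=400.0):
    """True if there is at least one reading and every core is at or below ceiling_mhz."""
    return bool(freqs) and all(mhz <= ceiling_mhz for mhz in freqs.values())
```

Running `all_cores_stuck(read_core_freqs_mhz())` a few times while the machine is under load would show whether every core is pinned low, as opposed to just idling there briefly.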
@Kieran_Levin I hit this again and was able to collect an Intel Power Gadget log. In this case it was at .39 GHz when I started the log. Then:
- Unplugged USB-C power, saw clock rise to normal speeds.
- Plugged it back in, saw clock drop to .39 GHz.
- Unplugged again, saw it rise again.
The log: PwrData_2021-9-7-11-49-14.csv (Dropbox)
When the clock is throttled the Intel Power Gadget shows “PROC HOT”, but the “package temp” is only 46°C.
ThrottleStop reports “BD PROCHOT”:
This thread suggests that “BD PROCHOT” is from an off-cpu sensor that can signal heat problems from other areas of the laptop. Which explains why this is happening even though the reported package temp is fine.
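That reasoning can be checked against a log: if the clock is pinned low while the package temperature is well under the limit, the CPU's own thermal sensor is unlikely to be the trigger. Here is a small Python sketch of that check over (frequency, package temperature) pairs. The 500 MHz and 90°C thresholds are illustrative guesses, not values from Intel or Framework, and the function names and labels are made up for this example.

```python
def classify_sample(freq_mhz, package_temp_c, low_clock_mhz=500.0, hot_c=90.0):
    """Label one log sample using assumed, illustrative thresholds."""
    if freq_mhz >= low_clock_mhz:
        return "normal"
    if package_temp_c >= hot_c:
        # Low clock with a hot package: consistent with the CPU's own thermal limit.
        return "throttled-hot"
    # Low clock with a cool package: may point to a signal from outside the CPU.
    return "throttled-cool"


def summarize(samples, **thresholds):
    """Count labels over an iterable of (freq_mhz, package_temp_c) pairs."""
    counts = {"normal": 0, "throttled-hot": 0, "throttled-cool": 0}
    for freq_mhz, temp_c in samples:
        counts[classify_sample(freq_mhz, temp_c, **thresholds)] += 1
    return counts
```

A log dominated by "throttled-cool" samples, like the 390 MHz at 46°C case described above, would fit the off-CPU sensor explanation better than a thermal paste problem.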
Just from watching the “BD PROCHOT” indicator, it seems like the throttling only really kicks in when the GPU is reporting it. It’s weird to me that my GPU utilization is at 88+% even when I’m just sitting here in Chrome.
Thanks for the log! Note that the GPU utilization is likely expected, in that the GPU clock frequency is generally running quite low when idling or at desktop. For example, I’m sitting in Chrome with the GPU at ~200 MHz while the CPU is at 4 GHz.