It’s the PD controller firmware again, isn’t it? Man, I wish we also got access to the source and documentation for that.
Imagine how polished an implementation that would eventually become with the community working on it.
@Quin_Chou can we get an update on progress? It’s been almost another month since this buggy release. I’m really tired of my ports not accepting charging, the battery draining too fast in sleep mode, and the CPU being limited to low power while the dGPU is off. It would be really nice if Framework showed us they actually care by fixing these long-standing issues.
Regarding the “frequency lock low” problem.
Can you describe the problem a bit better by raising an Issue over here:
I would like to understand how to reproduce the problem, as I don’t see it currently.
So, if you put enough details in a new Issue there, we might be able to make progress.
Framework is already aware of this issue, and has claimed to have “solved” it for at least 3 updates now. Triggering it is quite easy, although probabilistic, so you may have to adjust the timings. A reliable procedure is as follows (note that you need a dGPU plugged in and in D3 mode for this to happen):
systemctl suspend
The only way out of this state without rebooting is power cycling the dGPU (any adjustment using either the RMI or the PSMU/MP1 interfaces has no effect, from what I can tell at least):
# echo on > /sys/bus/pci/devices/0000\:03\:00.0/power/control
# echo auto > /sys/bus/pci/devices/0000\:03\:00.0/power/control
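The two writes above can be wrapped in a small helper. This is only a sketch: the function name `power_cycle_dgpu` and the pause between the two writes are my own assumptions, and the default PCI address is simply the one from the post, so it may differ on your machine.

```sh
#!/bin/sh
# Power-cycle a PCI device by toggling its runtime-PM control file.
# $1: path to the power/control file (default: the address quoted in the post)
# $2: seconds to wait between "on" and "auto" (assumed value, not from the post)
power_cycle_dgpu() {
    ctl="${1:-/sys/bus/pci/devices/0000:03:00.0/power/control}"
    delay="${2:-2}"

    # Bail out early if the control file is missing or not writable.
    [ -w "$ctl" ] || { echo "cannot write $ctl (run as root?)" >&2; return 1; }

    echo on > "$ctl" || return 1     # force the device out of runtime suspend
    sleep "$delay"                   # give it a moment to wake up
    echo auto > "$ctl"               # hand control back to runtime PM
}
```

Source the file and run `power_cycle_dgpu` as root. Passing the path as an argument makes it easy to point at a different device.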
Hi,
I do not have a dGPU, so I cannot reproduce it.
Maybe this info might help you.
If you are using Linux, from that item, try:
echo '\_SB.ALIB 0x0c {0x07,0x00,0x23,0x00,0x3a,0x00,0x00}' | sudo tee /proc/acpi/call
Summary:
When the dGPU goes to sleep, the temp threshold is accidentally set to 0 C. So on wakeup, it thinks the dGPU is overheating, and backs everything off.
That echo command sets the threshold back to a normal value.
I’m actually the guy who wrote the quoted post.
The 545MHz issue is a different one; it is not solved by the command I mentioned in that post.
What if one also takes this into account:
Summary:
The EC might fail to send the PMF SPL, sPPT, fPPT, p3T, ao_sppt settings to the APU/CPU.
In your analysis, are you able to read the current PMF settings from the APU/CPU side?
e.g. ryzenadj
On BIOS 4.0.4
The EC console settings for SPL, sPPT and fPPT do not all match the ryzenadj output.
sPPT and fPPT match, but the STAPM LIMIT reported by ryzenadj is about 5000mW above the EC PMF SPL value.
I don’t know what is adjusting the SPL up above what the EC console outputs.
In BIOS 3.0.5, STAPM from ryzenadj matches the EC SPL console output.
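For anyone who wants to repeat this comparison, here is a rough sketch. It assumes `ryzenadj -i` prints its usual pipe-separated table with a `STAPM LIMIT` row in watts; the helper names `stapm_limit_mw` and `spl_gap_mw` are made up for illustration, and the EC SPL value has to be read off the EC console by hand and passed in as milliwatts.

```sh
#!/bin/sh
# Read `ryzenadj -i` output on stdin and print STAPM LIMIT in milliwatts.
# Assumes rows shaped like: | STAPM LIMIT | 25.000 | stapm-limit |
stapm_limit_mw() {
    awk -F'|' '
        { name = $2; gsub(/^[ \t]+|[ \t]+$/, "", name) }
        name == "STAPM LIMIT" { printf "%d\n", $3 * 1000 + 0.5; found = 1; exit }
        END { exit !found }
    '
}

# Print (SMU STAPM limit - EC SPL) in milliwatts.
# $1: EC PMF SPL in mW, copied manually from the EC console.
spl_gap_mw() {
    ec_spl_mw="$1"
    smu_mw=$(stapm_limit_mw) || return 1
    echo $(( $smu_mw - $ec_spl_mw ))
}
```

Usage would be something like `sudo ryzenadj -i | spl_gap_mw 25000`. A result near 0 means the two views agree, while a result near 5000 corresponds to the mismatch described for BIOS 4.0.4.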
It’s because the settings programmed via the different interfaces are handled by a common endpoint: the AMD SMU.
The AMD SMU (or at least its latest version) seems to ignore SPL changes sent via the I2C RMI interface that the EC uses. In fact, this interface is extremely limited in the ways it can directly influence the SMU’s behavior. On the other hand, the direct PCI access to the PSMU/MP1 that ryzenadj uses (via ryzen_smu), and that is also used by an ACPI ALIB callback in response to some EC events, makes it possible to dump and edit the real current PM table values used by the SMU.
I tried several times to reproduce this following your steps, but I was unable to. I even disabled my LED matrix monitoring software just in case it was interfering with the power states. The dGPU was in D3cold, and I was using the Framework 240W PSU.
In all this testing, I was actually able to get it into the 545MHz mode once while I kept the dGPU awake using glxgears, though. But it came right out of that mode as soon as I reconnected the power cable.
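To make it easier to tell whether a machine is in the 545MHz state, a check along these lines might help. It is a sketch under assumptions: it reads the standard cpufreq `scaling_cur_freq` files (values in kHz), the 600000 kHz threshold is just a guess slightly above 545MHz, and the function name `cpus_stuck_low` is hypothetical. Idle cores also report low frequencies, so it only means something while the CPU is under load.

```sh
#!/bin/sh
# Succeed (exit 0) if every core reports a current frequency at or below a limit.
# $1: cpu sysfs base directory (default: /sys/devices/system/cpu)
# $2: limit in kHz (default 600000, an assumed value just above 545MHz)
cpus_stuck_low() {
    base="${1:-/sys/devices/system/cpu}"
    limit_khz="${2:-600000}"
    found=0

    for f in "$base"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        found=1
        khz=$(cat "$f")
        # Any core above the limit means the CPU is not locked low.
        [ "$khz" -le "$limit_khz" ] || return 1
    done

    # Fail if no frequency files were found at all.
    [ "$found" -eq 1 ]
}
```

Start a CPU-heavy job first, then run `cpus_stuck_low && echo "looks stuck at low frequency"`.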
Where is the “APU SMU” logic stored? If you know, please point me to it. Is it in a particular ACPI table, or is it an AMD binary blob? I have access to advanced reverse engineering tools, so if it is an AMD binary blob, I can reverse it and at least maybe document the logic it is currently using.
Another aspect: do we know all the input parameters that the logic uses to make its decisions? That is, which PMF, watt, temp and prochot GPIOs it uses as input, along with the power mode and PMF tables.
If we get all the input values, and the SMU logic, we should be able to track the bug down.
Well, due to its probabilistic nature it’s possible that subtle setup variations affect the steps that can trigger it…
It’s inside the AMD AGESA firmware.
I have found that some of the PMF settings are read-only from the CPU side, but can be set via the EC->SMU I2C interface.
I should also mention that I recently found a bug in the EC I2C source code that caused some of the settings sent from the EC to the SMU to simply not get through.
The bug was causing I2C writes to be silently lost. I have fixed this in my EC code.
The current EC code only adjusts the SMU’s wattage limits. I don’t think it adjusts temperature limits.