Unable to throttle dGPU wattage on Radeon RX 7700S

Edit: It isn’t currently possible to lower the power cap of the dGPU on Linux, but it is possible to raise it (overclock) by 20%. The currently supported range of the power1_cap file on Linux is between 100000000 and 120000000 (the file is in microwatts, so 100W to 120W); see steps to change this below.


Hey all. I’ve been trying to tinker with power settings, primarily because the fans are loud on one of my games and I want to see if I can have good performance when the fans are quiet.

Studying the power management Arch Wiki and subsequent Ryzen and AMDGPU Arch Wikis, I’ve begun by looking into the following method.

(Edit: the same issue affects Ubuntu 24.04 as well; I have confirmed it there.)

  1. Add amdgpu.ppfeaturemask=0xfff7ffff to the kernel parameters
  2. Write a smaller value to the dGPU’s power1_cap file.

The only power1_cap file I see is in /sys/class/drm/card1/device/hwmon/hwmon6/, which I’m hoping is the dGPU and not the iGPU. But I’m just embarking on this journey, so feedback is welcome. At any rate, even with elevated privileges, I’m unable to change this file: I get either a permission error or a disk-full error.

I’ve explained what I’m trying to do and what I’ve tried. Can anyone speak intelligently to this? I would love to be able to control this more.

Edit: System Details:

  • System: Framework Laptop 16
  • CPU: AMD Ryzen 9 7940HS
  • iGPU: AMD ATI Radeon 780M
  • dGPU: AMD ATI Radeon RX 7700S
  • OS: Both Arch Linux (6.9.6-arch1-1) and Ubuntu 24.04
  • Bootloader: rEFInd

/boot/refind_linux.conf: Hastebin
sudo dmesg: Hastebin

You can verify that easily by running:
cat /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
(Beware: the cardX and hwmonX numbers often change between reboots, especially with multiple GPUs in one system!)
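Since those numbers move around, a small loop can locate the file instead of hard-coding the path. This is only a sketch: list_power_caps is a made-up helper name, and it assumes the usual /sys/class/drm/cardX/device/hwmon/hwmonX layout seen in this thread. The optional argument is an alternate sysfs root, which is handy for trying it out without touching real hardware.

```shell
# List every power1_cap file under a sysfs root (default: /sys),
# printing the raw value and the same value converted to watts.
# power1_cap is expressed in microwatts, so 100000000 means 100 W.
# NOTE: list_power_caps is a hypothetical helper, not part of any tool.
list_power_caps() {
    root="${1:-/sys}"
    for f in "$root"/class/drm/card*/device/hwmon/hwmon*/power1_cap; do
        # Skip the literal pattern when the glob matches nothing.
        [ -r "$f" ] || continue
        uw=$(cat "$f")
        echo "$f: $uw uW ($((uw / 1000000)) W)"
    done
}
```

Calling list_power_caps with no arguments scans the live /sys tree and prints one line per GPU that exposes a power cap.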

If it is less than 50000000 (50 watts; the value is in microwatts), it’s an integrated GPU. If it’s more, it’s your dedicated GPU.

In practice, I’m almost sure it’s your dedicated GPU, since you usually can’t mess around with the power cap of an integrated GPU: it is part of a complex algorithm that balances the CPU and GPU power limits. For example, with a 100W TDP CPU, either the CPU can go up to 65W and the iGPU gets 35W (a simplification), or the GPU gets something like 45W and the CPU gets the rest. There is still an individual TDP for the CPU (65W in the example) and for the GPU (45W in the example) that they will never exceed, so you cannot run the CPU at 99W and the GPU at 1W; the scheme just allows dynamic allocation of power when needed.

Beware that some mobile GPUs don’t allow their TDP to be modified. I don’t know about this GPU, but as it’s AMD and Framework, I believe they will let you modify it. Make sure the kernel parameter is added and that you regenerated your grub config, then reboot, and then try to modify the value.

Also, for example, to set the TDP to 100W, you need root privileges to run echo 100000000 > /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap. Note that sudo echo 100000000 > /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap will NOT work: because of the way the shell handles redirections, only echo runs as root, while the file is opened for writing by your own unprivileged shell. You need to sudo su and then enter that command.

Please post more details about your OS, bootloader (grub/systemd-boot?), GPU, and system! I assume you are using a FW 16 with the dedicated AMD GPU; next time, please consider telling us your configuration, especially for technical tickets like this. Also please post your error and the output of the dmesg command (it will allow me to see if your kernel parameters are set right; redact anything you don’t wish to share from there)!

PS: amdgpu.ppfeaturemask=0xffffffff usually doesn’t work, especially for mobile GPUs, because you cannot unlock absolutely everything so it won’t apply and won’t unlock anything! You need to run the command
printf 'amdgpu.ppfeaturemask=0x%x\n' "$(($(cat /sys/module/amdgpu/parameters/ppfeaturemask) | 0x4000))"
and use that as the kernel parameter, like the wiki says to do!
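After rebooting, one way to double-check that the parameter took effect is to test the running mask for the 0x4000 bit that the printf command above ORs in. A minimal sketch: has_overdrive_bit is a name made up for illustration, and it assumes 0x4000 is the bit that matters here (the OverDrive bit, as far as I know), which is what that command implies.

```shell
# Succeed (exit status 0) if the 0x4000 bit is set in a ppfeaturemask value.
# The value may be given in hex (0xfff7ffff) or decimal.
# NOTE: has_overdrive_bit is a hypothetical helper, not an amdgpu tool.
has_overdrive_bit() {
    mask="$1"
    [ $(( $mask & 0x4000 )) -ne 0 ]
}

# On a live system, the mask can be read from the same file the printf
# command above uses:
#   has_overdrive_bit "$(cat /sys/module/amdgpu/parameters/ppfeaturemask)" \
#       && echo "0x4000 bit is set"
```

Both 0xfff7ffff and 0xffffffff pass this check, so it only tells you the bit is set in the mask, not that the GPU firmware accepts the cap you write.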

A common trick to accomplish this with a single command line is to use tee:

echo 100000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap

(Note that this will also echo to the standard output; you could suppress that by adding >/dev/null at the end of the line. But, in this particular case, that’s hardly worth making the command longer.)

Yeah tee is wayyy better than echo or cat imo as it works with a direct sudo call. Thanks for the heads up.

I usually use

sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap <<'EOF'
100000000
EOF

That even works with multiple line files and variables won’t get interpreted. Pretty dope.

Thanks for chiming in, @Adr and @Wrybill_Plover!

That’s good to know–then yes, I can confirm it’s the dGPU.

Oh, I know. Here’s the output of tee, for what it’s worth (which is identical to what @Wrybill_Plover mentioned).

~ echo 50000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
Place your right index finger on the fingerprint reader
50000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument

You’re right assuming it’s a Framework Laptop 16 with the AMD Radeon™ RX 7700S, but here are the other details, including dmesg and bootloader config file.

  • System: Framework Laptop 16
  • CPU: AMD Ryzen 9 7940HS
  • iGPU: AMD ATI Radeon 780M
  • dGPU: AMD ATI Radeon RX 7700S
  • OS: Arch Linux (6.9.6-arch1-1)
  • Bootloader: rEFInd

/boot/refind_linux.conf: Hastebin
sudo dmesg: Hastebin

Great to know the right way, but I did try both, and anyway you can see I have 0xfff7ffff in my bootloader/dmesg.

Can someone confirm they’ve actually done this on the Framework Laptop 16 dGPU? No one has, which makes me skeptical it works for anyone?

That means the TDP value is invalid, but you have successfully unlocked your GPU via the 0xfff7ffff kernel parameter, which by the way is the right combination for your GPU, so that means it works.

Try lower and higher TDP values, like 100W or 30W. If you always get this error then maybe it is locked; try asking Framework support, they will know better than me. But I really think that error message means it’s not locked, just that the TDP is out of range.

I have a friend with a FW 16 however he doesn’t have the RX 7700, so I won’t be able to test that for you.

Oh, now that is very interesting. You’re right, 100W works, and it’s the only one that works (the last in a sequence of tries, starting from 10W, below). I will put in a support ticket for this sometime.

➜  ~ echo 10000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
10000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 20000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
20000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 30000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
30000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 40000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
40000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 50000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
50000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 60000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
60000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 70000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
70000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 80000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
80000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 90000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
90000000
tee: /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap: Invalid argument
➜  ~ echo 100000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon6/power1_cap
100000000

Hah, interesting. Maybe it’s a driver bug? Worth asking the Framework support team and investigating. 100W happens to be the TGP (= TDP, but for GPUs) of this GPU; maybe it just cannot be changed, maybe it’s a bug, or maybe there is a mode where you can unlock it but it requires disabling the iGPU for reasons mentioned here: FW 13 Desktop iGPU + dGPU - #4 by Adr

iGPU/dGPU/external GPU dynamics are complex, especially on a laptop where power usage is the utmost priority!


Hey, update on this issue (which is still ongoing):

I did start a ticket with the Framework team, who basically asked me to repro the issue on the officially supported distribution, which I did, and shared the info they wanted about that environment. Haven’t heard back yet.

They did share this recent patch, possibly related. I don’t have time to compile my kernel to test that, but I did email the folks on the patch to get details and see if they know.

Nice, yeah that’s promising! I think your issue will be fixed anyways. It’s a pretty major one and may touch everybody imo.

Thanks! I hope so.

I also heard back from the AMD engineer on that patch; he said:

The driver limits the power caps to what was validated by the AIB or OEM. If you want to use lower power caps, you’ll need to remove those checks from the driver.

And looking into it more, I see this very interesting thread from March. Apparently AMD has gotten absolutist about requiring GPU AIBs/OEMs to implement their own power ranges, or else it doesn’t let you change anything. As such, “you’ll need to remove those checks from your driver,” means I need to perpetually patch my kernel, or hope and beg Framework to fix it in a firmware update, if they even can.

Actually, I have found you can overclock and raise the value as high as 120W (or 120000000). So the range is 100000000 - 120000000 at this time.
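For anyone else checking their own machine: rather than probing values one by one, the accepted range can usually be read from the power1_cap_min and power1_cap_max files sitting next to power1_cap. Below is a rough sketch that checks a requested value against them before writing anything. cap_in_range is a made-up helper name, and it assumes your kernel exposes both files in the hwmon directory.

```shell
# Check a requested power cap (in microwatts) against the limits that the
# driver reports in a given hwmon directory.
# Returns 0 if in range, 1 if out of range, 2 if the limits cannot be read.
# NOTE: cap_in_range is a hypothetical helper; it assumes power1_cap_min
# and power1_cap_max exist alongside power1_cap.
cap_in_range() {
    dir="$1"
    want="$2"
    min=$(cat "$dir/power1_cap_min") || return 2
    max=$(cat "$dir/power1_cap_max") || return 2
    if [ "$want" -lt "$min" ] || [ "$want" -gt "$max" ]; then
        echo "$want uW is outside the allowed range $min..$max uW" >&2
        return 1
    fi
    echo "$want uW is within the allowed range $min..$max uW"
}
```

With the paths from this thread, something like cap_in_range /sys/class/drm/card1/device/hwmon/hwmon6 110000000 should report whether 110W would be accepted, without needing root.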

well this is bewildering. i just updated from ubuntu 22.04 lts to 24.04 lts and discovered i couldn’t throttle my 7700s anymore, and a lot of searching EVENTUALLY landed me here.

truly bizarre and frustrating.

edit: as a side note, with the 6.5 kernel, the range was 0 to 100. now it’s 100 to 120. this is kinda absurd to me, we already have power issues on this machine without raising that value.

WHY are these the limits? are they some weird default that amd set that framework didn’t alter because they used to not be relevant? did framework ACTUALLY set these?