Debugging dGPU not suspending?

Which Linux distro are you using?
Fedora 40

(If rolling release, last date updated?)

Which kernel are you using?
Linux 6.10.6-200.fc40.x86_64 x86_64

Which BIOS version are you using?
03.03

Which Framework Laptop 16 model are you using? (AMD Ryzen™ 7040 Series)
AMD Ryzen™ 7040 Series

I have an issue where the dGPU never seems to suspend, even when nothing appears to be using it.
I have been using nvtop to see what is using the dGPU, then exiting nvtop and checking cat /sys/class/drm/card*/device/power/runtime_status
which gives me:

active
unsupported
unsupported
unsupported
active
unsupported
unsupported
unsupported
unsupported
unsupported
unsupported
unsupported
unsupported
unsupported
unsupported

So presumably that means both GPUs (the iGPU and the dGPU) are awake.
And my battery life (~4h) also indicates that it does indeed stay awake.
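
The raw output above is hard to read because the card* glob prints one unlabeled line per match. The many unsupported lines most likely come from connector entries (names like card1-eDP-1, used here only as an illustration) that the glob also matches. A minimal Python sketch that prints each entry next to its status, assuming the standard /sys/class/drm layout (the function name read_runtime_status is made up for this example):

```python
from pathlib import Path


def read_runtime_status(drm_root="/sys/class/drm"):
    """Return {entry name: runtime_status} for every card* entry under drm_root."""
    statuses = {}
    for entry in sorted(Path(drm_root).glob("card*")):
        status_file = entry / "device" / "power" / "runtime_status"
        try:
            statuses[entry.name] = status_file.read_text().strip()
        except OSError:
            # No runtime_status file for this entry, or it is unreadable: skip it.
            continue
    return statuses


if __name__ == "__main__":
    for name, status in read_runtime_status().items():
        print(f"{name}: {status}")
```

Only the plain cardN entries (no connector suffix) should report active or suspended. Which index is the dGPU can differ between systems.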

I have previously used supergfxctl to disable the dGPU entirely which extends my battery life by another 4 or so hours (up to around 8h in total).

It would be very nice to actually get that battery life without entirely disabling the dGPU.

So does anyone know how to debug or otherwise fix this?

The most common cause of the dGPU staying awake is sensor monitoring software. Running nvtop will also keep it awake.

Yeah, but I only used nvtop to locate the processes.
Then I exit nvtop and wait, and it never suspends.

I needed that suggestion it seems.
Started looking into everything I had installed, and it ended up being one of my gnome extensions keeping it alive.
So now it finally suspends. Thanks!

Wish it was easier to locate these things though
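
One way to narrow things down is to list which processes hold a GPU device node open. The sketch below scans /proc/<pid>/fd for links into /dev/dri/. It is only a hint, not a definitive check: an open handle does not by itself prove a process is keeping the dGPU awake, and it will not catch software that wakes the GPU by reading sensor files (as Vitals did here). The function names are made up for this example, and seeing other users' processes requires root.

```python
import os


def find_dri_openers(proc_root="/proc", device_prefix="/dev/dri/"):
    """Map PID -> set of device paths under device_prefix that the process has open."""
    openers = {}
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # The process exited, or we lack permission to inspect it.
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target.startswith(device_prefix):
                openers.setdefault(int(pid), set()).add(target)
    return openers


def process_name(pid, proc_root="/proc"):
    """Return the short command name of a process, or '?' if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as f:
            return f.read().strip()
    except OSError:
        return "?"


if __name__ == "__main__":
    for pid, devices in sorted(find_dri_openers().items()):
        print(pid, process_name(pid), ", ".join(sorted(devices)))
```

Which /dev/dri node belongs to the dGPU varies by system, so the output still has to be matched against the card in question.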

Which Gnome extension (if you don’t mind my asking)?

It was Vitals.
I’ve been using it to keep an eye on memory usage.

Turns out I had “Monitor temperature” checked even though I wasn’t showing it, and that was keeping the dGPU awake for me.
So disabling that immediately suspended the dGPU.

Thanks!

I’m going to be making a Linux desktop my daily driver (so I’m coming up to speed on options and capabilities). Could you tell me if this is a viable (at least temporary) solution to the “bug”?

This is the (shortened) suggestion from ChatGPT:

  1. Create a Script for Managing the Vitals Extension:
    Create a script that will enable or disable the Temperature feature based on the suspend/resume state.
  2. Create a Systemd Service for Suspend/Resume:
    To automatically run this script during suspend/resume, create a new systemd service.

Is this viable? Let me know if you try it and it works (I’ve got at least a week’s wait before my laptop ships).

Thanks.

For Vitals specifically it is easy to solve. You just need to turn the “Monitor temperature” option off and it will stop keeping the dGPU awake.

As for figuring out which other apps were misbehaving I just ended up using cat /sys/class/drm/card1/device/power/runtime_status which prints suspended when the dGPU is asleep, or active when it’s not.
So I have been using that to determine when it’s sleeping properly.
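
Rather than re-running cat by hand, the same check can be polled so that every change between active and suspended is logged with a timestamp while apps are opened and closed. A minimal Python sketch, assuming the card1 path from this thread (the index may differ on other systems) and a made-up helper name watch_transitions. As far as I know, reading runtime_status does not itself wake the device.

```python
import time
from datetime import datetime


def read_status(path):
    """Read the current runtime PM status string from a sysfs file."""
    with open(path) as f:
        return f.read().strip()


def watch_transitions(path, interval=2.0, max_reads=None,
                      read=read_status, sleep=time.sleep):
    """Yield (timestamp, status) each time the status differs from the last reading."""
    previous = None
    reads = 0
    while max_reads is None or reads < max_reads:
        status = read(path)
        reads += 1
        if status != previous:
            yield datetime.now(), status
            previous = status
        if max_reads is None or reads < max_reads:
            sleep(interval)


if __name__ == "__main__":
    # card1 is the path used in this thread; adjust it for your system.
    status_path = "/sys/class/drm/card1/device/power/runtime_status"
    for when, status in watch_transitions(status_path):
        print(f"{when:%H:%M:%S} {status}", flush=True)
```

Leave it running in a terminal, then start or quit one suspect application at a time and watch for the status to flip.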

I ended up having to stop using a Chromium-based browser, because it was waking the dGPU constantly. Judging by search results, this seems to be a well-known problem.
So I ended up going with Firefox, which so far hasn’t woken it at all.
I’m glad to see that Firefox has been much improved over when I tried it many years ago

I think you are describing a different problem (ensuring the dGPU goes into a low-power state when the whole computer suspends) from the one OP had (the dGPU would not suspend while the system was awake, which in this case turned out to be a sensor monitoring application keeping it powered to read its temperature).

Thanks…I didn’t catch that. Appreciated.