Irregularly high iGPU utilisation

Well, I thought I did. I just went back and checked my work and the patchfile I created seemed to be incomplete. Error on my part :confused:

I’m going to start fresh and I’ll let you know the results.

Fixed my diff file to apply the patch. I get the following error:

$ patch -p1 < ~/framework.diff 
patching file drivers/gpu/drm/amd/pm/amdgpu_pm.c
Hunk #29 succeeded at 2250 with fuzz 1 (offset -36 lines).
Hunk #30 succeeded at 2652 with fuzz 1 (offset 345 lines).
Hunk #31 succeeded at 2956 (offset 187 lines).
Hunk #32 succeeded at 2988 (offset 187 lines).
Hunk #33 succeeded at 3128 with fuzz 1 (offset 269 lines).
Hunk #34 succeeded at 3209 with fuzz 1 (offset 305 lines).
Hunk #35 succeeded at 3663 with fuzz 1 (offset 729 lines).
Hunk #36 succeeded at 3747 with fuzz 1 (offset 749 lines).
Hunk #37 succeeded at 4626 with fuzz 1 (offset 1597 lines).
Hunk #38 succeeded at 4671 with fuzz 1 (offset 1598 lines).
Hunk #39 FAILED at 3105.
Hunk #40 FAILED at 3245.
Hunk #41 FAILED at 3326.
Hunk #42 FAILED at 3784.
Hunk #43 FAILED at 3868.
Hunk #44 FAILED at 4755.
Hunk #45 FAILED at 4800.
7 out of 45 hunks FAILED -- saving rejects to file drivers/gpu/drm/amd/pm/amdgpu_pm.c.rej

Hmm, getting a little closer maybe? Doing some googling on “patch” I tried the following:

$ patch --verbose --dry-run --ignore-whitespace --fuzz 0 -p1 < ~/framework.patch 

And only 2 Hunks failed:

Hunk #29 FAILED at 2286.
Hunk #30 FAILED at 2307.

If you look at those two hunks in the .rej file, you can fix them manually. My kernel base is probably just newer than yours. If you can’t figure it out, let me know and I’ll try to do a backport for you.

@Mario_Limonciello here is the output (sorry for the delay):

$ cat drivers/gpu/drm/amd/pm/amdgpu_pm.c.rej
--- drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2286,7 +2286,7 @@ static ssize_t amdgpu_get_pm_policy_attr(struct device *dev,
 
 	if (amdgpu_in_reset(adev))
 		return -EPERM;
-	if (adev->in_suspend && !adev->in_runpm)
+	if (adev->in_suspend || adev->in_runpm)
 		return -EPERM;
 
 	return amdgpu_dpm_get_pm_policy_info(adev, policy_attr->id, buf);
@@ -2307,7 +2307,7 @@ static ssize_t amdgpu_set_pm_policy_attr(struct device *dev,
 
 	if (amdgpu_in_reset(adev))
 		return -EPERM;
-	if (adev->in_suspend && !adev->in_runpm)
+	if (adev->in_suspend || adev->in_runpm)
 		return -EPERM;
 
 	count = min(count, sizeof(tmp_buf));

I’m not really sure what to do w/this info sadly.
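
For anyone landing here with the same rejects, here is a rough sketch of what fixing a hunk by hand can look like. Each rejected hunk is just a -/+ pair plus the name of the function it belongs to, so the change can be made inside those two functions only. The stand-in file and the function hypothetical_other_attr are made up so the sketch runs outside a kernel tree; in a real tree SRC would point at drivers/gpu/drm/amd/pm/amdgpu_pm.c, and it is worth checking the grep -n output before editing, since the sed range below assumes each function ends with a closing brace in column 0.

```shell
#!/bin/sh
# Stand-in for drivers/gpu/drm/amd/pm/amdgpu_pm.c so the sketch runs anywhere.
# hypothetical_other_attr is invented; it only shows that code outside the
# two rejected functions is left untouched.
work=$(mktemp -d)
SRC="$work/amdgpu_pm.c"
cat > "$SRC" <<'EOF'
static ssize_t amdgpu_get_pm_policy_attr(struct device *dev)
{
	if (adev->in_suspend && !adev->in_runpm)
		return -EPERM;
	return 0;
}
static ssize_t amdgpu_set_pm_policy_attr(struct device *dev)
{
	if (adev->in_suspend && !adev->in_runpm)
		return -EPERM;
	return 0;
}
static ssize_t hypothetical_other_attr(struct device *dev)
{
	if (adev->in_suspend && !adev->in_runpm)
		return -EPERM;
	return 0;
}
EOF

# 1. Find where the "-" line from the .rej file lives in the current source.
grep -Fn 'adev->in_suspend && !adev->in_runpm' "$SRC"

# 2. Apply the -/+ pair only between each named function and its closing brace.
sed '/amdgpu_[gs]et_pm_policy_attr/,/^}/ s/adev->in_suspend && !adev->in_runpm/adev->in_suspend || adev->in_runpm/' \
	"$SRC" > "$SRC.new" && mv "$SRC.new" "$SRC"

# 3. Show the result: lines that now carry the "+" version.
grep -Fn 'adev->in_suspend || adev->in_runpm' "$SRC"
```

Opening the file in an editor and changing the two lines by hand works just as well; the sed range is only there to avoid touching other functions that happen to contain the same condition.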

Here’s a backport to 6.10.y you can use.

https://git.kernel.org/pub/scm/linux/kernel/git/superm1/linux.git/commit/?h=superm1/runtime-pm-dgpu-6.10.y&id=1a39cb3fdabd04e333c1cf012c08487d99fb2cf0

Thank ya sir. I am building it now.

I did have to run the patch command 2x, but after doing so it gave no errors :man_shrugging:

When I was researching that error this weekend, that was something I had come across as well.

Will see what happens.

Afternoon @Mario_Limonciello - I applied the patch, and # sensors does not wake the GPU when run. I can also confirm btop does not wake the GPU.

As expected, nvtop does wake the GPU.

The GUI program “Mission Center” does wake the GPU.

But this may be a “them” issue. Here is the link to their git repo - I am not sure what they use to pull their status info:

FWIW I only used this to test your patch. This is a non-issue for me (I only check temps w/ECTool now).

I am not positive if @Numerfolt is using the same software I used to test.

Can you also try nvtop? I bet it wakes it.

But as long as sensors works correctly now it’s on the right path and yes any software still waking it needs to have further adjustments.
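
A minimal sketch of how a "does this tool wake the dGPU" check could be scripted, assuming the dGPU exposes the standard runtime PM attribute power/runtime_status in sysfs (which reads as "active" or "suspended", among other values). The helper name wakes_gpu and the card1 path in the usage comment are assumptions; which cardN is the dGPU varies per system.

```shell
#!/bin/sh
# wakes_gpu STATUS_FILE COMMAND [ARGS...]
# Reads the runtime PM status before and after running COMMAND.
# Returns 0 if the device was suspended before and is no longer suspended after.
wakes_gpu() {
	status_file=$1
	shift
	before=$(cat "$status_file")
	"$@" >/dev/null 2>&1 || true   # the tool's own exit code is not what we test
	after=$(cat "$status_file")
	echo "before=$before after=$after"
	[ "$before" = "suspended" ] && [ "$after" != "suspended" ]
}

# Usage (card1 is an assumption; let the dGPU go to sleep first):
# wakes_gpu /sys/class/drm/card1/device/power/runtime_status sensors \
#	&& echo "sensors woke the dGPU"
```

The check only means something if the GPU was already suspended when the command started, which is why the helper also prints both states.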

Not sure if you saw my edit - nvtop does wake the GPU again.

I am going to stay on this kernel for a while - that way if you need me to provide anything I can.

OK that makes perfect sense, thanks! If/after this change is accepted into an upcoming kernel, it’s best to report a bug to both tools asking them to avoid waking the dGPU when it is already asleep.

I’ve submitted it for review here:
https://lore.kernel.org/amd-gfx/20240820020435.472490-1-superm1@kernel.org/T/#u

I am using the default KDE system monitor, but won’t be able to try out the kernel patch at the moment.