With UKI and Secure Boot, installing another kernel is not so easy. But thanks for the info.
I have the same issue on Fedora 41 on my Framework 16 with the discrete GPU module. The last working kernel I had was 6.13.5. I missed 6.13.6, so I haven’t tested that. Kernels 6.13.7 and 6.13.8 appear to have this issue, which essentially prevents any gaming for me at the moment.
I tested kernel 6.13.8 on my Arch Linux desktop system which has Nvidia graphics and it’s fine, so this appears to be something with the AMD GPU and the kernel. I haven’t had time to poke around dmesg logs though.
My workaround was to downgrade to kernel 6.13.5 until the issue is fixed in a newer version.
If a mod wants to merge my thread with this one, I think that could make sense. Both of these seem to be addressing the same/very similar issue.
dGPU Crashing - Fedora 41 - Framework Laptop 16 - Framework Community
Unfortunately I can’t test my Arch install further right now because my UEFI boot option on that drive appears to have disappeared, and the guide I wrote for fixing that issue no longer seems to be doing the trick.
Edit: I have put in a support ticket for this issue.
I switched from the relatively stable Arch install I’ve been using for the past couple of years to Fedora 41 over the weekend, and am having an issue where the dGPU seems to be crashing. Can anyone make sense of this journal log? The dGPU works fine on Windows, and I’ve not made any kernel tweaks beyond whatever has been provided by default.
I’m on the Plasma variation of Fedora 41.
When I first boot, the 7700S shows up in nvtop as a device, but eventually disappears (e.g. after launching Steam, but I don’t know if launching Steam is a coincidence or not).
I’ve looked over the Fedora setup guide and see nothing related to any special configuration needed to make the dGPU work.
Mar 24 20:32:23 fedora kernel: amdgpu 0000:03:00.0: amdgpu: MES FW versoin must be larger than 0x63 to support limit single process feature.
Mar 24 20:32:23 fedora kernel: amdgpu 0000:03:00.0: amdgpu: failed to change_config.
Mar 24 20:32:23 fedora kernel: amdgpu 0000:03:00.0: amdgpu: resume of IP block <mes_v11_0> failed -22
Mar 24 20:32:23 fedora kernel: amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-22).
Mar 24 20:32:25 fedora kernel: [drm] Register(0) [regUVD_PGFSM_STATUS] failed to reach value 0x00800000 != 0x00c00000n
Mar 24 20:32:25 fedora kernel: amdgpu 0000:03:00.0: [drm:jpeg_v4_0_set_powergating_state [amdgpu]] *ERROR* amdgpu: JPEG enable power gating failed
Mar 24 20:32:25 fedora kernel: [drm:amdgpu_device_ip_set_powergating_state [amdgpu]] *ERROR* set_powergating_state of IP block <jpeg_v4_0> failed -110
Mar 24 20:32:25 fedora kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000003n
Mar 24 20:32:25 fedora kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000003n
Mar 24 20:32:25 fedora kernel: amdgpu 0000:03:00.0: amdgpu: SMU: response:0xFFFFFFFF for index:43 param:0x00000000 message:PowerDownVcn?
Mar 24 20:32:25 fedora kernel: amdgpu 0000:03:00.0: amdgpu: Failed to power gate VCN!
Mar 24 20:32:25 fedora kernel: [drm:vcn_v4_0_stop [amdgpu]] *ERROR* Dpm disable uvd failed, ret = -121.
Mar 24 20:32:25 fedora kernel: amdgpu 0000:03:00.0: amdgpu: SMU: response:0xFFFFFFFF for index:36 param:0x00000001 message:SetWorkloadMask?
Mar 24 20:32:25 fedora kernel: amdgpu 0000:03:00.0: amdgpu: Failed to set workload mask 0x00000001
Mar 24 20:32:25 fedora kernel: amdgpu 0000:03:00.0: amdgpu: (-121) failed to disable video power profile mode
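For anyone trying to read the log above: the negative numbers are Linux errno values returned by the driver. Here is a minimal sketch, assuming the kernel log has been saved as plain text, that pulls out the "IP block <...> failed" lines and translates the codes. The names `LINUX_ERRNO`, `describe_errno` and `summarize_ip_block_failures` are my own for illustration and are not part of any existing tool.

```python
import re

# Linux errno values that appear in the log above.
# Hardcoded so the lookup gives Linux meanings on any platform.
LINUX_ERRNO = {
    22: "EINVAL (Invalid argument)",
    110: "ETIMEDOUT (Connection timed out)",
    121: "EREMOTEIO (Remote I/O error)",
}

# Matches "... IP block <name> failed -NN" as printed in the amdgpu lines above.
IP_BLOCK_RE = re.compile(r"IP block <(?P<block>[\w.]+)> failed (?P<code>-\d+)")


def describe_errno(code):
    """Translate a (possibly negative) errno value into a readable name."""
    value = abs(int(code))
    return LINUX_ERRNO.get(value, f"errno {value}")


def summarize_ip_block_failures(log_text):
    """Return [(ip_block, errno_description), ...] in log order."""
    return [
        (match.group("block"), describe_errno(match.group("code")))
        for match in IP_BLOCK_RE.finditer(log_text)
    ]


# Two lines copied from the log above (timestamps dropped).
sample = """\
amdgpu 0000:03:00.0: amdgpu: resume of IP block <mes_v11_0> failed -22
[drm:amdgpu_device_ip_set_powergating_state [amdgpu]] *ERROR* set_powergating_state of IP block <jpeg_v4_0> failed -110
"""

for block, description in summarize_ip_block_failures(sample):
    print(f"{block}: {description}")
```

The first failure in the log is the MES block failing to resume with EINVAL; the JPEG/VCN timeouts come after it. The -121 values on the "Dpm disable uvd" and "video power profile" lines use a different message format, so they are not matched by the pattern, but `describe_errno(-121)` will translate them.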
lspci
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600] (rev c1)
The Vulkan summary shows this warning:
WARNING: [../src/amd/vulkan/radv_physical_device.c:2025] Code 0 : Could not open device /dev/dri/renderD128: Invalid argument (VK_ERROR_INCOMPATIBLE_DRIVER)
TU: error: ../src/freedreno/vulkan/tu_knl.cc:385: failed to open device /dev/dri/renderD128 (VK_ERROR_INCOMPATIBLE_DRIVER)
==========
VULKANINFO
==========
Vulkan Instance Version: 1.4.304
Instance Extensions: count = 24
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_EXT_headless_surface : extension revision 1
VK_EXT_surface_maintenance1 : extension revision 1
VK_EXT_swapchain_colorspace : extension revision 5
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
VK_LUNARG_direct_driver_loading : extension revision 1
Instance Layers: count = 5
--------------------------
VK_LAYER_FROG_gamescope_wsi_x86_64 Gamescope WSI (XWayland Bypass) Layer (x86_64) 1.3.221 version 1
VK_LAYER_MANGOHUD_overlay_x86 Vulkan Hud Overlay 1.3.0 version 1
VK_LAYER_MANGOHUD_overlay_x86_64 Vulkan Hud Overlay 1.3.0 version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.4.303 version 1
VK_LAYER_VKBASALT_post_processing a post processing layer 1.3.223 version 1
Devices:
========
GPU0:
apiVersion = 1.4.305
driverVersion = 25.0.1
vendorID = 0x1002
deviceID = 0x15bf
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = AMD Radeon 780M (RADV PHOENIX)
driverID = DRIVER_ID_MESA_RADV
driverName = radv
driverInfo = Mesa 25.0.1
conformanceVersion = 0.0.0.0
deviceUUID = 00000000-c500-0000-0000-000000000000
driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
apiVersion = 1.4.305
driverVersion = 0.0.1
vendorID = 0x10005
deviceID = 0x0000
deviceType = PHYSICAL_DEVICE_TYPE_CPU
deviceName = llvmpipe (LLVM 19.1.7, 256 bits)
driverID = DRIVER_ID_MESA_LLVMPIPE
driverName = llvmpipe
driverInfo = Mesa 25.0.1 (LLVM 19.1.7)
conformanceVersion = 1.3.1.1
deviceUUID = 6d657361-3235-2e30-2e31-000000000000
driverUUID = 6c6c766d-7069-7065-5555-494400000000
I captured another vulkan summary before the dGPU crashed and this is what it produced, but the graphics card disappeared almost immediately after running that.
==========
VULKANINFO
==========
Vulkan Instance Version: 1.4.304
Instance Extensions: count = 24
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_EXT_headless_surface : extension revision 1
VK_EXT_surface_maintenance1 : extension revision 1
VK_EXT_swapchain_colorspace : extension revision 5
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
VK_LUNARG_direct_driver_loading : extension revision 1
Instance Layers: count = 5
--------------------------
VK_LAYER_FROG_gamescope_wsi_x86_64 Gamescope WSI (XWayland Bypass) Layer (x86_64) 1.3.221 version 1
VK_LAYER_MANGOHUD_overlay_x86 Vulkan Hud Overlay 1.3.0 version 1
VK_LAYER_MANGOHUD_overlay_x86_64 Vulkan Hud Overlay 1.3.0 version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.4.303 version 1
VK_LAYER_VKBASALT_post_processing a post processing layer 1.3.223 version 1
Devices:
========
GPU0:
apiVersion = 1.4.305
driverVersion = 25.0.1
vendorID = 0x1002
deviceID = 0x15bf
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = AMD Radeon 780M (RADV PHOENIX)
driverID = DRIVER_ID_MESA_RADV
driverName = radv
driverInfo = Mesa 25.0.1
conformanceVersion = 0.0.0.0
deviceUUID = 00000000-c500-0000-0000-000000000000
driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
apiVersion = 1.4.305
driverVersion = 25.0.1
vendorID = 0x1002
deviceID = 0x7480
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = AMD Radeon RX 7700S (RADV NAVI33)
driverID = DRIVER_ID_MESA_RADV
driverName = radv
driverInfo = Mesa 25.0.1
conformanceVersion = 0.0.0.0
deviceUUID = 00000000-0300-0000-0000-000000000000
driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU2:
apiVersion = 1.4.305
driverVersion = 0.0.1
vendorID = 0x10005
deviceID = 0x0000
deviceType = PHYSICAL_DEVICE_TYPE_CPU
deviceName = llvmpipe (LLVM 19.1.7, 256 bits)
driverID = DRIVER_ID_MESA_LLVMPIPE
driverName = llvmpipe
driverInfo = Mesa 25.0.1 (LLVM 19.1.7)
conformanceVersion = 1.3.1.1
deviceUUID = 6d657361-3235-2e30-2e31-000000000000
driverUUID = 6c6c766d-7069-7065-5555-494400000000
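Comparing the two summaries by eye is tedious, so here is a rough sketch of how the check could be automated. It assumes the text came from `vulkaninfo --summary` and follows the `GPUn:` / `key = value` layout shown above. The function names `list_vulkan_devices` and `has_discrete_gpu` are hypothetical helpers I made up for this post.

```python
import re


def list_vulkan_devices(summary_text):
    """Return [(device_type, device_name), ...] from a vulkaninfo summary."""
    devices = []
    current = None  # stays None until the first "GPUn:" header is seen
    for raw_line in summary_text.splitlines():
        line = raw_line.strip()
        if re.fullmatch(r"GPU\d+:", line):
            current = {}
            devices.append(current)
            continue
        key, sep, value = line.partition("=")
        # Ignore "key = value" lines that appear before any device section.
        if sep and current is not None:
            current[key.strip()] = value.strip()
    return [(d.get("deviceType", "?"), d.get("deviceName", "?")) for d in devices]


def has_discrete_gpu(summary_text):
    """True if any listed device reports itself as a discrete GPU."""
    return any(
        device_type == "PHYSICAL_DEVICE_TYPE_DISCRETE_GPU"
        for device_type, _ in list_vulkan_devices(summary_text)
    )


# Trimmed excerpt of the summary taken after the dGPU disappeared.
sample = """\
Instance Extensions: count = 24
GPU0:
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = AMD Radeon 780M (RADV PHOENIX)
GPU1:
deviceType = PHYSICAL_DEVICE_TYPE_CPU
deviceName = llvmpipe (LLVM 19.1.7, 256 bits)
"""

for device_type, device_name in list_vulkan_devices(sample):
    print(device_type, "|", device_name)
print("discrete GPU present:", has_discrete_gpu(sample))
```

Run against the first summary, this should report no discrete GPU. Run against the second one, it should list the RX 7700S as a third device.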
I believe you are experiencing an issue that’s been reported by several other users on this forum, where a regression in kernel 6.13.? causes the dedicated GPU to crash. There seems to be evidence that Mesa 25 and/or the latest AMD firmware is also involved in the issue.
For Fedora, switching to the LTS kernel, which should still be on 6.12, should resolve it, possibly together with downgrading Mesa to 24 as well.
In testing, the latest Arch kernel (6.13.8) appears to have resolved the issue, but more testing is needed to confirm that it has, and that I’m not just failing to observe any problems with it yet. If this version does resolve the issue, it should hit Fedora mainline relatively soon too.
I looked for several minutes before posting my thread for something similar, but couldn’t find anything. Re-checking now, I can see that I did a poor job of it. I’m linking to related thread(s) here for completeness and convenience.
If I’ve missed one, feel free to share it.
I had switched from Arch to Fedora largely because I decided I was willing to trade “bleeding edge” updates in exchange for stability and reliability. This is quite a disappointing intro to the distro on my FW 16.
Fedora may be a bit slower moving than Arch, but it’s still a fast paced distro and is liable to cause a cut or three from time to time.
If you want stability and reliability, you want an Ubuntu-based distro, or if Canonical’s shenanigans are not your cup of tea, a Debian-based distro.
Some other threads that may help you isolate the issue and apply a workaround (installing and using the LTS kernel seems to be the best solution for the moment):
Hmm, I’ll give it some thought as to whether or not I want to stick with Fedora then. My Arch install just keeps working, but the longer it works the more I forget regarding how to fix it.
Thanks for sharing. The first one looks to be the same/similar issue.
It’s just annoying when the support asks if Fedora has been tried, since this is probably the reference distro. That’s why I currently also have Fedora running, for example, because I have a few bugs with the graphics card.
I went back and installed both 6.13.5 and 6.13.6.
Kernels that have not worked:
6.13.8
6.13.7
6.13.6
Kernels that did work:
6.13.5
Running 6.13.5 would appear to be a feasible workaround for the time being.
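In case it helps anyone check their own machine against the results above, here is a small sketch. The two sets only reflect what I tested in this thread and are not an authoritative list of affected kernels. The release string format (`6.13.5-200.fc41.x86_64`) is an assumed Fedora-style example, and the helper names are made up.

```python
import re

# Results reported in this thread only; other hardware or distros may differ.
REPORTED_BROKEN = {(6, 13, 6), (6, 13, 7), (6, 13, 8)}
REPORTED_WORKING = {(6, 13, 5)}


def parse_kernel_release(release):
    """Extract (major, minor, patch) from a string like '6.13.5-200.fc41.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def thread_status(release):
    """Classify a kernel release against the results reported in this thread."""
    version = parse_kernel_release(release)
    if version in REPORTED_BROKEN:
        return "reported broken"
    if version in REPORTED_WORKING:
        return "reported working"
    return "not tested in this thread"


# Hypothetical release strings for illustration.
for release in ("6.13.5-200.fc41.x86_64", "6.13.8-200.fc41.x86_64", "6.13.9-200.fc41.x86_64"):
    print(release, "->", thread_status(release))
```

On a running Linux system, `platform.release()` from the Python standard library returns the current kernel release string, which could be passed to `thread_status` instead of the sample strings.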
I used the instructions for Fedora 41 here: Installing Kernel from Koji :: Fedora Docs
I’ll also note that my touchpad didn’t work when I booted into 6.13.5, but I haven’t looked into it yet. Update: It looks like this may be the old touchpad bug. When I rebooted and waited 10-30 seconds before logging in, the touchpad worked.
I think the issue is the one reported here: dGPU crashing - MES FW versoin must be larger than 0x63 to support limit single process feature. (#4083) · Issues · drm / amd · GitLab
Using the kernel with the AMD patch in it I was able to suspend and wake the machine 10 times in a row so far.
I can confirm this. On Fedora Silverblue, I rolled back to the earlier kernel with the following command:
sudo rpm-ostree override replace "https://koji.fedoraproject.org/koji/buildinfo?buildID=2667987"
Once the fix has landed in the repos, you would need to reset the override with:
sudo rpm-ostree override reset <kernel packages that have been overridden>
I’m relatively new to rpm. Would this workaround strategy allow me to continue updating other packages? Right now I’m avoiding updating my system packages because it looks like it would override my kernel downgrade.
I can’t say for sure, it’s best to ask in the forum. As far as I understand it, yes, because it is a replacement.
I’ve noticed this thread is being merged and cited in a lot of different places. Perhaps the title should be changed to something a little more eye catching now that we know more about the issue. Perhaps something like dGPU Non-functional w/ Linux Kernel 6.13.6 -> 6.13.8
I am currently being offered 6.13.9, has anyone tried it yet?
I have not. I checked last night and only 6.13.8 was available, so it must’ve been released today.
Yesterday I performed an update in which the kernel 6.13.9 was applied, but the override was retained.
Have you checked to see whether the issue persists in the 6.13.9 update?