[TRACKING] Hard freezing on Fedora 36 with the new 12th gen system

Just upgraded F36 (KDE) from kernel 6.0.5 to 6.0.8 (plus whatever other packages I had outstanding) and removed the PSR line with sudo grubby --update-kernel=ALL --remove-args="i915.enable_psr=0". I haven’t seen anything bad in dmesg, but it is blocking me from unlocking the laptop. Specifically: wake the laptop up, enter password/fingerprint, click the ‘unlock’ button… nothing. It acts like I haven’t clicked it.

After re-applying sudo grubby --update-kernel=ALL --args="i915.enable_psr=0", then reboot, login, sleep and wakeup, the problem does not occur.
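For anyone toggling this argument back and forth, it may help to confirm what the running kernel was actually booted with. A minimal sketch, assuming the booted command line is readable from /proc/cmdline; has_kernel_arg is a helper name made up for this post, not something grubby provides:

```shell
# has_kernel_arg is a hypothetical helper (not part of grubby or the kernel).
# It succeeds when the exact argument appears as a whole word in the given
# command line. With no second argument it reads the booted /proc/cmdline.
has_kernel_arg() {
    arg="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    # Pad with spaces so only whole, space-separated arguments match.
    case " $cmdline " in
        *" $arg "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage would be something like has_kernel_arg i915.enable_psr=0 && echo "PSR is disabled on this boot". As a cross-check, sudo cat /sys/module/i915/parameters/enable_psr should show the value the driver loaded with, assuming that file is exposed on your kernel.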

My 12th gen board is installed, and Fedora 37 is installed as well. Going to spend some time trying to replicate this. I’m also following this thread.

Going to leave GNOME Software open for an hour, see if I can replicate any of this.

Also, @nadb: docks and Bluetooth will be a challenge, I suspect. For now (as I believe was stated above), btusb.enable_autosuspend=0 is one option.

sudo grubby --update-kernel /boot/vmlinuz-LINUX-KERNEL-VERSION --args="btusb.enable_autosuspend=0"
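If you only want to touch the kernel you are currently running, the placeholder above can be filled in from uname -r. A rough sketch, assuming the usual Fedora layout where the image lives at /boot/vmlinuz-<release>; kernel_image_path is a hypothetical helper name:

```shell
# kernel_image_path is a hypothetical helper: it prints the /boot path
# for a given kernel release (default: the release reported by uname -r).
# Assumption: images are named /boot/vmlinuz-<release>, as on stock Fedora.
kernel_image_path() {
    release="${1:-$(uname -r)}"
    printf '/boot/vmlinuz-%s\n' "$release"
}
```

Then: sudo grubby --update-kernel "$(kernel_image_path)" --args="btusb.enable_autosuspend=0". After a reboot, cat /sys/module/btusb/parameters/enable_autosuspend should print N if the argument took effect.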

Fedora 37
Kernel 6.0.7-301…

I’ve bookmarked this for an update for tomorrow.

@Matt_Hartley I have not run into the hard freeze since updating to kernel 6.0.7. Now on 6.0.8. I am beginning to suspect the power delivery issue is actually something to do with the way Thunderbolt 4 handles it, or rather it is not recognizing it properly. There has been commentary elsewhere that a thunderbolt 4 cable might fix it, but I am planning on upgrading anyway, so not really a big deal. Just annoying, and in this case probably not a Framework firmware issue.

The lockups are very random, I think, so they’re quite hard to replicate. I used to experience them without i915.enable_psr=0. With this kernel param, there are no more lockups in normal usage, but it still freezes when I share my screen on a video call.

I dug into the kernel config and found that the "Asynchronous wait on fence ... time out" message in the kernel log may have something to do with these configuration options: linux/drivers/gpu/drm/i915/Kconfig.profile at master · torvalds/linux · GitHub

For now I’m trying the settings below. I still cannot confirm whether the problem has been fixed or not.

sudo -i
echo 60000 > /sys/class/drm/card0/engine/rcs0/preempt_timeout_ms;
echo 60000 > /sys/class/drm/card0/engine/rcs0/heartbeat_interval_ms;
echo 60000 > /sys/class/drm/card0/engine/rcs0/stop_timeout_ms;
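Before overwriting these, it might be worth recording what the values were, so the change can be undone and compared. A small sketch under the assumption that the three files exist at the rcs0 path used above; show_engine_timeouts is a made-up helper name:

```shell
# show_engine_timeouts is a hypothetical helper: it prints "name=value" for
# the three knobs changed above, reading from the given engine directory
# (default: the rcs0 path used in this post).
show_engine_timeouts() {
    dir="${1:-/sys/class/drm/card0/engine/rcs0}"
    for knob in preempt_timeout_ms heartbeat_interval_ms stop_timeout_ms; do
        if [ -r "$dir/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$dir/$knob")"
        else
            # The file is missing or not readable on this kernel/hardware.
            printf '%s=unreadable\n' "$knob"
        fi
    done
}
```

Run show_engine_timeouts once before and once after the echo commands. Note that values written to sysfs do not survive a reboot, so they would have to be re-applied on each boot while testing.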

I’d just like to add that the issue isn’t Framework-specific. That ‘GPU BUG: 0:12:0.0000’ (or whatever it is) shows up across a number of manufacturers, such as Acer and Lenovo hardware.

I upgraded to F37 last night too (after my last post).

In the time I’ve been using it (1-2 weeks) I haven’t found any negative consequences with psr=0. I haven’t looked into what could occur or even what its purpose is, so I’m happy enough to keep running it long term.
I took a quick glance at the kernel.org changelog pages and did a page search for psr and i915, but couldn’t see anything since 6.0.5 to indicate that any psr- or i915-related freezing issues have been worked on.

Okay, appreciate the update on this. I’ve been living in Fedora 37 on the 12th gen myself. No issues to speak of on 6.0.8.300.fc3

Just an update on what I’ve been seeing. I had a hard freeze seemingly at random earlier today and then another just now while I was playing the tower swap browser game on chrome, so I think maybe the game itself can independently cause freezes for me. I’m confident that this game wasn’t triggering freezes when I was still using fedora 35 since I was playing it for upwards of several hours at a time without any issues.

If anyone else wants to try to replicate, I’m using Chrome Version 107.0.5304.110 (Official Build) (64-bit), the game can be found here: Tower Swap, and my system information is below (copied from the page in KDE settings).

Operating System: Fedora Linux 36
KDE Plasma Version: 5.25.5
KDE Frameworks Version: 5.99.0
Qt Version: 5.15.6
Kernel Version: 6.0.8-200.fc36.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 16 × 12th Gen Intel® Core™ i5-1240P
Memory: 15.3 GiB of RAM
Graphics Processor: Mesa Intel® Graphics
Manufacturer: Framework
Product Name: Laptop (12th Gen Intel Core)
System Version: A4

Does anyone know of a way to prevent or reduce these freezes? They’ve become much more frequent for me since updating to fedora 36. I’ve had 2 soft freezes (laptop unfroze itself after a bit) and 2 hard freezes (I had to restart the laptop) happen completely at random just today. I think I’ve had more random freezes in the week since I updated to fedora 36 than in ~2 months when I was using fedora 35.

EDIT to add additional information: I found a Reddit thread that recommended checking the kernel log using the following command.

journalctl -k -b -1 -n 10000 >~/kernel-log-$(date -Iseconds).txt

When I ran the command, there was a section with timestamps matching when my laptop most recently froze, and it looks like it might be related to the freeze. I don’t have experience with these logs, so I’m not sure if the “stopped heartbeat” is actually the freeze. If it is the freeze, is there any way to fix that?

Nov 16 22:02:47 fedora kernel: Asynchronous wait on fence 0000:00:02.0:kwin_wayland[1873]:ac680 timed out (hint:intel_atomic_commit_ready [i915])
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:859ffffb, in chrome [20329]
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] chrome[20329] context reset due to GPU hang
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] HuC authenticated
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Nov 16 22:02:51 fedora kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
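To pull just these lines out of a long kernel log, a grep along these lines might save some scrolling. This is only a sketch: the patterns are taken from the messages quoted above and may miss other wording, and filter_gpu_hang is a hypothetical helper name:

```shell
# filter_gpu_hang is a hypothetical helper: it reads a kernel log on stdin
# and keeps only lines that look like the i915 hang messages quoted above.
# The pattern list is an assumption based on this thread, not exhaustive.
filter_gpu_hang() {
    grep -E 'GPU HANG|Asynchronous wait on fence|stopped heartbeat|context reset due to GPU hang'
}
```

For the previous boot that would be journalctl -k -b -1 | filter_gpu_hang. No output means none of these patterns appeared in that boot's kernel log.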

To confirm, have you tried @Aggraxis’s fix? It has stopped all of my and many others’ freezes.

Came back to echo, please do try i915.enable_psr=0

It will resolve most freezing issues.

@Nicholas_La_Roux @Matt_Hartley Yes, Aggraxis’s fix was the first thing I tried and unfortunately it did not work for me. Do you know of any other possible solutions?

Sorry, no, I do not. I’ve exclusively used i915.enable_psr=0.

We’ve reached out to a contact at Fedora today, we’ll keep at this.

I’m testing a video loop right now (long one), with Fedora 37 GNOME Settings open. Also this is the latest kernel in play as well for Fedora 37, 6.0.8.300.

Something I’ve picked up in this thread is that different people are seeing different GPU BUG entries in dmesg.
Admittedly I have no idea at this point what the ecode actually represents, but those hitting the specific GPU HANG: ecode 12:0:00000000 in dmesg appear to be fixable with the i915.enable_psr=0 method.

Scrolling back through this thread, I can see a few other ecodes where psr=0 isn’t fixing it, so maybe there are multiple or unrelated issues here?
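To make comparing ecodes between reports easier, something like this could list the distinct ones found in a log. It is a sketch that assumes the "GPU HANG: ecode a:b:c" layout seen in the logs posted in this thread; extract_ecodes is a made-up helper name:

```shell
# extract_ecodes is a hypothetical helper: it reads a kernel log on stdin
# and prints each distinct ecode once, sorted.
# Assumption: hang lines contain "GPU HANG: ecode <n>:<hex>:<hex>".
extract_ecodes() {
    grep -oE 'GPU HANG: ecode [0-9]+:[0-9a-fA-F]+:[0-9a-fA-F]+' \
        | sed 's/^GPU HANG: ecode //' \
        | sort -u
}
```

For example, journalctl -k -b -1 | extract_ecodes, then post the result alongside whether psr=0 helped.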

Hate to say it, but extensions have been known to contribute to graphics issues since the beginning of GNOME 3. In other words, give it a try with extensions off, or at least with only the ones installed by default, and see if there are any improvements. Another thing I am curious about is whether the people experiencing additional issues are on Wayland or Xorg.

Both of these are absolutely fair and correct points. When in doubt, always disable extensions (extra ones installed at least) and then try Xorg as an alternative for testing.
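For anyone unsure which session they are in, the session type is normally exposed in the XDG_SESSION_TYPE environment variable. A tiny sketch; session_type is a hypothetical helper name, and it assumes the variable is set by the login session:

```shell
# session_type is a hypothetical helper: it prints the session type
# ("wayland" or "x11" on a typical desktop), or "unknown" when the
# XDG_SESSION_TYPE variable is unset or empty.
session_type() {
    printf '%s\n' "${XDG_SESSION_TYPE:-unknown}"
}
```

On GNOME, gnome-extensions list --enabled shows which extensions are active, and gsettings set org.gnome.shell disable-user-extensions true should switch the user-installed ones off for a test run (set it back to false afterwards).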

I’d like to add my two cents on this one as well. I just installed Fedora 37 on my 12th gen Framework a few days ago, and have been having hard lock-ups. I’m able to reproduce the problem quite easily by simply opening up Chrome, and scrolling through twitter for 10-15 minutes. If I don’t do this the system is quite stable. I was able to pick out the bits of the journalctl log that happened right as the system locked up.

Nov 19 12:20:57 leeron kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[1922]:f706 timed out (hint:intel_atomic_commit_ready [i915])
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:859ffffb, in chrome [3333]
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] chrome[3333] context reset due to GPU hang
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
Nov 19 12:21:01 leeron google-chrome.desktop[3215]: [3333:3333:1119/122101.947418:ERROR:vulkan_swap_chain.cc(403)] vkQueuePresentKHR() failed: -4
Nov 19 12:21:01 leeron google-chrome.desktop[3215]: [3333:3333:1119/122101.948338:ERROR:gpu_service_impl.cc(975)] Exiting GPU process because some >
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] HuC authenticated
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Nov 19 12:21:01 leeron kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
Nov 19 12:21:01 leeron google-chrome.desktop[3215]: [3210:3210:1119/122101.971684:ERROR:gpu_process_host.cc(971)] GPU process exited unexpectedly: >
Nov 19 12:21:02 leeron google-chrome.desktop[3215]: libva error: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
Nov 19 12:21:12 leeron kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[1922]:f708 timed out (hint:intel_atomic_commit_ready [i915])
Nov 19 12:21:14 leeron google-chrome.desktop[3215]: [4326:4331:1119/122114.222251:ERROR:vulkan_swap_chain.cc(442)] vkAcquireNextImageKHR() hangs.
Nov 19 12:21:14 leeron google-chrome.desktop[3215]: [4326:4326:1119/122114.223311:ERROR:gpu_service_impl.cc(975)] Exiting GPU process because some >
Nov 19 12:21:14 leeron google-chrome.desktop[3215]: [3210:3210:1119/122114.238021:ERROR:gpu_process_host.cc(971)] GPU process exited unexpectedly: >
Nov 19 12:21:14 leeron google-chrome.desktop[3215]: libva error: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
Nov 19 12:21:18 leeron google-chrome.desktop[3215]: [4352:4357:1119/122118.256793:ERROR:vulkan_swap_chain.cc(442)] vkAcquireNextImageKHR() hangs.
Nov 19 12:21:18 leeron google-chrome.desktop[3215]: [4352:4352:1119/122118.257591:ERROR:gpu_service_impl.cc(975)] Exiting GPU process because some >
Nov 19 12:21:18 leeron google-chrome.desktop[3215]: [3210:3210:1119/122118.272009:ERROR:gpu_process_host.cc(971)] GPU process exited unexpectedly: >
Nov 19 12:21:18 leeron google-chrome.desktop[3215]: libva error: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
Nov 19 12:21:18 leeron google-chrome.desktop[3215]: [4380:4380:1119/122118.436316:ERROR:gl_surface_egl.cc(479)] eglCreateWindowSurface failed with >
Nov 19 12:21:32 leeron wpa_supplicant[1214]: wlp166s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-45 noise=9999 txrate=300000
Nov 19 12:21:36 leeron gnome-shell[1922]: libinput error: event8  - PIXA3854:00 093A:0274 Touchpad: kernel bug: Touch jump detected and discarded.
Nov 19 12:21:36 leeron gnome-shell[1922]: See https://wayland.freedesktop.org/libinput/doc/1.21.0/touchpad-jumping-cursors.html for details
Nov 19 12:21:52 leeron systemd-logind[885]: Power key pressed short.

I already added i915.enable_psr=0 to grub, and the issue persists.
This is on stock fedora 37, with only a few gnome extensions installed, fractional scaling enabled (150%), and I assume it’s on wayland (as I did not select Gnome Xorg on the log in screen, just normal Gnome).

I wonder if others have encountered this on other distros?

@Matthew_Mills looks like it might be a libva-related issue. Do you have the GPU acceleration stuff installed?
IIRC it’s: enable the RPM Fusion repos, then sudo dnf install libva-intel-hybrid-driver libva-utils. Obviously a reboot would be wise. Then check vainfo, which should show many VLD and EncSlice lines to indicate hardware encoding/decoding is available.
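A quick way to check the vainfo output without reading every line could be to count the relevant entries. This is a sketch assuming vainfo prints entrypoints as VAEntrypointVLD and VAEntrypointEncSlice; count_va_entrypoints is a made-up helper name:

```shell
# count_va_entrypoints is a hypothetical helper: it reads vainfo output on
# stdin and prints how many lines mention a VLD or EncSlice entrypoint.
# Assumption: entrypoints appear as VAEntrypointVLD / VAEntrypointEncSlice.
count_va_entrypoints() {
    grep -cE 'VAEntrypoint(VLD|EncSlice)'
}
```

Used as vainfo 2>/dev/null | count_va_entrypoints. A count of 0 would suggest the driver did not load; any larger number only shows that entrypoints are advertised, not that Chrome is actually using them.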

Might be worth disabling Chrome’s hardware acceleration too for testing.

I’m using Firefox without any issues. I’ve next to no experience with Chrome on Linux/Fedora.
Firefox Hardware acceleration - Fedora Project Wiki might help, as there are parts that aren’t Firefox-specific.

I am using Firefox as well without issue. FYI Firefox also comes with hardware acceleration enabled by default.

Seems I also needed libva-intel-driver and intel-media-driver to get vainfo to pass, but it is showing all those VLD, EncSlice lines now. I will do some further testing to see if this has resolved the issue.

Thanks for the quick suggestions!

Update:
Unfortunately the same issue persists, with the same errors showing up in the log. A hard lockup while using chrome. I’m going to simply swap to firefox for a while, just to see if I encounter the same problem.