[TRACKING] Hard freezing on Fedora 36 with the new 12th gen system

Interesting. Why is it trying to connect to Touchegg? Isn’t Touchegg Xorg only?

@nadb Oh you can ignore touchegg, I had it installed cause I was testing X11 session to see if it behaved better, and I guess I didn’t fully disable its backend before starting this Wayland session. I got the exact same crash before installing touchegg, so I highly doubt that’s part of the problem.

I got that error over and over for the entire couple hours prior to the crash in the log. The timing just looks suspicious :slight_smile:

Try it with the following?

Add “nomodeset” (all lowercase, kernel parameters are case-sensitive) to the GRUB cmdline to check if the Intel drivers are at fault.

Had this issue on Debian Bookworm, specifically when using Gnome Settings. I added psr=0 to GRUB_CMDLINE_LINUX_DEFAULT at /etc/default/grub then sudo update-grub and I haven’t experienced hard freezes ever since.
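For anyone following along: after editing /etc/default/grub, regenerating the config and rebooting, it can be worth confirming that the token actually made it onto the running kernel's command line. Here is a minimal sketch, assuming a Linux system that exposes /proc/cmdline. The name has_param is a hypothetical helper, not part of any tool. It only shows that the token is present on the command line, not that any driver acted on it.

```shell
# has_param CMDLINE PARAM
# Succeeds (exit 0) if PARAM appears as a whole, space-separated token in CMDLINE.
# (hypothetical helper name, for illustration only)
has_param() {
  cmdline=$1
  param=$2
  for word in $cmdline; do          # deliberate word splitting on whitespace
    [ "$word" = "$param" ] && return 0
  done
  return 1
}

# Read-only check against the running kernel, skipped where /proc/cmdline is absent.
if [ -r /proc/cmdline ]; then
  if has_param "$(cat /proc/cmdline)" "psr=0"; then
    echo "psr=0 is on the kernel command line"
  else
    echo "psr=0 is NOT on the kernel command line"
  fi
fi
```

Matching whole tokens rather than substrings avoids a false positive when a longer parameter merely ends in the same text.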

Just got the same crash in X11. This honestly is kind of a showstopper bug for me. Might have to switch to using Windows on this thing (I’ve tried Fedora Gnome and it was much worse than KDE, and I doubt changing distros is gonna fix anything, it’s the same underlying stuff)

Logs:

Feb 18 23:06:15 macaria kernel: Asynchronous wait on fence 0000:00:02.0:kwin_x11[2193]:b0f72 timed out (hint:intel_atomic_commit_ready [i915])
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:84dffffb, in CorporateClash. [16280]
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] CorporateClash.[16280] context reset due to GPU hang
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.bin version 70.5.1
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc.bin version 7.9.3
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] HuC authenticated
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Feb 18 23:06:18 macaria kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
Feb 18 23:06:18 macaria ksmserver[2986]: Crash Annotation GraphicsCriticalError: |[0][GFX1-]: glxtest: VA-API test failed: failed to initialise VAAPI connection. (t=0.521279) |[1][GFX>
Feb 18 23:06:18 macaria kwin_x11[2193]: kwin_scene_opengl: A graphics reset not attributable to the current GL context occurred.
Feb 18 23:06:18 macaria kwin_x11[2193]: OpenGL vendor string:                   Intel
Feb 18 23:06:18 macaria kwin_x11[2193]: OpenGL renderer string:                 Mesa Intel(R) Graphics (ADL GT2)
Feb 18 23:06:18 macaria kwin_x11[2193]: OpenGL version string:                  4.6 (Compatibility Profile) Mesa 22.3.5
Feb 18 23:06:18 macaria kwin_x11[2193]: OpenGL shading language version string: 4.60
Feb 18 23:06:18 macaria kwin_x11[2193]: Driver:                                 Intel
Feb 18 23:06:18 macaria kwin_x11[2193]: GPU class:                              Unknown
Feb 18 23:06:18 macaria kwin_x11[2193]: OpenGL version:                         4.6
Feb 18 23:06:18 macaria kwin_x11[2193]: GLSL version:                           4.60
Feb 18 23:06:18 macaria kwin_x11[2193]: Mesa version:                           22.3.5
Feb 18 23:06:18 macaria kwin_x11[2193]: X server version:                       1.20.14
Feb 18 23:06:18 macaria kwin_x11[2193]: Linux kernel version:                   6.1.11
Feb 18 23:06:18 macaria kwin_x11[2193]: Requires strict binding:                yes
Feb 18 23:06:18 macaria kwin_x11[2193]: GLSL shaders:                           yes
Feb 18 23:06:18 macaria kwin_x11[2193]: Texture NPOT support:                   yes
Feb 18 23:06:18 macaria kwin_x11[2193]: Virtual Machine:                        no
Feb 18 23:06:29 macaria kernel: Asynchronous wait on fence 0000:00:02.0:Xorg[1490]:62374a timed out (hint:intel_atomic_commit_ready [i915])

I can try that next just to try to troubleshoot this bug before I jump ship to Windows… I’m not sure what that would do here, though. Wouldn’t that only affect graphics while booting?

Edit: I’ve decided to give Fedora Gnome another try, now that the psr=0 tweak is so official. If I still get crashes of a similar nature, I can only assume it’s a bug in the Intel graphics drivers (a bug which appears to have maybe been seen in posts throughout the last 5 years, which is perplexing)

I would also upgrade to Fedora 37. No kernel parameters, no hard freezes since December 26th. Before that it was weekly.

This issue used to occur for me about every 30 minutes to an hour. I noticed that almost all of the GPU hangs I hit on the newest kernel and Mesa were related to gnome-control-center in journalctl. I found that switching from Gnome to Hyprland has resolved all issues for me (it has been about 2 weeks of heavy usage with no GPU hangs). Maybe Hyprland just isn’t triggering the issue in the same way Gnome and KDE are? In case this info is useful for y’all, here are my kernel info and Mesa version:

Kernel Version

6.1.12-arch1-1 #1 SMP PREEMPT_DYNAMIC x86_64 GNU/Linux

Kernel Parameters

root=PARTUUID=[Root UUID] zswap.enabled=0 rootflags=subvol=@ rw rootfstype=btrfs

Mesa version

22.3.5

This. This is the first thing to try. Then, if that doesn’t help, let us know. I’d lose the additional parameters for testing unless they came provided by default.

Yeah, all the crashes I’ve posted about have been on Fedora 37. Granted, it was all under the KDE spin, so maybe this is somehow a problem with KDE’s window manager… Stinks cuz I much prefer KDE. But I can live with Gnome over Windows

Fedora tends to focus more on GNOME as it’s their official environment.

Now on stock latest Fedora 37 (gnome)

Getting some hangs that last 2-5 seconds, with these logs:

Feb 24 21:19:03 macaria kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[1899]:46dfe timed out (hint:intel_atomic_commit_ready [i915])
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.bin version 70.5.1
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc.bin version 7.9.3
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] HuC authenticated
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Feb 24 21:19:07 macaria kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled

Some of them don’t include the i915 message, but they all show GPU HANG. I wasn’t even playing any games this time, just running apps and streaming a remote desktop. Something’s definitely still up.
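To compare how often this happens between configurations, here is a minimal sketch for tallying the hang events from the kernel log. It assumes the message format shown in the logs above ("GPU HANG: ecode ..."). The names count_gpu_hangs and list_hang_ecodes are hypothetical helpers, not existing tools.

```shell
# count_gpu_hangs: read log text on stdin, print how many lines contain "GPU HANG".
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_gpu_hangs() {
  grep -c 'GPU HANG' || true
}

# list_hang_ecodes: read log text on stdin, print each distinct ecode once, sorted.
list_hang_ecodes() {
  sed -n 's/.*GPU HANG: ecode \([0-9a-f:]*\).*/\1/p' | sort -u
}

# Typical use on a systemd machine (kernel messages, current boot):
#   journalctl -k -b --no-pager | count_gpu_hangs
#   journalctl -k -b --no-pager | list_hang_ecodes
# Use "-b -1" instead of "-b" to look at the previous boot, e.g. after a hard freeze.
```

Counting per boot gives a rough number to compare before and after changing a kernel parameter, which is easier to reason about than impressions of "more" or "fewer" hangs.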

[gabe@macaria ~]$ cat /etc/default/grub 
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rhgb quiet nvme.noacpi=1 psr=0 module_blacklist=hid_sensor_hub"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

I have psr=0 in my grub, is this maybe causing the problem now?

That is more likely causing the issue.

@nadb Sure, I can remove that. Got it from the Framework Fedora 37 install recs (“Fedora 37 Installation on the Framework Laptop”, Framework Guides).

Considering it mentions that flag is a way to save energy with NVMe drives, it’s definitely possible it’s causing hangs. As great as power saving stuff is, it can also cause various kinds of issues, especially hangs such as with WiFi and graphics.

# Improve power saving for NVMe drives:
sudo grubby --update-kernel=ALL --args="nvme.noacpi=1"

Anecdotally, I have the psr=0 flag but not the nvme.noacpi flag, and I never hit hangs these days. (kernel: 6.1.13)

New GPU HANG just dropped

Feb 25 19:23:55 macaria kernel: Asynchronous wait on fence 0000:00:02.0:Xwayland[2975]:92132 timed out (hint:intel_atomic_commit_ready [i915])
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.bin version 70.5.1
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc.bin version 7.9.3
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] HuC authenticated
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Feb 25 19:23:58 macaria kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled

[gabe@macaria ~]$ cat /etc/default/grub 
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rhgb quiet psr=0 module_blacklist=hid_sensor_hub"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

Programs running:
Telegram, Firefox, NoMachine (as a client, streaming another computer to me). I doubt it has anything to do with the programs, but thought I’d mention just in case. Guess I’ll turn off psr=0 and try again, and maybe try the other kernel param variant that disables psr.

Appending to my previous post (390). This issue does not seem to be KDE-specific. I did a couple of days of testing while developing applications with Android Studio and QEMU VMs within pure Weston (with XWayland enabled), and can definitely confirm that the GPU hangs persist in Weston. Notably, there does seem to be a lower frequency of them (presumably due to the simple, barebones environment), and I have not had a hang actually lock up Weston permanently, unlike with kwin_wayland.

With Android Studio and IntelliJ IDEA, I notice that most GPU hangs occur when some kind of sub-window is being spawned (autocompletion, Alt+Enter actions, warnings, etc.). All of this was done with psr=0 set.

I do think that some proper time needs to be spent looking into this for more than only GNOME rather than dismissing this as a KDE-specific issue, given that the reference implementation for a Wayland compositor with only XWayland support enabled exhibits largely the same behavior.

Did you check the settings of the i915 module? I am not sure if this is the right way to set the module parameters.
The Arch wiki states: “If the module is built into the kernel, you can also pass options to the module using the kernel command line.”
Therefore I am not sure if psr=0 sets enable_psr in the i915 module.

For general reference, in case peeps haven’t seen this older post above, I always used @Aggraxis’s method up here and it worked flawlessly.

That is the method I used on Arch. On Arch you also have to regenerate initramfs and check that the config file is included. Things may be different on other distributions. My advice: check the module setting on a running system (sudo systool -v -m i915).
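As an alternative to systool, the values a loaded module is using are normally exposed as files under /sys/module/<name>/parameters. Here is a minimal sketch, assuming the i915 module is loaded and exposes enable_psr there. The name print_module_params is a hypothetical helper, not an existing command.

```shell
# print_module_params DIR
# Print "name=value" for every readable parameter file in DIR.
# (hypothetical helper name; DIR is normally /sys/module/<module>/parameters)
print_module_params() {
  dir=$1
  for f in "$dir"/*; do
    # Skip non-files and parameters that are not readable by the current user.
    [ -f "$f" ] && [ -r "$f" ] || continue
    printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
  done
}

# Read-only check on a live system, skipped when the i915 module is not loaded.
if [ -d /sys/module/i915/parameters ]; then
  print_module_params /sys/module/i915/parameters | grep psr || true
fi
```

If enable_psr does not show the value you expected, the setting probably did not reach the driver. On the kernel command line, module options are generally written as module.parameter, so the form to try would be i915.enable_psr=0 rather than a bare psr=0.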

As a general rule, start as basic as possible then slowly add to it.

  • So in your case, remove all added parameters. We include them in the guide because we have tested them as working and providing their intended benefits. But removing them and trying each application by itself allows us to track this down.

  • With the extra boot parameters removed, try one of those applications at a time. Freezing? Nope, add one more. Freezing? Yup? Then we have something to point to for further troubleshooting.
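The same one-at-a-time idea can also be applied to the boot parameters themselves. Below is a minimal sketch, assuming a Fedora-style system where grubby manages kernel arguments (the --args form appears earlier in this thread; --remove-args is its counterpart). The name plan_bisect is a hypothetical helper. It only prints the commands and does not run them, and the parameter list is the one from the /etc/default/grub posted above.

```shell
# plan_bisect PARAM...
# Print the grubby commands for a "strip everything, then add back one by one" test.
# Nothing is executed: run each printed command yourself, reboot, and use the
# machine for a while before moving on to the next one.
# (hypothetical helper name, for illustration only)
plan_bisect() {
  # Step 0: remove all of the extra parameters in one go.
  printf 'sudo grubby --update-kernel=ALL --remove-args="%s"\n' "$*"
  # Steps 1..N: re-add a single parameter per step.
  for p in "$@"; do
    printf 'sudo grubby --update-kernel=ALL --args="%s"\n' "$p"
  done
}

plan_bisect nvme.noacpi=1 psr=0 module_blacklist=hid_sensor_hub
```

Printing instead of executing keeps the sketch safe to paste, and leaves a written record of which step was active when a hang showed up.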
