[TRACKING] Hard freezing on Fedora 36 with the new 12th gen system

I think I may have just experienced the same thing on F37 using VSCode after booting kernel 6.0.8 for the first time (after 6.0.7):

Nov 19 15:44:05 tinframe kernel: Fence expiration time out i915-0000:00:02.0:Xwayland[3459]:11ac!
Nov 19 15:44:07 tinframe systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Nov 19 15:44:07 tinframe audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=syste>
Nov 19 15:44:07 tinframe audit: BPF prog-id=0 op=UNLOAD
Nov 19 15:44:07 tinframe audit: BPF prog-id=0 op=UNLOAD
Nov 19 15:44:07 tinframe audit: BPF prog-id=0 op=UNLOAD
Nov 19 15:44:15 tinframe kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[2434]:85b2 timed>
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in Xwayland >
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] Xwayland[3459] context reset due to GPU hang
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin versi>
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version>
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] HuC authenticated
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Nov 19 15:44:20 tinframe kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled

System just sat frozen in place for like 5s. Man, I had just finished spending the day setting up Fedora and I see I may have been naive thinking I wouldn't run into bleeding edge kernel issues. Fedora seems nice but I might reconsider and take a more stable road now.

In Linux, when you have a fairly new processor, there is rarely any such thing as stable. The fastest way to stability is through a frequently updated distro where new kernels are pushed. If this one is causing issues, simply boot the last kernel that was not giving you an issue. My guess is 6.0.9 will be landing in Fedora this week, and the changelog has a ton of PSR-related fixes.
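For anyone unsure how to fall back to an older kernel, here is a rough sketch. The helper name previous_kernel is made up for illustration, and it relies on GNU sort -V, which only approximates RPM's own version ordering. It prints the installed version that sorts just below the one you pass in; whether that kernel is actually the "last good" one is still your call.

```shell
# previous_kernel CURRENT
# Reads installed kernel versions on stdin (one per line) and prints the
# version that sorts immediately below CURRENT. Prints nothing if CURRENT
# is the oldest entry or is not in the list.
# Assumption: GNU `sort -V` ordering is close enough to RPM's ordering.
previous_kernel() {
    current="$1"
    sort -V -u | awk -v cur="$current" '
        $0 == cur { if (prev != "") print prev; exit }
        { prev = $0 }
    '
}

# Possible use on Fedora (assumes rpm and grubby are installed):
#   prev=$(rpm -q kernel-core --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' \
#          | previous_kernel "$(uname -r)")
#   sudo grubby --set-default "/boot/vmlinuz-$prev"
```

Picking the older entry from the GRUB menu at boot does the same thing for a single boot, without changing the default.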

2 Likes

You can also test that 6.0.9 kernel now if you are brave with:

sudo dnf --enablerepo=updates-testing update kernel

Of course, you shouldn't do that unless you know what you are doing, and are ready for things to possibly get worse.

I would echo aiming for 6.0.9, but please avoid the testing repo. As mentioned previously, it may contribute to additional issues.

For the sake of reproducing this: everyone seeing this is on 12th gen, correct?

Also, am I correct that Chrome and VS Code (based on Chromium) appear to be our common thread here?

This gives me an opportunity to replicate.

Thanks

All of this is very reminiscent of this: GPU hang on transition to idle (#673) · Issues · drm / intel · GitLab. Some of this may very well be a regression.

1 Like

@Matt_Hartley Correct. I'm on a 12th gen system, and have only encountered the issue when using Chrome thus far.

1 Like

I have no idea what is up here, but I thought that I should give you some notes on my Kubuntu 22.04 system that doesn't show the issue with i915.enable_psr=0. I use VSCode, Brave and/or Chrome all day and I've never seen this issue.

  1. I'm running X11, not Wayland.
  2. I'm running kernel 5.19.17.
  3. vainfo:
libva info: VA-API version 1.14.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_14
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.14 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.3.1 ()
vainfo: Supported profile and entrypoints
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointStats
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSliceLP
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointEncSliceLP
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointEncSliceLP
      VAProfileVP9Profile1            : VAEntrypointVLD
      VAProfileVP9Profile1            : VAEntrypointEncSliceLP
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointEncSliceLP
      VAProfileVP9Profile3            : VAEntrypointVLD
      VAProfileVP9Profile3            : VAEntrypointEncSliceLP
      VAProfileHEVCMain12             : VAEntrypointVLD
      VAProfileHEVCMain422_10         : VAEntrypointVLD
      VAProfileHEVCMain422_12         : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointEncSliceLP
      VAProfileHEVCMain444_10         : VAEntrypointVLD
      VAProfileHEVCMain444_10         : VAEntrypointEncSliceLP
      VAProfileHEVCMain444_12         : VAEntrypointVLD
      VAProfileHEVCSccMain            : VAEntrypointVLD
      VAProfileHEVCSccMain            : VAEntrypointEncSliceLP
      VAProfileHEVCSccMain10          : VAEntrypointVLD
      VAProfileHEVCSccMain10          : VAEntrypointEncSliceLP
      VAProfileHEVCSccMain444         : VAEntrypointVLD
      VAProfileHEVCSccMain444         : VAEntrypointEncSliceLP
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileHEVCSccMain444_10      : VAEntrypointVLD
      VAProfileHEVCSccMain444_10      : VAEntrypointEncSliceLP
1 Like

there might be more than one issue in this thread.
when i encountered it (where psr=0 fixes it), i don't use / hadn't started those apps
firefox user; chrom{e|ium} not installed.
Yes to vscode, using the flatpak version, but i've had more crashes than the number of times vscode has been started.

i was specifically getting that ecode 12:0:00000000 (or however many 0's it is; 12+all zeroes!). most commonly seen after resume from sleep. i'd sometimes get it an hour or 2 into using the laptop.

i've nothing important on this laptop and bodhi seems to be suggesting it's stable... i'll give 6.0.9 a whirl now...

This, exactly. Some early thoughts.

  • It appears to be an issue with Chrome for some users, but not others. @PDXTabs gave an outstanding account of their setup. To help further, please follow their example:

  • On Ubuntu, sudo apt install vainfo (this will vary on other distros), then run

vainfo

and share the output.

  • X11 or Wayland?

  • Kernel version and distro/version.
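If it helps, the items above can be gathered in one go. This is only a sketch: report_setup is a made-up helper name, it assumes your login session sets XDG_SESSION_TYPE (most systemd-based desktops do), and it reads the distro name from /etc/os-release.

```shell
# report_setup: print session type, kernel, distro and vainfo output
# in a form that can be pasted into a reply.
report_setup() {
    # XDG_SESSION_TYPE is usually "wayland" or "x11"; it may be unset
    # when run from a bare TTY or over ssh.
    echo "Session type: ${XDG_SESSION_TYPE:-unknown}"
    echo "Kernel: $(uname -r)"
    if [ -r /etc/os-release ]; then
        # Source in a subshell so the os-release variables do not leak
        # into the calling shell.
        ( . /etc/os-release; echo "Distro: ${PRETTY_NAME:-unknown}" )
    else
        echo "Distro: unknown"
    fi
    if command -v vainfo >/dev/null 2>&1; then
        vainfo 2>&1 || echo "vainfo exited with an error"
    else
        echo "vainfo: not installed"
    fi
}

report_setup
```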

Thanks

F37 running the 6.0.9 from the testing repo (6.0.9 is now marked as stable on Fedora) has been working well so far. psr=0 has been removed from kernel args. While it's still early days, multiple resume-from-sleep cycles haven't triggered a crash. They were frequent enough for me that i should've had a few by now.

If i see any further issues i'll report back with the requested info.
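For anyone else toggling the workaround, a small sketch for checking whether i915.enable_psr=0 is on the running kernel's command line. psr_disabled is a hypothetical helper name; the grubby lines in the comments are Fedora-specific and are left commented out so nothing gets changed by accident.

```shell
# psr_disabled CMDLINE
# Succeeds (exit 0) if CMDLINE contains i915.enable_psr=0 as a whole,
# space-separated argument; fails otherwise.
psr_disabled() {
    case " $1 " in
        *" i915.enable_psr=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if psr_disabled "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "PSR workaround is active on this boot"
else
    echo "PSR workaround is not set on this boot"
fi

# To add or remove the argument on Fedora (assumes grubby is installed):
#   sudo grubby --update-kernel=ALL --args="i915.enable_psr=0"
#   sudo grubby --update-kernel=ALL --remove-args="i915.enable_psr=0"
```

A change made with grubby only takes effect after a reboot, so re-run the check afterwards.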

2 Likes

Please do. Working on my Fedora 37 installation as I type this. Did some suspend testing last night with excellent results.

still good here after 24h.

although i am seeing a few new issues. No obvious symptoms in usability with these appearing.

$ dmesg -T | grep -i "drm] \*error\*"
[Tue Nov 22 16:31:40 2022] i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,
[Wed Nov 23 14:02:24 2022] i915 0000:00:02.0: [drm] *ERROR* [ENCODER:275:DDI TC4/PHY TC4][DPRX] Failed to enable link training

both hinting towards an en/decoder problem which i've not seen before.

1 Like

Definitely keep an eye on it. Thanks

6.0.9 is now generally available. Just upgraded to it via GNOME Software and removed the psr=0 config. Testing now. So far so good.

EDIT: After several hours of Chrome, VS Code, Spotify, Steam, a few suspends, and a clamshell session with peripherals over Thunderbolt, I still haven't hit a single freeze. Feels like 6.0.9 may just be the fix we've been hoping for.

EDIT EDIT: Surprisingly, despite several hours of varied usage earlier, I jumped back on and am experiencing the freezes again. This issue is less predictable than I previously thought. :frowning:

@Nicholas_La_Roux my recommendation is to upgrade to Fedora 37. The kernel does resolve a bunch of items that were cropping up; however, based on what I have seen, the freezes are also related to XWayland, and possibly GTK4, and how different calls are being made. The kernel is a step in the right direction, but without those underlying items also receiving the latest updates I don't think you are going to see the full benefit.

1 Like

I see, and thank you, but I'm already on Fedora 37 and have been for about 2 months at this point. :disappointed:

1 Like

After testing a bit with kernel 6.0.9, the PSR fixes do improve/eliminate stuttering and major frame paint delays, but the i915 GPU hangs are completely unresolved.

To add some extra context, the hard freezes may not be linked just to Chromium-based applications. I have played a few Steam games (Barotrauma, Parkitect, Deep Rock Galactic) and all of them have triggered lockups (with Parkitect confirmed to still lock up on kernel 6.0.9). These apps (at least with Steam overlay enabled), together with background apps (namely Firefox, sometimes Blender, sometimes VSCodium), have been able to send the system into a post-i915 hang state.

Occasionally, depending on what crashes and if DRM can reset the display properly, the system might still be functional. In most cases though, the following will occur:

i915 hangs

kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:86cdffff, in Parkitect.x86_6 [4828]

followed by DRM failing to reset the GPU

kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to reset chip
kernel: i915 0000:00:02.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_reset+0x23a/0x2a0 [i915]
kernel: [drm:__uc_sanitize [i915]] *ERROR* Failed to reset GuC, ret = -110

At which point, the system is left in various states depending on which apps end up crashing. Namely, I have noticed that if SDDM crashes, input control is regained and the system TTY sessions may be used, but graphical Xorg or Wayland sessions cannot be recovered/restarted during that boot.

If SDDM doesn't crash, the system will outright keep hold of all input devices, and while things like pipewire still function, that boot of the system is left locked up without any remote management available (ssh) to forcibly kill any stuck processes.
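Since different ecode values keep coming up in this thread, here is a rough sketch for tallying them from the kernel log. tally_gpu_hangs is a made-up helper name, and the pattern only matches lines shaped like the "GPU HANG: ecode ..." messages quoted here, so anything worded differently is ignored.

```shell
# tally_gpu_hangs: read kernel log lines on stdin and print
# "<count> <ecode>" for each distinct GPU HANG ecode, most frequent first.
tally_gpu_hangs() {
    grep -oE 'GPU HANG: ecode [0-9a-f]+:[0-9a-f]+:[0-9a-f]+' \
        | awk '{ print $NF }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example with the two hang lines quoted earlier in this thread:
printf '%s\n' \
    'kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:86cdffff, in Parkitect.x86_6 [4828]' \
    'kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in Xwayland >' \
    | tally_gpu_hangs
```

On a live system, journalctl -k -b | tally_gpu_hangs covers the current boot, and journalctl -k -b -1 covers the previous one, which is handy after a hard freeze forced a reboot (this assumes the journal is kept across reboots).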

System information:
Framework 12th gen i5 1240p
Fedora 37
KDE Desktop (Wayland) with xwayland support enabled
SDDM (Xorg)
kernel 6.0.9-300.fc37.x86_64
special boot parameters: module_blacklist=hid_sensor_hub

2 Likes

I arrived here from Google. I have a Dell XPS 15 12th gen, upgraded to Fedora 37 and updated to the 6.0.9-300 kernel, and still experience random freezes on Wayland with the Intel driver. Definitely not just a Framework issue. Bummer it's still not fixed in the newest kernel. I can usually trigger it in GNOME Settings after a while, or with Firefox + GNOME Settings. I do not have anything Chromium- or Chrome-based installed (no VS Code, Spotify, or anything Electron at the moment), so I don't think it's that either.

1 Like

Update 29/11/2022.
Happened again at gnome settings. Fedora 37 with 6.0.9-200.fc36.x86_64 kernel.

1 Like

STILL good for me with F37 KDE & no psr set in kernel args. Not a single issue with GPU BUG ecode 12:0:00000000 and using the laptop daily for multiple hours.
Without a way to reproduce on my hardware to confirm, it seems like there's something going on with the chrom{e|ium} libraries, or something gnome-specific, which is also generating those different ecode values.

Gaming; flatpak Steam version, Quake1 was faultlessly reliable
non-gaming; general firefox browsing including youtube, browser-based emby playback, VLC, flatpak freecad & superslicer all running well.