[TRACKING] Hard freezing on Fedora 36 with the new 12th gen system

I see, that’s sad. Maybe uninstalling this just makes it much less likely to hit a problem deeper in the driver stack…

edit: Actually I can confirm this. I uninstalled xf86-video-intel after reading the thread here, and then saw the issue within hours (compared with just once in a few weeks with the package installed). I may try reinstalling it to see whether that actually makes things better here.

Since I haven’t had another crash yet, I wanted to add something: I actually managed to unfreeze the system with the sysrq reisub sequence, but I’m a bit confused as to what actually happened.
I first did the sequence without holding fn, because I have the function keys set to fx by default. When this didn’t do anything, I tried the sequence while holding fn, and after pressing prtscr+alt+fn+e (if I remember correctly), the laptop suddenly unfroze and loaded into the login screen. When I logged in, all applications were closed (as expected), but it had also closed things like the wifi service, which should normally be running after boot. Since I didn’t know which processes needed to be relaunched and my applications were closed anyway, I rebooted my system.

I just noticed that on these crash logs, this line keeps showing up:
kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
The name of the binary file tgl_huc_7.9.3.bin implies Tiger Lake, whereas the GuC line explicitly had “adlp” in the file name (adlp_guc_70.1.1.bin)… which makes me wonder if HuC is something that should actually be disabled for Alder Lake P.
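To compare the platform prefixes of the loaded firmware files at a glance, here is a minimal Python sketch. It assumes the kernel log lines look like the ones quoted in this thread; `firmware_platforms` is a hypothetical helper name, and the regular expression is only a guess at the general file naming pattern based on the two file names seen here.

```python
import re

# Matches lines such as:
#   i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
# The pattern is inferred from the two file names quoted in this thread.
FIRMWARE_RE = re.compile(
    r"\[drm\] (?P<kind>GuC|HuC) firmware "
    r"i915/(?P<platform>[a-z0-9]+)_(?:guc|huc)_[\d.]+\.bin"
)


def firmware_platforms(log_text):
    """Map each firmware kind (GuC/HuC) to the platform prefix of its file name."""
    found = {}
    for match in FIRMWARE_RE.finditer(log_text):
        found[match.group("kind")] = match.group("platform")
    return found
```

Feeding it the two lines from the crash log would show at once that GuC and HuC come from differently named platform families.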

The Arch wiki article for Intel graphics has this to say:


which makes me think it might be a potential suspect, or even the culprit in this case. I’m going to disable just HuC and see what happens.

I’m not sure why that would be causing the issue though, since I’m not using hardware HEVC decoding (and HuC is the HEVC “microcontroller”), but perhaps it hasn’t been updated to support ADLP? Or perhaps it didn’t need an update, because ADLP inherits its graphics from TGL and nothing has overtly changed about Iris Xe between the two generations?

I’m pretty sure this was not an accident:
https://patchwork.kernel.org/project/intel-gfx/patch/20210325180720.401410-38-matthew.d.roper@intel.com/

At this point, it may be best to reach out to i915 kernel folks.

Thanks for the speedy reply. Browsing the git tree for both my current kernel as well as the upcoming 6.0-rc4, it seems like they’re still using the same microcode definitions for tgl and adlp, so tgl_huc_7.9.3.bin is the correct module.

Variables eliminated so far:

  • removing wrong userspace driver (xf86-video-intel on Arch, xorg-x11-drv-intel on Fedora) does not solve the freeze (thanks @real_or_random for the additional data point)
  • No combination of tweaks to the following kernel parameters solves the freeze:
    • i915.enable_psr
    • i915.request_timeout_ms
    • nvme.noacpi
    • module_blacklist=hid_sensor_hub (including modprobe.d config variant)
  • Fractional scaling
  • Integer scaling
  • disabling HuC or GuC
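When comparing notes on kernel parameters, it helps to confirm which ones are actually active on the running system. The following is a minimal Python sketch that splits a kernel command line (for example the contents of `/proc/cmdline`) into names and values. The helper names `parse_cmdline` and `i915_params` are hypothetical, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Parameters without '=' map to None. Quoted values that contain
    spaces are not handled in this sketch.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def i915_params(cmdline):
    """Return only the parameters that start with 'i915.'."""
    return {
        name: value
        for name, value in parse_cmdline(cmdline).items()
        if name.startswith("i915.")
    }
```

On a live system one would read the text with `open("/proc/cmdline").read()` and pass it to `i915_params` to see which of the i915 options listed above were really applied at boot.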

The crash almost always seems to be triggered by the gnome-control-center app, or less commonly some settings dialog within Gnome running on Wayland, while another application is either playing music or using xwayland.

What makes you think that this patch in particular is relevant? I applied it to my kernel and it didn’t seem to fix the GPU hangs I’d encountered.

A whole bunch of firmware updates were just released.

linux-firmware: Firmware for Linux kernel drivers

==============================================================================================================================================================================================
 Package                                                 Architecture                         Version                                             Repository                             Size
==============================================================================================================================================================================================
Upgrading:
 iwl100-firmware                                         noarch                               39.31.5.1-138.fc36                                  updates                               140 k
 iwl1000-firmware                                        noarch                               1:39.31.5.1-138.fc36                                updates                               251 k
 iwl105-firmware                                         noarch                               18.168.6.1-138.fc36                                 updates                               219 k
 iwl135-firmware                                         noarch                               18.168.6.1-138.fc36                                 updates                               228 k
 iwl2000-firmware                                        noarch                               18.168.6.1-138.fc36                                 updates                               221 k
 iwl2030-firmware                                        noarch                               18.168.6.1-138.fc36                                 updates                               230 k
 iwl3160-firmware                                        noarch                               1:25.30.13.0-138.fc36                               updates                               992 k
 iwl3945-firmware                                        noarch                               15.32.2.9-138.fc36                                  updates                                81 k
 iwl4965-firmware                                        noarch                               228.61.2.24-138.fc36                                updates                                94 k
 iwl5000-firmware                                        noarch                               8.83.5.1_1-138.fc36                                 updates                               364 k
 iwl5150-firmware                                        noarch                               8.24.2.2-138.fc36                                   updates                               137 k
 iwl6000-firmware                                        noarch                               9.221.4.1-138.fc36                                  updates                               156 k
 iwl6000g2a-firmware                                     noarch                               18.168.6.1-138.fc36                                 updates                               336 k
 iwl6000g2b-firmware                                     noarch                               18.168.6.1-138.fc36                                 updates                               343 k
 iwl6050-firmware                                        noarch                               41.28.5.1-138.fc36                                  updates                               295 k
 iwl7260-firmware                                        noarch                               1:25.30.13.0-138.fc36                               updates                               9.5 M
 iwlax2xx-firmware                                       noarch                               20220815-138.fc36                                   updates                                45 M
 libertas-usb8388-firmware                               noarch                               2:20220815-138.fc36                                 updates                               105 k
 linux-firmware                                          noarch                               20220815-138.fc36                                   updates                               177 M
 linux-firmware-whence                                   noarch                               20220815-138.fc36                                   updates                                52 k
Installing weak dependencies:
 amd-gpu-firmware                                        noarch                               20220815-138.fc36                                   updates                                14 M
 intel-gpu-firmware                                      noarch                               20220815-138.fc36                                   updates                               7.1 M
 nvidia-gpu-firmware                                     noarch                               20220815-138.fc36                                   updates                               1.2 M

Transaction Summary
==============================================================================================================================================================================================
Install   3 Packages
Upgrade  20 Packages

Total download size: 258 M

Interested to see if they help at all, given my Framework 12th gen was just delivered.

I don’t think so. Arch has already had the relevant firmware files for a while.

I still think that it’s best to report this to the intel-gfx kernel maintainers, but probably it should be done by someone who can reproduce the problem often and is willing to run a vanilla kernel, see Reporting issues — The Linux Kernel documentation

I have experienced this over the last few days on Fedora 36 and I am still seeing if I can get any logs. I experience a hard lockup, and audio stops as well as video. After rebooting I don’t see any unusual activity in the previous boot’s logs using journalctl --boot -1
Kernel:
5.19.6-200.fc36.x86_64

However it seems to always happen to me when I am in a google meet video call.
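For anyone repeating the previous-boot check, here is a rough Python sketch that pulls the kernel messages from the previous boot and keeps only i915 lines that look hang-related. It assumes `journalctl` is available and that `--boot=-1 --dmesg --no-pager` selects the same boot as the command in this post; the keyword list is an assumption drawn from the messages quoted in this thread, and both function names are hypothetical. After a hard lockup the final messages may never reach disk, so an empty result does not rule out a GPU hang.

```python
import subprocess

# Substrings taken from the i915 messages quoted in this thread (an assumption,
# not an exhaustive list of what the driver can print).
KEYWORDS = ("GPU HANG", "timed out", "Resetting chip")


def filter_i915(lines):
    """Keep lines that mention i915 and at least one hang-related keyword."""
    return [
        line for line in lines
        if "i915" in line and any(key in line for key in KEYWORDS)
    ]


def previous_boot_i915_events():
    """Run journalctl for the previous boot's kernel log and filter it."""
    result = subprocess.run(
        ["journalctl", "--boot=-1", "--dmesg", "--no-pager"],
        capture_output=True, text=True, check=True,
    )
    return filter_i915(result.stdout.splitlines())
```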

I too am having freezing issues with my newly upgraded 12th Gen i7-1260P with 16GB and a 2TB WD SN750. I noticed a comment from @Paul_Sorensen that mirrors the same problem I am having now. I also am using Windows 11 and my laptop just randomly freezes and then shuts off. I have run multiple tests on the hardware and it never seems to freeze during the testing. I have completely wiped and reinstalled the OS, no luck. I’m hoping for a solution soon as the laptop is not reliable in its current condition. It was doing so well at first!! :grin:

I also ran memtest86 and it passed all tests. Finally, I swapped the RAM modules between my Framework and my wife’s, and hers was still freezing. Then I swapped her SSD with my SSD, and the machine running her SSD still froze. However, there have been two times when the machine running my SSD powered off and rebooted while I wasn’t there to observe whether it froze or not. I can see in the Windows Event Viewer that there was a Kernel-Power 41 error, which I have also seen logged when the machine boots back up after freezing. So I suspect my computer is doing it too, but I haven’t witnessed it on mine.

Guess I’ll hold off on the upgrade for now then.

I too really hope this isn’t as bad as it sounds. Going to be my only computer for the next month starting today.

That said, now that I have one, I can actually jump into debugging, seeing what I can find. Assuming I hit the same issues.

This occurred again for me, so I had another go at digging into it…

I was able to get output from journalctl and dmesg by piping output to a remote server over netcat; log output is similar to what was posted before (gnome-settings open, playing audio and manipulating touchpad settings eventually caused a crash here):

[ 1043.589794] Asynchronous wait on fence 0000:00:02.0:gnome-shell[2701]:4632 timed out (hint:intel_atomic_commit_ready [i915])
[ 1047.464971] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:0020fdfe, in gnome-control-c [5847]
[ 1047.465011] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
[ 1047.567662] i915 0000:00:02.0: [drm] ERROR rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[ 1047.568369] i915 0000:00:02.0: [drm] ERROR rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[ 1047.568453] i915 0000:00:02.0: [drm] gnome-control-c[5847] context reset due to GPU hang
[ 1047.568507] i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
[ 1047.568509] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
[ 1047.583819] i915 0000:00:02.0: [drm] HuC authenticated
[ 1047.584864] i915 0000:00:02.0: [drm] GuC submission enabled
[ 1047.584873] i915 0000:00:02.0: [drm] GuC SLPC enabled
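To collect data points like the one above from several crashes, the hang line can be picked apart mechanically. This is a minimal Python sketch that assumes the `GPU HANG` message keeps the format shown in this log; `parse_hang` is a hypothetical helper. Note that the process name is truncated by the kernel, which is why gnome-control-center shows up as `gnome-control-c`.

```python
import re

# Pattern modelled on the line quoted above:
#   [ 1047.464971] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:0020fdfe, in gnome-control-c [5847]
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*GPU HANG: ecode (?P<ecode>[^,\s]+), "
    r"in (?P<comm>.+?) \[(?P<pid>\d+)\]"
)


def parse_hang(line):
    """Return timestamp, error code, process name and PID, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    return {
        "timestamp": float(match.group("ts")),
        "ecode": match.group("ecode"),
        "comm": match.group("comm"),  # kernel-truncated process name
        "pid": int(match.group("pid")),
    }
```

Running this over logs from several reporters would make it easy to check whether the hang is always attributed to the same process.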

The network stays up, so enabling sshd beforehand allows access from another system. Shelling in and sending SIGKILL to the gnome-shell process kicks the desktop back to the login prompt, without the need to hard power cycle. HTH
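The recovery step described above can be scripted so it is ready to run over SSH when the desktop is frozen. This is a minimal Python sketch, assuming a Linux `/proc` layout where each process directory has a `comm` file; the function names are hypothetical, and killing gnome-shell ends the session, so any unsaved work is lost.

```python
import os
import signal


def pids_by_comm(name, proc_root="/proc"):
    """Return the PIDs whose /proc/<pid>/comm equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                if handle.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return sorted(pids)


def kill_by_comm(name, proc_root="/proc"):
    """Send SIGKILL to every process named `name` (needs sufficient privileges)."""
    for pid in pids_by_comm(name, proc_root):
        os.kill(pid, signal.SIGKILL)
```

From the remote shell, calling `kill_by_comm("gnome-shell")` should have the same effect as the manual SIGKILL described in this post.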

I’m bummed and throwing in the towel. My small business needs working systems. This really sucks and I’m sad – I had high hopes for the Framework because I love the DIY and repairability promise offsetting some of the initial capex. Even with a new mainboard sent from Framework for one of our units that has persistent freezing, the freezing continues to worsen over time. My employees are complaining about thermal management being a major issue for them as well.

This isn’t a Framework laptop issue, but a Linux kernel driver issue, and will be present on every 12th gen computer that uses the igpu.

If you don’t want to hit it, don’t use Linux on any 12th gen Intel system that doesn’t have a discrete GPU.

Brand new DIY i7-1260P, Arch install, Gnome 42.4, Wayland (no XWayland), Hynix P31 2TB, Crucial 2x16GB RAM (from the approved list). I’m also having lockups in gnome settings.

5.19.7-arch1-1, #1 SMP PREEMPT_DYNAMIC Mon, 05 Sep 2022 18:09:09 +0000

GRUB_CMDLINE_LINUX="cryptdevice=UUID=e1fb5806-1f0a-4edb-bbd4-855e2a6a4c2e:cryptroot:allow-discards root=/dev/mapper/cryptroot resume=UUID=967555c6-1617-4dd2-acd7-207a79a74dc5 resume_offset=192020480"

I saw the lockup when I had Plexamp AppImage installed and running. I was also in the TouchPad settings portion of the menu when this happened.

This may not apply to everyone’s system, since I have a Windows 11 setup. However, I found another thread about an issue with the DisplayPort and HDMI expansion cards causing excessive sleep power consumption: [Beta] DisplayPort Expansion Card firmware update to reduce system power consumption.

On a hunch I removed my DP card and have now gone almost 24 hours without a freeze/shutdown, whereas before the error would occur every 3-5 hours. I’m still holding my breath on this though. BTW, removing the DP card also seems to have corrected a lag I was having in the UEFI/BIOS, where I would get pauses while scrolling through the menus.

Can anyone else check if removing their DP or HDMI card would make a difference on the stability of their laptop? Thanks. @Paul_Sorensen

This may be it! My wife has a Displayport card in hers and I don’t in mine, that’s the only difference between our laptops. I’ll try swapping them and see if mine starts having the freezing behavior.

On the point above, I have an HDMI card in mine and have so far only experienced a single freeze. That freeze occurred while in GNOME settings within the first hour or so after Fedora installation. If I am able to reproduce the freeze I will try without the HDMI card installed to see if that makes any difference.
