Thanks for the speedy reply. Browsing the git tree for both my current kernel and the upcoming 6.0-rc4, it seems they’re still using the same microcode definitions for tgl and adlp, so tgl_huc_7.9.3.bin is the correct module.
Variables eliminated so far:
Removing the wrong userspace driver (xf86-video-intel on Arch, xorg-x11-drv-intel on Fedora) does not solve the freeze (thanks @real_or_random for the additional data point)
No combination of tweaks to the following kernel parameters solves the freeze:
The crash almost always seems to be triggered by the gnome-control-center app, or less commonly some settings dialog within Gnome running on Wayland, while another application is either playing music or using xwayland.
I don’t think so. Arch has had the relevant firmware files for a while already.
I still think that it’s best to report this to the intel-gfx kernel maintainers, but probably it should be done by someone who can reproduce the problem often and is willing to run a vanilla kernel, see Reporting issues — The Linux Kernel documentation
I have experienced this for the last few days on Fedora 36 and I am still seeing if I can get any logs. I experience a hard lockup, and audio stops as well as video. After rebooting I don’t see any unusual activity in the previous boot logs using journalctl --boot -1
Kernel:
5.19.6-200.fc36.x86_64
However, it always seems to happen to me when I am in a Google Meet video call.
I too am having freezing issues with my newly upgraded 12th Gen i7-1260P with 16GB and a 2TB WD SN750. I noticed a comment from @Paul_Sorensen that mirrors the same problem I am having now. I also am using Windows 11 and my laptop just randomly freezes and then shuts off. I have run multiple tests on the hardware and it never seems to freeze during the testing. I have completely wiped and reinstalled the OS, no luck. I’m hoping for a solution soon as the laptop is not reliable in its current condition. It was doing so well at first!!
I also ran memtest86 and it passed all tests. Finally, I swapped the RAM modules between my Framework and my wife’s, and hers was still freezing. Then I swapped her SSD with mine, and the machine running her SSD still froze. However, the machine running my SSD has twice powered off and rebooted while I wasn’t there to observe whether it had frozen. The Windows Event Viewer shows a Kernel-Power 41 error, which I have also seen logged when a machine boots back up after freezing. So I suspect my computer is doing it too, but I haven’t witnessed it on mine.
This occurred again for me, so I had another go at digging into it…
I was able to get output from journalctl and dmesg by piping output to a remote server over netcat; log output is similar to what was posted before (gnome-settings open, playing audio and manipulating touchpad settings eventually caused a crash here):
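For anyone who wants to capture logs the same way, here is a minimal sketch of the idea in Python rather than a shell pipe. The function names, and the host and port you pass in, are my own placeholders and not what was actually used above. It assumes something on the remote machine (netcat or similar) is already listening and writing whatever it receives to a file.

```python
import socket
import subprocess
from typing import Iterable


def forward_lines(lines: Iterable[bytes], conn: socket.socket) -> int:
    """Send each line over an already connected socket; return bytes sent."""
    sent = 0
    for line in lines:
        # sendall() hands the whole line to the kernel before returning,
        # so lines are pushed out one at a time instead of batched here.
        conn.sendall(line)
        sent += len(line)
    return sent


def stream_journal(host: str, port: int) -> int:
    """Follow the journal and forward it to host:port until interrupted.

    host and port are placeholders for the remote listener.
    """
    proc = subprocess.Popen(
        ["journalctl", "--follow", "--no-pager"],
        stdout=subprocess.PIPE,
    )
    try:
        with socket.create_connection((host, port)) as conn:
            return forward_lines(proc.stdout, conn)
    finally:
        proc.terminate()
```

Start `stream_journal(...)` before trying to trigger the freeze. Since the network stays up during the lockup, the last lines before the hang should make it to the remote side even if they never reach the local disk.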
The network stays up, so enabling sshd beforehand allows access from another system. Shelling in and sending SIGKILL to the gnome-shell process kicks the desktop back to the login prompt, without the need to hard power cycle. HTH
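If you would rather script the recovery step, something along these lines could be run from the ssh session. It is only a sketch: the helper names are made up for illustration, it is Linux only because it reads /proc, and it assumes the process shows up as gnome-shell in /proc/<pid>/comm. It does the same thing as finding the PID by hand and sending SIGKILL.

```python
import os
import signal
from pathlib import Path


def pids_named(name: str) -> list[int]:
    """Return PIDs whose /proc/<pid>/comm equals name (Linux only)."""
    found = []
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited while we were scanning
        if comm == name:
            found.append(int(entry.name))
    return found


def kill_pids(pids: list[int]) -> list[int]:
    """Send SIGKILL to each PID; return the ones that were signalled."""
    killed = []
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
            killed.append(pid)
        except (ProcessLookupError, PermissionError):
            pass  # already gone, or owned by another user
    return killed
```

Calling `kill_pids(pids_named("gnome-shell"))` as the affected user (or root) should drop the session back to the login prompt, as described above. Anything unsaved in that session is lost, but it avoids the hard power cycle.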
I’m bummed and throwing in the towel. My small business needs working systems. This really sucks and I’m sad – I had high hopes for the Framework because I love the DIY and repairability promise offsetting some of the initial capex. Even with a new mainboard sent from Framework for one of our laptops that has persistent freezing, the freezing continues to worsen over time. My employees are complaining about thermal management being a major issue for them as well.
Brand new DIY i7-1260P, Arch install, Gnome 42.4, Wayland (no XWayland), Hynix P31 2TB, Crucial 2x16GB RAM (from the approved list). I’m also having lockups in gnome settings.
On a hunch I removed my DP card and have now gone almost 24 hours without a freeze/shutdown, whereas before the error would occur every 3-5 hours. I’m still holding my breath on this, though. BTW, removing the DP card also seems to have corrected a lag I was having in the UEFI/BIOS, where I would get pauses while scrolling through the menus.
Can anyone else check if removing their DP or HDMI card would make a difference on the stability of their laptop? Thanks. @Paul_Sorensen
This may be it! My wife has a DisplayPort card in hers and I don’t in mine; that’s the only difference between our laptops. I’ll try swapping them and see if mine starts having the freezing behavior.
On the point above, I have an HDMI card in mine and have so far only experienced a single freeze. That freeze occurred while in GNOME settings within the first hour or so after Fedora installation. If I am able to reproduce the freeze I will try without the HDMI card installed to see if that makes any difference.
FWIW, I have neither DP nor HDMI cards installed and have not had a freeze since I installed Fedora over a week ago. Sounds like it might have some merit.
Framework saying “We’re ready for you.” Linux saying “Not quite yet.”
So this is really an annual event where Linux plays catch up with the hardware…as always for the past 2 decades.
Something needs to change in the working relationship between chip makers / designers and kernel developers.
Mind you, with that notion, it seems to say “If it’s software, it’s not Framework’s issue”…then where does one draw the line when “[hardware] is optimized for [software]”? Because it seems to say, if it’s not working, it’s a software issue. From a general consumer’s perspective…I’m not sure if that’s clear.
You know what? Software has bugs. Complex software has more bugs. Software that interacts with the real world via hardware is even more complex and has even more bugs. Software like the Linux kernel that is supposed to run on every possible combination of every possible piece of hardware is one of the most complex pieces of software on the planet, and it sees a lot of bugs and regressions. But it does admirably well, given how broad its applications are.
So yes, Intel integrated GPU drivers for a newer chipset have problems on Linux, again. I’ve been running Linux-based systems with Intel iGPUs as my sole desktop and server OS for about 25 years, and I’ve seen regressions like this show up a number of times before. But I can assure you it’s not an annual event.
Would you rather Framework have a disclaimer about possible software regressions on their web site under that banner? What about one for kernel subsystem maintainers getting hit by a bus? How about a warning that your Debian stable may be out of date, your Arch could push an update that nukes your filesystem, or your Gentoo CFLAGS may be cooked? Anything else?
Intel already maintains the driver for their iGPUs. Maybe you should go file a bug report with them rather than complaining about it on a forum? It might actually get fixed that way.