So the last couple of kernel releases have seen a recurrence of the hard-freezing issues on the 12th Gen 1240P model on Arch Linux. I have tried disabling GuC and PSR without any major success. The linux-clear kernel seems the most stable, but it will still freeze any time the GPU comes under load.
I have tried the mainline kernel, the default kernel and the Linux-Clear Kernel and at the moment all appear to be having this issue with the freezes. I have even had to come back to this topic post several times due to freezing mid-type.
I am using linux-clear 6.1.1 on Arch Linux and I did notice some freezes when resuming from sleep.
More recent versions of linux-clear, like 6.1.7 and 6.1.10, seem to have a regression in the i915 driver that makes it impossible for me to change the backlight intensity, but maybe that is just because of my config.
Just thought I would leave this here in case it might be of some help.
I did notice a comment here suggesting there are issues with the i915 driver related to the fix for the temporary hangs that were present in earlier kernel versions. The question is whether it is tied to Mesa or to the kernel and, if so, when the regression occurred.
All I can say, the last two/three weeks have rendered my laptop almost useless and that has been deeply problematic.
I am using no mitigations and have not had a hard freeze since December 26th. Before that it was weekly. Keep updating and stay as current as possible. I am on Fedora 37, GNOME, i7-1260P, default Wayland.
I have tried psr=0, I have tried enable_dc=0, and I have tried fiddling with fbc and fastboot, and nothing seems to work.
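For anyone comparing notes, here is a minimal sketch for checking which i915 options the running kernel actually picked up. It assumes the driver exposes its parameters under /sys/module/i915/parameters (the usual sysfs layout for module parameters). The parameter names in I915_PARAMS are my assumption of what corresponds to the options mentioned above. Anything a given kernel does not expose is reported as missing rather than guessed.

```python
from pathlib import Path

# Assumed parameter names; adjust to whatever your kernel actually exposes.
I915_PARAMS = ("enable_psr", "enable_dc", "enable_fbc", "fastboot", "enable_guc")


def read_module_params(param_dir="/sys/module/i915/parameters", names=I915_PARAMS):
    """Return {name: value or None} for each requested module parameter."""
    base = Path(param_dir)
    values = {}
    for name in names:
        try:
            values[name] = (base / name).read_text().strip()
        except OSError:
            # The file is absent or unreadable on this kernel.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_module_params().items():
        print(f"{name} = {value if value is not None else '<not exposed>'}")
```

Comparing this output before and after editing the kernel command line or the modprobe config is a quick way to confirm that an option really took effect after a reboot.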
My personal feeling is that there is either a regression in the intel driver or in mesa. Like nadb, I had a period without any errors, but given that I am on Arch, I am always that much closer to the sun. I am likely seeing issues eventuate much more rapidly than others are.
Having looked through the commit logs of the kernel, I am pretty sure that something changed within either linux-firmware or mesa. The last two or three mesa updates have proven to be the most problematic, making the hard freezes occur more rapidly. But of course it is an absolute nightmare to try to revert a mesa update without breaking Xorg or Wayland.
Except Fedora updates almost as quickly as, and in some cases quicker than, Arch. It ain’t quite as bleeding edge as it once was. Most distros also don’t roll a pure mainline kernel either, so something baked into the Arch kernels may also be causing it, or something not baked in. Also, there are more items in play than just the kernel and drivers. I think the desktop environment’s compositor is to blame as well in some instances, along with some actual application-level issues, at least on Wayland. Especially if the application uses hardware acceleration and something is wrong with the config. Also, for reference, I am on kernel 6.1.10.
So, I am finding this on 6.1.11, 6.2-rc8, linux-clear, and the Arch LTS kernel. This is why I am pretty sure that the issue is not kernel based at this point. I have tested too many different kernels for it to be specific to any one of them. I have deliberately used linux-clear and mainline as controls to see if it was something limited to the Arch kernel.
My mesa version is 22.3.4-1. I tried reverting to an earlier mesa version, but ended up breaking the links between mesa and X11 and Wayland. That said, while in this broken state I had 0 hangs…
I am having issues on both! I am using both i3 and Sway. I wonder if I should try and remove all the options in my /etc/modprobe.d/i915.config and rebuild the UKIs.
@Anachron there is quite literally nothing in the logs. The last entries I am seeing are the x86/split lock detection warnings from Steam.
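In case it helps anyone else digging through logs after a hard freeze, this is a rough sketch for pulling the kernel messages from the previous boot and filtering them. It assumes systemd's journal is persistent (otherwise there is no previous boot to read) and that journalctl is on the PATH. The search terms in KEYWORDS are only my guess at what is worth looking for.

```python
import subprocess

# Assumed search terms; extend as needed.
KEYWORDS = ("i915", "gpu hang", "split lock")


def filter_lines(lines, keywords=KEYWORDS):
    """Keep lines containing any keyword, compared case-insensitively."""
    lowered = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]


def previous_boot_kernel_log():
    """Return kernel messages from the previous boot as a list of lines."""
    out = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return out.stdout.splitlines()


if __name__ == "__main__":
    for line in filter_lines(previous_boot_kernel_log()):
        print(line)
```

If the machine locks up hard enough, the last messages may never reach disk, so an empty result does not rule out a driver fault.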
These errors seem to become unrecoverable, especially when I am running at a higher CPU frequency. About the only reliable way I have found to make them less frequent is to run the computer with limited frequencies under the powersave governor.
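For reference, a small sketch for checking which governor and frequency cap each core is actually running with. It assumes the standard cpufreq sysfs layout under /sys/devices/system/cpu (cpuN/cpufreq/scaling_governor and scaling_max_freq, the latter in kHz). Cores without readable cpufreq files are simply skipped.

```python
from pathlib import Path


def read_cpufreq(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, max_freq_khz)} for CPUs exposing cpufreq."""
    result = {}
    for policy in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        try:
            governor = (policy / "scaling_governor").read_text().strip()
            max_khz = int((policy / "scaling_max_freq").read_text().strip())
        except (OSError, ValueError):
            # Skip cores whose cpufreq files are missing or malformed.
            continue
        result[policy.parent.name] = (governor, max_khz)
    return result


if __name__ == "__main__":
    for cpu, (governor, max_khz) in read_cpufreq().items():
        print(f"{cpu}: governor={governor} max={max_khz / 1_000_000:.2f} GHz")
```

Noting these values alongside each freeze would make it easier to tell whether the frequency cap really correlates with stability.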
Okay, now this is interesting. For the sake of troubleshooting, did this occur when connected to the module with a display or just simply having the module installed?
I have done some further refining: I removed a bunch of applications that reset the screen and stopped things that were running in XWayland, and I seem to have gotten a major increase in stability.
So I found that running Steam seems to be a pretty big risk factor, all things considered. XWayland applications themselves also seem to cause issues, but I am not sure which ones yet. Running every application I can in native Wayland mode seems to have been the best way to stay stable.
So my laptop froze while I was writing this. It was running Steam, but 35 seconds before it froze the clight daemon had reported a compositor error. So I uninstalled that to see if it helps.