This has also happened to me about once a week since I bought the Framework.
Ubuntu 22.04.1
5.15.0-53-generic #59-Ubuntu SMP
module_blacklist=hid_sensor_hub
Samsung SSD 980 PRO 1 TB
It happens randomly in Firefox, but yesterday I installed Kerbal Space Program (KSP), which seems to reliably freeze the laptop a few minutes after the game launches.
Nov 29 20:52:35 kevs-framework kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[3148]:cd6e timed out (hint:intel_atomic_commit_ready [i915])
Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:84dffffb, in KSP.x86_64 [5367]
Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] Renderer[5929] context reset due to GPU hang
Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] KSP.x86_64[5367] context reset due to GPU hang
The random hang is usually recoverable by waiting 30-60 seconds, but the one that occurs while playing KSP (12:1:84dffffb) is not and requires a forced shutdown and reboot.
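For anyone comparing ecodes across machines, here is a minimal Python sketch that pulls the ecode, process name and PID out of i915 "GPU HANG" lines like the ones above. The regex is an assumption based only on the line format seen in this thread, and the names HANG_RE and find_gpu_hangs are made up for illustration.

```python
import re

# Pattern assumed from the log lines in this thread, e.g.
# "[drm] GPU HANG: ecode 12:1:84dffffb, in KSP.x86_64 [5367]"
# The ", in <process> [<pid>]" part is optional because some hang
# lines (ecode 12:0:00000000) are logged without it.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\d+:\d+:[0-9a-fA-F]{8})"
    r"(?:, in (?P<proc>\S+) \[(?P<pid>\d+)\])?"
)


def find_gpu_hangs(lines):
    """Return a list of (ecode, process, pid) tuples, one per GPU HANG line."""
    hangs = []
    for line in lines:
        match = HANG_RE.search(line)
        if match:
            pid = int(match.group("pid")) if match.group("pid") else None
            hangs.append((match.group("ecode"), match.group("proc"), pid))
    return hangs


# Two lines copied from the journal output above.
sample = [
    "Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:84dffffb, in KSP.x86_64 [5367]",
    "Nov 29 20:52:39 kevs-framework kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0",
]

for ecode, proc, pid in find_gpu_hangs(sample):
    print(ecode, proc, pid)
```

To run it against real logs, one option is to read lines from the kernel journal (for example the output of journalctl -k) and pass them to find_gpu_hangs.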
Looks like you need to update your kernel to >=6.0.9. ecode 12:0:00000000 should be resolved once that's done. Not sure if it'll fix KSP, but worth a shot!
Just a reminder: we're running the latest 12th Gen Intel. We're not going to find the required support or bug fixes in old kernels. This was a main driver for why I moved to Fedora many years ago: much newer kernels for the latest hardware.
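A quick way to check whether a given kernel meets the 6.0.9 minimum suggested above is to compare the numeric part of the release string. This is only a sketch: kernel_version_tuple and meets_minimum are hypothetical helper names, and the parsing assumes the release string starts with dotted numbers, as in "5.15.0-53-generic" or "6.0.10-300.fc37.x86_64".

```python
import platform
import re


def kernel_version_tuple(release):
    """Turn a release string such as '5.15.0-53-generic' into (5, 15, 0)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # Missing minor/patch components are treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(release, minimum=(6, 0, 9)):
    """True if the release is at least `minimum` (default 6.0.9, per the post above)."""
    return kernel_version_tuple(release) >= minimum


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    print(running, "OK" if meets_minimum(running) else "older than 6.0.9")
```

Distribution kernels sometimes carry backported fixes, so an older version number does not prove the fix is missing; treat the result as a hint only.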
Thanks for the help @vhx! I've installed kernel 6.0.9 on my Framework (yes, 12th gen) and it seems to have fixed the ecode 12:0:00000000, although I'll know for sure in a few weeks. As for KSP, I played it yesterday evening for an hour or two with no freezes!
Whilst I haven't been having GPU issues under F37 due to my relatively simple usage, seeing the ongoing posts made me think I should mention the debugging resources I had on my list of things to try in case the problem persisted.
Increase the level of logging with additional kernel command line parameters:
@Nicholas_La_Roux - that appears to be normal and part of the bootup (dmesg is showing 1.26 and 3.12 seconds since the last reboot). I get those on my laptop too without any stability issues.
Based on your output, it looks like your freezing isn't generating an ecode, so something else is going on imo.
When you next get the freezing, try looking at dmesg (maybe dmesg | tail -n 30 for the most recent 30 lines) to see if that returns anything useful.
@Nicholas_La_Roux nothing that looks abnormal to me. It seems successful with no problems; no "GPU BUG" or ecode number.
So... a few things I'd look at next to hopefully get more information:
memtest ram.
disabling one of the NVMe power management features. I can't remember any specifics, but possibly the pcie_aspm=off boot option; there could be more that I'm unaware of.
firmware updates. Specifically an update for your nvme drive.
pcie_aspm=off is for tracking down your issue. It shouldn't be used as a long-term solution since it disables the entire PCIe Active State Power Management system, most likely reducing battery life and possibly increasing heat slightly.
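Since this option is meant to be temporary, it is easy to forget it is still set. Below is a small Python sketch for checking a kernel command line for it. parse_cmdline and aspm_disabled are hypothetical names, the sample string is invented for illustration (only module_blacklist=hid_sensor_hub and pcie_aspm=off come from this thread), and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as 'quiet' map to None. Quoted values that contain
    spaces are NOT handled; this is a simplification.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def aspm_disabled(cmdline):
    """True if pcie_aspm=off is present on the given command line."""
    return parse_cmdline(cmdline).get("pcie_aspm") == "off"


# Made-up example command line, for illustration only.
sample = "ro quiet module_blacklist=hid_sensor_hub pcie_aspm=off"
print(aspm_disabled(sample))
```

On a running system the real command line can be read from /proc/cmdline and passed to aspm_disabled.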
I'd say faulty RAM would bring more issues than just resume problems, so that's less likely, but it's a typical go-to with system stability issues. At the moment, power management feels more likely based on your resume-specific issues.
Might be worth asking in case others have similar hardware or know of a fix: what NVMe (model & size) and RAM (model, size & speed) are you using?
@Matt_Hartley I think you're correct that the hard freezing is a Chrome/Chromium issue. I currently seem to be able to reproduce the hard freeze by reopening tabs from my last Chrome session. Writing this on Firefox.
Not sure how to get vainfo on Fedora, but my setup is:
12th Gen Intel
Fedora 37
Wayland
kernel 6.0.10-300.fc37.x86_64
Logs from my last boot where it hung are:
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:849f3c04, in chrome [4627]
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] chrome[4627] context reset due to GPU hang
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] HuC authenticated
Dec 04 17:59:17 fedora kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Here are some logs from where it hung but recovered. Not sure if this gives any extra info, but I thought the crash annotation line was interesting as it mentions VAAPI, which I got hits for when googling vainfo.
Dec 04 18:49:01 fedora kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[1941]:19aac timed out (hint:intel_atomic_commit_ready [>
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] HuC authenticated
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Dec 04 18:49:04 fedora kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
Dec 04 18:49:04 fedora firefox.desktop[3355]: Crash Annotation GraphicsCriticalError: |[0][GFX1-]: glxtest: VA-API test failed: failed to initialise VAAPI connection. (t=0.236754) |[1][GFX1-]: GFX: RenderThread detected a device reset in PostUpdate (t=2779.62) [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
Your other info is interesting: kernel 6.0.10, but still seeing the ecode 12:0:00000000 error. I've not seen it since I upgraded to the 6.0.9 kernel.
FWIW I've tried VSCodium (afaik Chromium-based) for around 12 hours (some hours in use, mostly idle in the background) and had no issues with any ecode, which seems to be chrom{e|ium}/app specific based on other responses in this thread.
As you say, there are hints that missing VAAPI libs could be related. Worth a shot getting libva set up. When you run vainfo, check for the VLD & EncSlice output lines, which indicate VAAPI is working.
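If you'd rather not eyeball the vainfo output, here is a rough Python sketch of that check. It assumes the entrypoints show up as VAEntrypointVLD and VAEntrypointEncSlice (or a variant starting with EncSlice) in the output text; the sample lines are an illustrative excerpt rather than output from anyone's machine in this thread, and the function names are made up.

```python
import re


def vaapi_entrypoints(vainfo_output):
    """Collect the entrypoint names (text after 'VAEntrypoint') from vainfo output."""
    return set(re.findall(r"VAEntrypoint(\w+)", vainfo_output))


def vaapi_looks_working(vainfo_output):
    """True if both a decode (VLD) and an encode (EncSlice*) entrypoint are listed."""
    entrypoints = vaapi_entrypoints(vainfo_output)
    has_decode = "VLD" in entrypoints
    has_encode = any(name.startswith("EncSlice") for name in entrypoints)
    return has_decode and has_encode


# Illustrative excerpt only; real output lists many more profiles.
sample = """\
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
"""
print(vaapi_looks_working(sample))
```

To use it for real, capture the text that vainfo prints (for instance with subprocess.run(["vainfo"], capture_output=True, text=True)) and pass it in. A failed VAAPI initialisation lists no entrypoints, so the check returns False.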
I received the Framework laptop last week. I'm really happy with it, but I've got similar issues to those discussed in this topic. I run Arch, GNOME, kernel 6.0.11. I also run into an issue that I don't see discussed here: when I turn the night light on and off, the top of the display flickers white.
I've made a post about it here, mentioning this post, among others. So far I've not received a reply.
Do you also see this behaviour regarding nightlight?