Still got a hard freeze then recovery when trying enable_psr=0.
@Aggraxis how do I undo your solution if I want? Delete the i915.conf file and reboot? Or do I need to run more commands e.g. something with dracut?
This is what I get when running vainfo:
Trying display: wayland
libva info: VA-API version 1.16.0
libva info: Trying to open /usr/lib64/dri/iHD_drv_video.so
libva info: va_openDriver() returns -1
libva info: Trying to open /usr/lib64/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_15
libva error: /usr/lib64/dri/i965_drv_video.so init failed
libva info: va_openDriver() returns -1
vaInitialize failed with error code -1 (unknown libva error),exit
Seems like libva isn't configured correctly as it can't even init?
Edit: and this is after installing libva, libva-utils, libva-intel-driver, ffmpeg, which weren't installed by default and therefore I couldn't run vainfo
Edit2: after removing libva-intel-driver and installing intel-media-driver, vainfo seems to complete successfully:
Trying display: wayland
libva info: VA-API version 1.16.0
libva info: Trying to open /usr/lib64/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_16
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.16 (libva 2.16.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.5.4 ()
vainfo: Supported profile and entrypoints
VAProfileNone : VAEntrypointVideoProc
...
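If you want to check this from a script rather than by eye, here is a minimal sketch. It assumes the "va_openDriver() returns 0" line seen in the output above is a good enough sign that a driver initialised; the function name va_driver_ok is made up for illustration.

```shell
# Hypothetical helper (name is illustrative): reads vainfo output on stdin
# and prints "ok" if any VA-API driver reported a successful open,
# otherwise "failed".
va_driver_ok() {
    # libva logs "va_openDriver() returns 0" for a driver that loaded;
    # failed attempts log "returns -1" instead.
    if grep -q 'va_openDriver() returns 0'; then
        echo "ok"
    else
        echo "failed"
    fi
}

# On a live system (vainfo is shipped in libva-utils):
#   vainfo 2>&1 | va_driver_ok
```

This only tells you a driver opened, not that every codec you care about is listed under the supported profiles.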
In Firefox, in about:config, check if media.ffmpeg.vaapi.enabled is set to true. Close Firefox, reopen it, and see if that makes a difference. If that does not work, go back to about:config and set gfx.webrender.all to true.
@Kelby_Faessler
You would either comment out the line in the i915.conf or remove the file, then run sudo dracut --force again, followed by a reboot.
By the way, for anyone looking for more data, I upgraded to F37 without incident some time ago. My i915.conf changes are still in place.
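For reference, the undo steps described above could look roughly like this. It is a sketch, not the exact procedure from the original fix: the helper name is made up, and the /etc/modprobe.d/i915.conf path in the usage comment is an assumption, so point it at wherever you actually created the file.

```shell
# Hypothetical helper (name is illustrative): comment out every active
# "options i915 ..." line in the given modprobe config file.
# Lines that are already commented out are left untouched.
comment_out_i915_options() {
    conf="$1"
    tmp="$conf.tmp"
    sed -E 's/^([[:space:]]*options[[:space:]]+i915[[:space:]])/# \1/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf" &&
        rm -f "$tmp"
}

# On a live system, as root (the path below is an assumption):
#   comment_out_i915_options /etc/modprobe.d/i915.conf
#   dracut --force    # rebuild the initramfs so the change is picked up
#   reboot
```

Commenting the line out instead of deleting the file makes it easy to turn the option back on later.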
Happened again. Fedora 37, Linux fedora 6.0.11-300.fc37.x86_64 and I was in gnome-settings.
Tail of journalctl log below. What else should I look for when data collecting?
Dec 09 23:40:57 fedora kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[1973]:15542 timed out (hint:intel_atomic_commit_ready [i915])
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:84dffffb, in gnome-control-c [28845]
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] ERROR rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] ERROR rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] gnome-control-c[28845] context reset due to GPU hang
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] brave[4303] context reset due to GPU hang
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] HuC authenticated
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Dec 09 23:41:01 fedora kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
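For data collecting, one thing that makes reports easier to compare is pulling just the hang lines out of the kernel log. A small sketch, assuming the "GPU HANG: ecode ..., in <process> [<pid>]" format shown in the log above; the function name gpu_hangs is made up.

```shell
# Hypothetical helper (name is illustrative): reads kernel log lines on
# stdin and prints one line per i915 GPU hang as "<ecode> <process> <pid>".
# Lines that are not hang reports produce no output.
gpu_hangs() {
    sed -n -E 's/.*GPU HANG: ecode ([0-9a-f:]+), in (.*) \[([0-9]+)\].*/\1 \2 \3/p'
}

# On a live system, kernel messages from the previous boot
# (useful after a freeze that forced a reboot):
#   journalctl -k -b -1 --no-pager | gpu_hangs
```

Posting the ecode together with the kernel version and the process name makes it easier to tell whether people in the thread are hitting the same hang.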
@Elmo it looks like there was a gpu hang on the brave browser process. That doesn't mean brave was the culprit, but that's where it happened.
I haven't been gaming on my laptop in a while, and today after an update it looks like I'm having GPU hangs again - even with the enable_psr=0 parameter. What's odd is that when using Wayland I can get the display to hang within about 2 minutes just by running around in circles. With X11, although it was running noticeably slower, the system kept running for much longer without hanging up. I'm still digging, but it looks like maybe either Intel or the Wayland folks have something freaky going on.
edit - I haven't been seeing these hangs under normal use, which for me means some browsing, a horizon client session, and some terminal work.
Man, this makes my insides hurt. So we can see that the kernel module gets loaded, gets its firmware, does some happy normal stuff. I fire off my game, run about 6 circles around the area I'm in, and then VRRRP frozen. This last time it gave me one little burp of frames at the end, and I assume that's what happened at that spot where the module seems to have been reloaded.
Anyhow, this is all on F37 w/ kernel 6.0.12-300.fc37. Going to be fun teasing this out.
Small update on this. Despite running Kernel 6.0.9, I encountered eleven freezes (!) with ecode 12:0:00000000 and one with ecode 12:1:0036abdf since Dec 10th. I'll update to 6.1 in the coming days now that it's released. I might also try using xorg instead of Wayland like @Aggraxis suggested.
I also thought it might be related to Ubuntu's Power Mode (esp. Power Saving) but some freezes occurred in the default Balanced mode.
@KevSlashNull In playing with things yesterday I found that I was still able to alt-f3 over to another shell like the previous flavor of freezes. I was also able to get the system responsive again by finding the wayland session process (ps -eaf | grep wayland), killing it, and then in my case restarting the sddm process (systemctl restart sddm). Gnome users will need to restart gdm (systemctl restart gdm) instead.
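A rough sketch of that recovery sequence, for anyone who wants it written down before the next freeze. The helper name is made up, and which PID is the actual session process is something you still have to judge yourself from the ps output.

```shell
# Hypothetical helper (name is illustrative): reads `ps -eaf` output on
# stdin and prints the PID (second column) of every process whose command
# line mentions "wayland". The header row and this awk filter itself
# are skipped.
wayland_pids() {
    awk 'NR > 1 && /wayland/ && !/awk/ { print $2 }'
}

# From a text console after the display freezes:
#   ps -eaf | wayland_pids
#   sudo kill <pid>                # the session process you identified
#   sudo systemctl restart sddm    # KDE; on GNOME restart gdm instead
```

Restarting the display manager ends the graphical session, so anything unsaved in it is lost.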
What's driving me nuts with this round of freezes is the inconsistency. I can run the aquarium from webglsamples.org with 25,000 fish at 60 fps for more than an hour with no freeze. Firing up FFXIV via xivlauncher and playing the game for not even a minute results in a freeze. AND JUST THE DISPLAY. I'm certain the game is otherwise happily running. The music is definitely playing.
This happens on battery, on AC power, in a house with a mouse, etc.
I took all of the kernel module tweaks out, but nothing changed. I even disabled all of the power saving features in the driver, still no result. I haven't gotten anything else to barf out more info that anyone's going to find useful. Still digging. Somehow between all of us out there we'll figure this out.
Ok. I need to do a lot more testing, but for my specific freeze I think I found at least a partial answer:
https://wiki.archlinux.org/title/intel_graphics#Enable_GuC_/_HuC_firmware_loading
Based on that page, what is new in Gen12 is using the GuC for "scheduling, context submission, and power management."
Hmmmmm… Ok, so just for giggles I added options i915 enable_guc=0, which fully disables the firmware loading. Yes, this probably messes with video playback acceleration, but here's the kicker:
FFXIV ran for 3 hours straight, and not just me running circles around an Aetheryte. I went all over the place, mined some weird stuff, and put the system through its paces (fan whirring on max the whole time). It ran great.
I commented the line out, re-ran dracut, and on next boot the game crashed in 47 seconds. So yeah, I need to run more trials, and also check to see if options 1 or 2 make a difference. (I suspect option 2 will work fine, and that it's 1 or 3 where it loads the GuC Submission that it will go bonkers, but I need to fiddle with it.)
Anyways, if you happen to try that out, let me know how it goes. More info is better.
Ok so far 1 crashed pretty quickly, and 3 (the default) also crashes. 0 ran well, and 2 is running fine right now. Going to abuse it a bit.
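If you want to repeat the 0/1/2/3 comparison yourself, something like this keeps the setting in one place. It is only a sketch: the helper name and the i915-guc.conf file name in the usage comment are made up, and the value meanings are taken from the Arch wiki page linked above (0 = nothing loaded, 1 = GuC submission, 2 = HuC loading, 3 = both).

```shell
# Hypothetical helper (name is illustrative): write an i915 enable_guc
# setting to a modprobe config file. The file is overwritten, so use a
# dedicated file rather than one that holds other options.
set_enable_guc() {
    conf="$1"
    value="$2"
    case "$value" in
        0|1|2|3) ;;
        *) echo "enable_guc must be 0, 1, 2 or 3" >&2; return 1 ;;
    esac
    # No spaces around "=": modprobe expects option=value.
    printf 'options i915 enable_guc=%s\n' "$value" > "$conf"
}

# On a live system, as root (the file name below is an assumption):
#   set_enable_guc /etc/modprobe.d/i915-guc.conf 2
#   dracut --force
#   reboot
# After the reboot, confirm what the driver is actually using:
#   cat /sys/module/i915/parameters/enable_guc
```

Testing one value per boot and noting how long the game survives gives a cleaner comparison than changing several parameters at once.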
dang I tried enable_guc=0 and enable_guc=2 and I still get freezes when I reopen chrome tabs.
Weird. I think the only other changes I made to my system were steps 1 and 2 out of this article:
And really, I had RPMFusion set up already at that point. Step 2, for those who don't want to read that other thread, is this set of package installs:
sudo dnf groupinstall multimedia
sudo dnf install intel-media-driver libva libva-utils gstreamer1-vaapi ffmpeg intel-gpu-tools mesa-dri-drivers mpv
Spoke with my contact with Fedora (works for Red Hat).
He indicated that the freezing is a known issue that has proven difficult to replicate with Fedora and other distros as well. So we're working on it, but at this point, the best thing we can do is:
Sorry, not a Framework guy, but similar situation: f37 (6.0.13-300.fc37.x86_64) and 12th Gen Intel(R) Core™ i7-1260P.
I am currently experiencing some relief with boot param:
intel_idle.max_cstate=1 and intel_idle.max_cstate=2
Last night I got like 2 uninterrupted hours of use w/1. Laptop fan ran the whole time.
Right now, w/2, stable for like 20 mins. Laptop fan cycles intermittently.
Prior to adding this (or nomodeset) I could hardly boot.
max_cstate can go to 3 or 4 maybe more. Has to do with power savings during idleness.
This sux…but HTH…
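For anyone wanting to try the same boot parameter, a sketch of how it could be checked and set on Fedora. The helper name is made up; the grubby lines in the comments are the usual Fedora way to edit kernel arguments, but treat them as a starting point and double-check against your own boot setup.

```shell
# Hypothetical helper (name is illustrative): reads a kernel command line
# on stdin and prints the intel_idle.max_cstate value, or nothing if the
# parameter is not set.
max_cstate_from_cmdline() {
    tr ' ' '\n' | sed -n 's/^intel_idle\.max_cstate=//p'
}

# Check what the running kernel was booted with:
#   max_cstate_from_cmdline < /proc/cmdline
# Add the parameter to all installed kernels:
#   sudo grubby --update-kernel=ALL --args="intel_idle.max_cstate=2"
# Remove it again later:
#   sudo grubby --update-kernel=ALL --remove-args="intel_idle.max_cstate=2"
```

Lower values keep the CPU out of deeper idle states, which fits the fan running more often as described above.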
Changing cstate is a good idea, but based on what I am told by a representative of the Fedora project, the issue has affected other distros and has been difficult to replicate. I'm on a 12th Gen Framework, on Fedora 37, latest kernel - zero freezing. But I also have nothing attached.
Fedora dumped a bunch of f37 updates in last 12-24 hours. I got about 34 updates â lots of firmware packages. Noteworthy: kernel-6.0.14-300.fc37 and intel-gpu-firmware-20221214-145.fc37 and intel-gpu-firmware-20221109-144.fc37.
Made no real changes for me. After a fresh boot, no switches, it usually hangs within 5 min after gdm login, with lots of trash on my screen. This happened after these updates, too.
Rebooting with max_cstate=1 and max_cstate=2, AFAIK, is stable. max_cstate=3 had a lock up (but no graphics artifacts on screen) after about an hour or so. This is primarily browsing with chrome (google variant); some www/youtube videos, really not much else - machine too unreliable to work on at this point.
I did, for a work day, have this rig using HDMI monitor and then a thunderbolt hub with a 2nd HDMI driving another monitor, as well as ethernet rolling. This surprisingly worked out really well, for like 8 hours - primarily local browsing and then a VPN to my corporate with an RDC session.
I don't know if some poison came in some dnf update - but it hasn't been stable since 12/19 and was only moderately stable since inception of this laptop on 12/15.
(will review thread to see how you/others got to some stability - maybe I missed something; the cstate thing was picked up from some other thread out on the WWW)
HTH someone
If it's freezing while attached to something (hub, monitor, etc), I'd start by testing stability without those things. Then if it's stable, we know that updates may have hosed something in terms of the extras attached.
Check dmesg when possible or, even better if it completely freezes, check journalctl afterwards.
But if a distro is unstable, begin stripping things away to identify the trigger point (even if caused by an update). It's a good starting point in conjunction with checking journalctl after freezing.
As it sits now, there should be no reason to add cstate parameters for stability. When a specific kernel breaks something, sure, but otherwise if it's needed, it's time to revisit a previously working kernel.
I will run updates tonight to catch my Framework up to the latest. See if I can replicate it. I will not be connecting to a dock or display because I want to emulate Framework issues first, vs attached device compatibility issues. That's after I establish the laptop is golden, first.
Agreed, actually except for the 1 day of "success", with hub, monitors, etc, I usually run the laptop with nothing plugged in. Sigh, almost seemed more stable with the hub and stuff - but not enough for a daily driver.
FWIW: last dnf update (230 PM est 12/23/2022) contained, among other things:
Upgrade xorg-x11-drv-intel-2.99.917-54.20210115.fc37.x86_64 @updates
Upgraded xorg-x11-drv-intel-2.99.917-53.20200205.fc37.x86_64 @@System
I applied that and after a naked boot (just laptop, no peripherals), almost immediate hang/crudded-up screen. I added back in intel_idle.max_cstate=2; still seems to be ok for me at the moment, no enable_psr=0, no HuC/GuC tuning.
HTH