[TRACKING] Freezes on Newest Linux Kernels

I’m having freezing issues on openSUSE Tumbleweed with kwin. I can’t even kill -9 it. The problem appeared after 6.1.8: I have no problems with 6.1.8, but 6.1.10 and 6.1.12 are no good for me.

As a former openSUSE Leap fan, I find Tumbleweed exciting and frustrating some days. If you have Snapper set up, perhaps a rollback is in order to see if you get some relief until this is resolved?

Thanks @Matt_Hartley,

I’ve set zypper up to keep several old kernels, including 6.1.8, so it’s working for me. I’m just surprised that I can find almost nobody else with issues like this. It makes me think it’s a combination of things that only a few of us are experiencing, maybe due to hardware and/or BIOS/firmware uniqueness.

My laptop, an ASUS ROG Strix with an Optimus 3070 Ti that I generally keep disabled in Linux for better battery life, has a 360Hz LED panel that was restricted to 60Hz up to and including 6.1.8.

Whatever changes came about in 6.1.9 gave me access to the higher refresh rate of 360Hz, which is nice, but brought the freezing with it, which isn’t so nice.

Glad this workaround helped. May be worth waiting to see if this is ironed out in the near future in updates.

Some lines from the last freeze I had under the new mainline kernel:

Feb 26 13:12:26 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:12:26 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:12:44 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:12:44 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:13:44 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:13:44 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:14:44 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:14:44 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:15:11 Sapientia smbnetfs[441635]: smbXcli_negprot_smb1_done: No compatible protocol selected by server.
Feb 26 13:15:11 Sapientia smbnetfs[441635]: smbXcli_negprot_smb1_done: No compatible protocol selected by server.
Feb 26 13:15:46 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:15:46 Sapientia rtkit-daemon[1387]: Supervising 7 threads of 5 processes of 1 users.
Feb 26 13:16:14 Sapientia /usr/lib/gdm-wayland-session[1744]: 03:09:20.384 [ERROR] [wlr] [xwayland/xwm.c:1522] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 15466, value 8388645
Feb 26 13:16:14 Sapientia /usr/lib/gdm-wayland-session[1744]: 03:09:20.384 [ERROR] [wlr] [xwayland/xwm.c:1522] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 15467, value 8388645
Feb 26 13:16:19 Sapientia /usr/lib/gdm-wayland-session[1744]: 03:09:25.491 [ERROR] [wlr] [backend/drm/atomic.c:72] connector DP-1: Atomic commit failed: Device or resource busy
Feb 26 13:16:21 Sapientia /usr/lib/gdm-wayland-session[1744]: 03:09:26.958 [ERROR] [wlr] [backend/drm/atomic.c:72] connector DP-1: Atomic commit failed: Device or resource busy

The ones that look unique to this particular moment are the xwayland errors. The other log events seem pretty regular.
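
For anyone who wants to check which lines in an excerpt like this are actually unusual, here is a minimal Python sketch. It assumes the classic syslog-style layout seen above (date, host, unit[pid], message), and the function names `message_counts` and `rarest` are made up for illustration. Digits are replaced with `N` so that messages differing only in sequence numbers or internal timestamps are grouped together.

```python
import re
from collections import Counter

# Matches lines like: "Feb 26 13:12:26 host unit[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<unit>[^\s\[]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)

def message_counts(lines):
    """Count how often each (unit, normalized message) pair appears."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not follow the assumed layout
        # Replace every run of digits so near-identical messages group together.
        normalized = re.sub(r"\d+", "N", m["msg"])
        counts[(m["unit"], normalized)] += 1
    return counts

def rarest(lines, n=5):
    """Return the n least frequent (unit, message) pairs with their counts."""
    return sorted(message_counts(lines).items(), key=lambda kv: kv[1])[:n]
```

Feeding it the lines of a saved journal excerpt and printing `rarest(lines)` lists the least frequent messages first, which is only a rough hint about what is unique to the moment of the freeze.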

More information: I can reliably trigger the freeze by starting Skyrim through Steam. The whole system freezes and requires a hard reboot. There are no errors in the logs that indicate why the freeze occurred.

Hey folks, I’m willing to submit logs from my Framework from before the freezes. I can confirm that this happens to me as well on a 12th gen i5-1240P. It happened on both 6.1 and 6.2 kernels. I tried Fedora first and am now on Tumbleweed, and yes, Steam is a big risk factor for the PC freezing. I have noticed, though, that when I press and hold the power button just for a little bit (to suspend), it does suspend, and when I bring it back from suspend it’s good again. I’m on a Wayland session, by the way, and yes, I believe it happens when there is some type of load on the GPU side of things. When the laptop freezes, the fans get really loud and the temps climb to ~80 °C or more.

I am probably having the same issue on Manjaro Linux running Plasma on X11. I have updated to the newest experimental kernel, 6.2.0rc8-1. I thought the issue was gone, but it’s showing up again now.

Sometimes everything locks up, but most of the time the mouse keeps working while nothing else reacts. Afterwards Plasma is partially non-functional. This is the only message that is generated when the hang happens:

[ 8167.119031] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000

So far it always seems to have been triggered by SSD access or WLAN communication. I had no issues at all with some light gaming.
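
To line hang times up against whatever else the machine was doing, a small Python sketch like the one below can pull the GPU HANG entries out of saved `dmesg` output. It assumes the exact line layout shown above. The names `find_gpu_hangs` and `gaps_between` are hypothetical helpers, not part of any tool. The bracketed timestamps are seconds since boot, so the gaps only mean something within a single boot.

```python
import re

# Matches lines such as:
# [ 8167.119031] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000
HANG_RE = re.compile(
    r"^\[\s*(?P<secs>\d+\.\d+)\]\s+i915\s+(?P<dev>\S+):\s+\[drm\]\s+"
    r"GPU HANG: ecode (?P<ecode>\S+)"
)

def find_gpu_hangs(dmesg_text):
    """Return (seconds_since_boot, pci_device, ecode) for each GPU HANG line."""
    hangs = []
    for line in dmesg_text.splitlines():
        m = HANG_RE.match(line.strip())
        if m:
            hangs.append((float(m["secs"]), m["dev"], m["ecode"]))
    return hangs

def gaps_between(hangs):
    """Seconds elapsed between consecutive hangs (same boot assumed)."""
    times = [t for t, _, _ in hangs]
    return [round(b - a, 3) for a, b in zip(times, times[1:])]
```

Saving the output of `dmesg` to a file and passing its contents to `find_gpu_hangs` gives a list of hang times that can be compared with the timestamps of disk or network activity in the same log.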

It just happened again, really badly, just now. There is some strange system load, and the fan is running at fairly high tilt. The case is hot. It seems that plasmashell instances are half-pegging a core each.

[ 9037.701655] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000
[ 9037.701835] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 9037.803877] i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.bin version 70.5.1
[ 9037.803880] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc.bin version 7.9.3
[ 9037.820553] i915 0000:00:02.0: [drm] HuC authenticated
[ 9037.821571] i915 0000:00:02.0: [drm] GuC submission enabled
[ 9037.821572] i915 0000:00:02.0: [drm] GuC SLPC enabled

It got into a really bad state, and I can hardly type or read the browser. Going to post this, then reboot.

So I went and disabled the entire power management feature of the i915 driver to see if that would help. I loaded up a game in Steam and it froze within a minute, with a log full of [ERROR] [wlr] [backend/drm/atomic.c:72] connector DP-1: Atomic commit failed: Device or resource busy.

So yeah, I am running out of ideas for what is causing these issues.

So, I decided to run Memtest. After 18 minutes, the test froze, reporting 78 errors. I am currently running another Memtest with only 1 DIMM in channel 1.

I am wondering if there might be an issue with using 64 GB of RAM on the beta 12th gen BIOS. I will test each individual DIMM and report back whether errors occur on either stick.

So, just to update mid-test: with only one DIMM slot populated, the CPU temps have dropped by 20 degrees, which on the face of it looks pretty strange.

So, testing the second module in the first slot, the temps are again down by quite a lot, to 75-80C from 100C with both slots populated. I am going to test whether the freezes still occur while gaming. If they don’t, I will move the RAM module to the second slot to see whether the issue is the second slot itself or only appears when two RAM modules are installed.

So, there is very obviously something going on with the memory. When I ran the programs that would cause a freeze with only one RAM slot populated, I had no freeze. When I put both modules back in, after testing each DIMM in each individual slot with Memtest86, I had a freeze quite rapidly.

If both sticks passed the memtest, chances are you have a bad slot on the motherboard. I would definitely reach out to support.