Spontaneous and ungraceful restarts upon closing the lid

Which Linux distro are you using?

Arch Linux

Which release version?
(if rolling release without a release version, skip this question)

(If rolling release, last date updated?)

February

Which kernel are you using?

6.13.3-zen1-1-zen

Which BIOS version are you using?

0.0.3.5

Which Framework Laptop 13 model are you using? (AMD Ryzen™ 7040 Series, Intel® Core™ Ultra Series 1, 13th Gen Intel® Core™ , 12th Gen Intel® Core™, 11th Gen Intel® Core™)

AMD Ryzen™ 7040 Series

Hey folks, this weekend I had a weird situation where I booted up my machine, used it for a bit, then closed the lid. When I came back a short while later, my laptop was on the decrypt screen, indicating it had rebooted sometime between me closing the lid and opening it again.

I checked the logs, and found some nasty-looking messages from the kernel. The final message after that was an s2idle message.

I was gonna write it off as a coincidence, but it happened again last night, after a fresh install of Arch Linux, no less. I re-installed Arch, this time with the Zen kernel (previously I was using the LTS kernel), did some work, then came back an hour or so later to my laptop being in the emergency shell after a failed boot. (I think the failed boot itself is unrelated, possibly due to some malarkey I was doing with AppArmor.)

I checked the journalctl logs and found something that looked very similar to the first time it happened over the weekend:

Feb 19 20:34:31.356267 willardpad kernel: amdgpu 0000:c1:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
Feb 19 20:34:41.726233 willardpad kernel: wlan0: deauthenticating from 74:da:88:bc:8e:5e by local choice (Reason: 3=DEAUTH_LEAVING)
Feb 19 20:34:47.343336 willardpad kernel: ------------[ cut here ]------------
Feb 19 20:34:47.343425 willardpad kernel: WARNING: CPU: 2 PID: 17534 at drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn30/dcn30_dpp.c:534 dpp3_deferred_update+0x101/0x330 [amdgpu]
Feb 19 20:34:47.343459 willardpad kernel: Modules linked in: rfcomm algif_hash algif_skcipher af_alg bnep hid_logitech_hidpp snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi mc hid_logitech_dj xt_MASQUERADE xt_tcpudp xt_mark tun nf_tables ip6table_nat ip6table_filter ip6_tables iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter cmac ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device vfat fat snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match intel_rapl_msr amd_atl snd_amd_sdw_acpi intel_rapl_common soundwire_amd snd_hda_codec_realtek iwlmvm soundwire_generic_allocation soundwire_bus snd_hda_codec_generic snd_hda_scodec_component snd_soc_sdca snd_hda_codec_hdmi snd_soc_core mac80211 snd_hda_intel snd_intel_dspcfg snd_compress snd_intel_sdw_acpi ac97_bus hid_sensor_als libarc4 snd_pcm_dmaengine snd_hda_codec kvm_amd hid_sensor_trigger ptp snd_rpl_pci_acp6x cros_usbpd_charger
Feb 19 20:34:47.343519 willardpad kernel:  pps_core snd_acp_pci leds_cros_ec industrialio_triggered_buffer iwlwifi snd_hda_core cros_usbpd_notify gpio_cros_ec cros_ec_chardev led_class_multicolor cros_charge_control cros_ec_hwmon cros_ec_sysfs cros_ec_debugfs snd_acp_legacy_common cros_usbpd_logger cros_kbd_led_backlight kfifo_buf mousedev snd_pci_acp6x hid_sensor_iio_common snd_hwdep cros_ec_dev kvm btusb industrialio spd5118 cfg80211 snd_pcm btrtl snd_pci_acp5x sp5100_tco snd_rn_pci_acp3x snd_timer btintel snd_acp_config ucsi_acpi btbcm i2c_piix4 snd typec_ucsi snd_soc_acpi btmtk cros_ec_lpcs hid_multitouch hid_sensor_hub joydev cros_ec bluetooth wmi_bmof rapl typec pcspkr amd_pmf thunderbolt soundcore snd_pci_acp3x rfkill k10temp i2c_smbus roles amdtee i2c_hid_acpi amd_sfh i2c_hid platform_profile amd_pmc mac_hid pkcs8_key_parser i2c_dev crypto_user loop nfnetlink ip_tables x_tables dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod hid_generic usbhid amdgpu crc16 amdxcp crct10dif_pclmul i2c_algo_bit crc32_pclmul drm_ttm_helper
Feb 19 20:34:47.343547 willardpad kernel:  polyval_clmulni polyval_generic ttm ghash_clmulni_intel serio_raw drm_exec sha512_ssse3 atkbd gpu_sched sha256_ssse3 libps2 drm_suballoc_helper sha1_ssse3 aesni_intel drm_panel_backlight_quirks vivaldi_fmap nvme drm_buddy gf128mul crypto_simd drm_display_helper nvme_core cryptd i8042 ccp video cec nvme_auth serio wmi btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
Feb 19 20:34:47.343565 willardpad kernel: CPU: 2 UID: 0 PID: 17534 Comm: kworker/u64:16 Tainted: G        W          6.13.3-zen1-1-zen #1 4eb5d478c0b89d357d430d5db88dfd1ce4224b4f
Feb 19 20:34:47.343577 willardpad kernel: Tainted: [W]=WARN
Feb 19 20:34:47.343589 willardpad kernel: Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
Feb 19 20:34:47.343603 willardpad kernel: Workqueue: events_unbound commit_work
Feb 19 20:34:47.343623 willardpad kernel: RIP: 0010:dpp3_deferred_update+0x101/0x330 [amdgpu]
Feb 19 20:34:47.343640 willardpad kernel: Code: 83 78 e1 00 00 0f b6 90 a8 02 00 00 48 8b 83 70 e1 00 00 8b b0 78 04 00 00 e8 5b e9 12 00 8b 74 24 04 85 f6 0f 84 5d 01 00 00 <0f> 0b 0f b6 83 48 96 00 00 83 e0 f7 88 83 48 96 00 00 a8 01 0f 84
Feb 19 20:34:47.343656 willardpad kernel: RSP: 0018:ffffb817205f7a40 EFLAGS: 00010202
Feb 19 20:34:47.343671 willardpad kernel: RAX: 0000000000000066 RBX: ffff98f690900000 RCX: 0000000000000004
Feb 19 20:34:47.343686 willardpad kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff98f692680000
Feb 19 20:34:47.343702 willardpad kernel: RBP: ffff98f8e4e40000 R08: ffffb817205f7a44 R09: 0000000000000000
Feb 19 20:34:47.343713 willardpad kernel: R10: ffffb817205f79e8 R11: ffffb817205f79ec R12: 0000000000000000
Feb 19 20:34:47.343726 willardpad kernel: R13: ffff98f8e4e402a8 R14: ffff98f8e4e46068 R15: ffff98f82d39de00
Feb 19 20:34:47.343743 willardpad kernel: FS:  0000000000000000(0000) GS:ffff98fdde300000(0000) knlGS:0000000000000000
Feb 19 20:34:47.343757 willardpad kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 19 20:34:47.343785 willardpad kernel: CR2: 00007d1ddcba8c40 CR3: 00000002435f4000 CR4: 0000000000f50ef0
Feb 19 20:34:47.343819 willardpad kernel: PKRU: 55555554
Feb 19 20:34:47.343845 willardpad kernel: Call Trace:
Feb 19 20:34:47.343859 willardpad kernel:  <TASK>
Feb 19 20:34:47.343871 willardpad kernel:  ? dpp3_deferred_update+0x101/0x330 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.343886 willardpad kernel:  ? __warn.cold+0x93/0xed
Feb 19 20:34:47.343896 willardpad kernel:  ? dpp3_deferred_update+0x101/0x330 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.343907 willardpad kernel:  ? report_bug+0xe7/0x210
Feb 19 20:34:47.343919 willardpad kernel:  ? handle_bug+0x58/0x90
Feb 19 20:34:47.343934 willardpad kernel:  ? exc_invalid_op+0x19/0xc0
Feb 19 20:34:47.343946 willardpad kernel:  ? asm_exc_invalid_op+0x1a/0x20
Feb 19 20:34:47.343959 willardpad kernel:  ? dpp3_deferred_update+0x101/0x330 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.343972 willardpad kernel:  dc_post_update_surfaces_to_stream+0x24f/0x470 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.343989 willardpad kernel:  amdgpu_dm_commit_planes+0x13b2/0x2040 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.344003 willardpad kernel:  ? need_active_balance+0x89/0x180
Feb 19 20:34:47.344019 willardpad kernel:  amdgpu_dm_atomic_commit_tail+0x11b2/0x2f70 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.344037 willardpad kernel:  ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10 [amdgpu b062dca16a89511cb344336e89d7bea954660181]
Feb 19 20:34:47.344048 willardpad kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Feb 19 20:34:47.344064 willardpad kernel:  ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x100/0x3b0
Feb 19 20:34:47.344075 willardpad kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Feb 19 20:34:47.344088 willardpad kernel:  ? dma_fence_default_wait+0x8b/0x240
Feb 19 20:34:47.344107 willardpad kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Feb 19 20:34:47.344118 willardpad kernel:  ? wait_for_completion_timeout+0x130/0x180
Feb 19 20:34:47.344135 willardpad kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Feb 19 20:34:47.344148 willardpad kernel:  commit_tail+0x91/0x130
Feb 19 20:34:47.344156 willardpad kernel:  process_one_work+0x18f/0x350
Feb 19 20:34:47.344169 willardpad kernel:  worker_thread+0x24c/0x380
Feb 19 20:34:47.344186 willardpad kernel:  ? __pfx_worker_thread+0x10/0x10
Feb 19 20:34:47.344203 willardpad kernel:  kthread+0xcf/0x100
Feb 19 20:34:47.344235 willardpad kernel:  ? __pfx_kthread+0x10/0x10
Feb 19 20:34:47.344251 willardpad kernel:  ret_from_fork+0x31/0x50
Feb 19 20:34:47.344266 willardpad kernel:  ? __pfx_kthread+0x10/0x10
Feb 19 20:34:47.344277 willardpad kernel:  ret_from_fork_asm+0x1a/0x30
Feb 19 20:34:47.344288 willardpad kernel:  </TASK>
Feb 19 20:34:47.344302 willardpad kernel: ---[ end trace 0000000000000000 ]---
...
Feb 19 21:11:50.347460 willardpad kernel: amdgpu 0000:c1:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:142
...
Feb 19 23:18:22.255309 willardpad kernel: br-f6867570ddb6: port 3(vethaaa934c) entered forwarding state
Feb 19 23:37:59.708233 willardpad kernel: wlan0: deauthenticating from 74:da:88:bc:8e:5e by local choice (Reason: 3=DEAUTH_LEAVING)
Feb 19 23:38:00.041230 willardpad kernel: PM: suspend entry (s2idle)

And that’s it. No more logs. It seems like it just spontaneously and ungracefully restarted after that.
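
For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch. It scans saved kernel log lines for a "PM: suspend entry" that is never followed by the "PM: suspend exit" the kernel prints on a normal resume. The function name and the sample lines are made up for illustration, and it assumes the log text looks like the excerpt above. It only tells you that a boot ended mid-suspend, not why.

```python
import re

# "PM: suspend entry (...)" appears in the excerpt above; "PM: suspend exit"
# is the message the kernel logs when a resume completes normally.
ENTRY = re.compile(r"PM: suspend entry \((?P<mode>[^)]+)\)")
EXIT = re.compile(r"PM: suspend exit")


def unfinished_suspends(lines):
    """Return every suspend-entry line that has no suspend exit after it."""
    pending = None      # most recent entry still waiting for its exit
    unfinished = []
    for line in lines:
        if ENTRY.search(line):
            if pending is not None:
                # A second entry arrived before any exit was logged.
                unfinished.append(pending)
            pending = line.rstrip("\n")
        elif EXIT.search(line):
            pending = None
    if pending is not None:
        # The log ends while the machine was still suspended.
        unfinished.append(pending)
    return unfinished


# Hypothetical sample: one clean suspend/resume cycle, then one that never resumes.
sample = [
    "Feb 19 20:00:00.000000 host kernel: PM: suspend entry (s2idle)",
    "Feb 19 20:05:00.000000 host kernel: PM: suspend exit",
    "Feb 19 23:38:00.041230 host kernel: PM: suspend entry (s2idle)",
]
for hit in unfinished_suspends(sample):
    print(hit)
```

To run it against a real boot, the lines could come from something like `journalctl -k -b -1 -o short-precise` saved to a text file and read in with `open(...)`.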

Anyone seen this before? Can anyone offer suggestions or advice for how to debug this? I feel like I can’t trust my laptop not to do this in the middle of a work session (as I was) and lose a bunch of stuff.

I’m attaching the full log file as well, because there is more stuff that happens after this, plus something else happened earlier in the log file that looks similar, so perhaps this is innocuous. Very annoying!

Two is a coincidence, but three is a pattern. Literally just now, I closed the lid to go make coffee, came back, opened the lid, and actually managed to log in and watch my laptop spontaneously restart. The screen went black, then suddenly I was looking at the Framework logo.

Logs are attached.

Re-installed Arch with the stable kernel. Still getting the kernel oops (strictly speaking a WARNING trace) that I showed in the logs. Searching on Google turned up this thread from last year:

https://bbs.archlinux.org/viewtopic.php?id=300299

There was discussion of the oops being related to color profiles. I installed the one from the Arch Wiki, and also tried the one linked in the forum, but neither had any effect on the oops.

This time around, I followed all the troubleshooting instructions on the Arch Wiki page for the Framework Laptop 13 (AMD Ryzen 7040 Series). We’ll see if that keeps my system from restarting this way.

One interesting thing to note is that there’s a blurb about wakeup issues that seems similar to the issues I am experiencing.

Some users experience issues on wakeup from suspend / hibernate, as the framework has a chance of rebooting instead of waking up. The wifi module Mediatek MT7921 seems to cause this. As a workaround, turn off wifi before suspend and enable it after wakeup. The following systemd service automates this for both wifi and bluetooth for suspend / hibernate / suspend-then-hibernate:

But I have an Intel Wi-Fi card installed, not the MediaTek one:

01:00.0 Network controller: Intel Corporation Wi-Fi 6E(802.11ax) AX210/AX1675* 2x2 [Typhoon Peak] (rev 1a)

Regardless, I created the systemd service, just in case.

This looks like an amdgpu driver bug.
It is best reported here, as the AMD people read the reports there.

I found an issue where someone had the same oops as me, but different symptoms: https://gitlab.freedesktop.org/drm/amd/-/issues/3991#note_2794368

This morning, I woke up to find that my system had once again spontaneously restarted. Sometime between Feb 23 18:33 and five minutes ago, something happened that caused it to reboot. Once again, there is nothing in the logs, so this is confirmed to be an issue on 6.13.3-arch1-1 as well.

I am thinking that the kernel oops is unrelated to the issue I’m experiencing. No other user seems to be having the same symptoms (the restarts on resume) as me.

The spontaneous restarts might be some separate problem.
It might be worth finding out if the cause is the same as this one:

To determine whether the cause is the same, you would need to apply this patch to the Linux kernel, as described in that thread:
https://marc.info/?l=linux-i2c&m=168089982408414

Hey James, I compiled the kernel on Arch Linux with the patch you suggested. Quick sanity check, what exactly am I looking for here? Am I supposed to see an S5_RESET_STATUS in my dmesg every boot?

Yes, S5_RESET_STATUS should appear in the dmesg output after each boot.

Couldn’t get it working before work, will try again tonight. For some reason the patch just isn’t applying. It looks like maybe the file has changed since it was authored? The second function no longer appears to be static. (I’m looking at kernel 6.13.5 btw)

And by the way let me say thank you so much for putting so much time and effort into this issue, I definitely owe you a beer.

Look at the GitHub issue.
Someone has noticed that the patch might not apply, and has explained how to correct it.

Got it! After installing and cleanly rebooting, the most recent code is this one (which is probably normal):
[ 26.507883] S5_RESET_STATUS = 0x00080800

Gonna try to trigger the hard reset and see what the code becomes.
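
To compare the value from a clean boot with whatever shows up after a hard reset, a small Python sketch like the one below might help. It pulls the hex value out of the dmesg line and lists which bit positions are set, plus which bits differ between two readings. The function names are my own, and treating the register as 32 bits wide is an assumption. It deliberately does not name the bits; what each one means has to come from the patch discussion or AMD's documentation.

```python
import re

# Matches the line printed by the patched kernel, e.g.
# "[ 26.507883] S5_RESET_STATUS = 0x00080800"
STATUS_LINE = re.compile(r"S5_RESET_STATUS\s*=\s*(0x[0-9a-fA-F]+)")


def parse_status(dmesg_line):
    """Return the register value as an int, or None if the line does not match."""
    match = STATUS_LINE.search(dmesg_line)
    return int(match.group(1), 16) if match else None


def set_bits(value, width=32):
    """List the bit positions that are 1 (width=32 is an assumption)."""
    return [bit for bit in range(width) if (value >> bit) & 1]


def changed_bits(before, after):
    """List the bit positions that differ between two readings."""
    return set_bits(before ^ after)


clean = parse_status("[ 26.507883] S5_RESET_STATUS = 0x00080800")
print(hex(clean), set_bits(clean))  # prints: 0x80800 [11, 19]
```

After a hard reset, `changed_bits(clean, parse_status(new_line))` would show which bits flipped relative to the clean boot.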