I have exactly the same HW (but the 2TB nvme model) and am facing the same issue as you described. FW customer support advised me to do a “mainboard reset” as described here [1].
After resetting the mainboard I had a run of about a whole month in which the machine did not reset itself after resuming from suspend a single time (usually it takes about 3-6 suspend cycles to trigger the issue here).
A few days ago, my lucky streak came to an end that was, considering the issue, truly abrupt, and the issue started happening again.
I have had this issue from the day I received the laptop until today and have tried a lot of different things; so far, unfortunately, nothing appears to have solved it.
You mentioned that the kernel may still have that “platform quirk: setting simple suspend” issue; I haven’t come across this one.
Have you tried running the system with the “amd-pmf”-module blacklisted as mentioned here [2]?
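Before and after blacklisting, it may help to confirm whether the module is actually loaded. The sketch below parses text in the format of /proc/modules. The kernel lists module names with underscores, so “amd-pmf” shows up as “amd_pmf”. The function name and the sample lines (sizes and addresses) are made up for illustration; they are not taken from this machine.

```python
def is_module_loaded(proc_modules_text, name):
    """Return True if `name` appears as a module in /proc/modules-style text."""
    # The kernel reports hyphens in module names as underscores.
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == wanted:
            return True
    return False


# Illustrative sample only; real sizes and addresses will differ.
SAMPLE = """\
amd_pmf 98304 0 - Live 0x0000000000000000
nvme 61440 4 - Live 0x0000000000000000
"""

if __name__ == "__main__":
    print(is_module_loaded(SAMPLE, "amd-pmf"))
```

On a real system the input would be the contents of /proc/modules, for example `is_module_loaded(open("/proc/modules").read(), "amd-pmf")`.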
@sydney: Thanks for the links. I haven’t tried a reset of the motherboard; I will try that after testing removal of the amd-pmf module, which seems more promising.
I was investigating kernel updates because the [1] thread seems to end with a promising conclusion:
“Got it. I installed -hwe 6.8.0-20 and suspend/resume now works every time!”.
What does your ‘journalctl -b | grep -i suspend’ or dmesg tell you after a failure to resume? Does it also point to an NVMe issue? I’m thinking of trying another SSD to see if this is WD-specific.
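For anyone who saves the journal to a file first, the same filtering can be sketched in Python. This is only a rough equivalent of the grep above, extended to also catch “nvme”; the function name and pattern list are my own choices, and the sample lines are copied from logs posted in this thread.

```python
import re

# Patterns of interest, based on the messages discussed in this thread.
PATTERNS = [r"suspend", r"nvme"]


def filter_log(log_text, patterns=PATTERNS):
    """Return lines matching any pattern, ignoring case (similar to grep -i -E)."""
    regex = re.compile("|".join(patterns), re.IGNORECASE)
    return [line for line in log_text.splitlines() if regex.search(line)]


SAMPLE = """\
Oct 14 07:23:57 fw systemd-logind[1217]: The system will suspend now!
Oct 14 07:23:57 fw kernel: PM: suspend entry (s2idle)
Oct 14 15:40:19 fw kernel: OOM killer disabled.
Oct 14 15:40:19 fw kernel: nvme nvme0: 16/0/0 default/read/poll queues
"""

if __name__ == "__main__":
    for line in filter_log(SAMPLE):
        print(line)
```

One thing to keep in mind: after a forced reboot, ‘journalctl -b’ covers the new boot, while ‘journalctl -b -1’ selects the previous one, provided the journal is stored persistently.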
Yes, you are correct. That kind of illustrates how desperate I am at this point.
I’m willing to try everything, even things I don’t think will solve this issue myself, like blacklisting this module, but you know… so few people know this low-level platform management stuff anyway…
On a successful cycle it shows this:
Oct 14 07:00:19 fw systemd[1]: logrotate.service: Deactivated successfully.
Oct 14 07:00:19 fw systemd[1]: Finished Logrotate Service.
Oct 14 07:04:34 fw rtkit-daemon[1852]: Warning: Reached maximum concurrent process limit for user '1000', denying request.
Oct 14 07:05:27 fw rtkit-daemon[1852]: Warning: Reached maximum concurrent process limit for user '1000', denying request.
Oct 14 07:23:37 fw bluetoothd[1158]: src/profile.c:ext_io_disconnected() Unable to get io data for Hands-Free Voice gateway: getpeername: Transport endpoint is not connected (107)
Oct 14 07:23:38 fw dbus-daemon[1193]: [system] Rejected send message, 0 matched rules; type="method_return", sender=":1.34" (uid=1000 pid=1859 comm="/nix/store/y8rr19f18wq3pccz8rr65r0ksc>
Oct 14 07:23:57 fw systemd-logind[1217]: The system will suspend now!
Oct 14 07:23:57 fw systemd[1]: Starting Pre-Sleep Actions...
Oct 14 07:23:57 fw systemd[1]: pre-sleep.service: Deactivated successfully.
Oct 14 07:23:57 fw systemd[1]: Finished Pre-Sleep Actions.
Oct 14 07:23:57 fw systemd[1]: Reached target Sleep.
Oct 14 07:23:57 fw systemd[1]: Starting System Suspend...
Oct 14 07:23:57 fw systemd-sleep[1410582]: Successfully froze unit 'user.slice'.
Oct 14 07:23:57 fw systemd-sleep[1410582]: Performing sleep operation 'suspend'...
Oct 14 07:23:57 fw kernel: PM: suspend entry (s2idle)
Oct 14 07:23:57 fw kernel: Filesystems sync: 0.001 seconds
Oct 14 15:40:19 fw kernel: Freezing user space processes
Oct 14 15:40:19 fw kernel: Freezing user space processes completed (elapsed 0.002 seconds)
Oct 14 15:40:19 fw kernel: OOM killer disabled.
Oct 14 15:40:19 fw kernel: Freezing remaining freezable tasks
Oct 14 15:40:19 fw kernel: Freezing remaining freezable tasks completed (elapsed 0.599 seconds)
Oct 14 15:40:19 fw kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Oct 14 15:40:19 fw kernel: wlan0: deauthenticating from 04:f0:21:36:61:e3 by local choice (Reason: 3=DEAUTH_LEAVING)
Oct 14 15:40:19 fw kernel: ACPI: EC: interrupt blocked
Oct 14 15:40:19 fw kernel: ACPI: EC: interrupt unblocked
Oct 14 15:40:19 fw kernel: [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
Oct 14 15:40:19 fw kernel: amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
Oct 14 15:40:19 fw kernel: nvme nvme0: 16/0/0 default/read/poll queues
After a crash, the last lines in the log are always:
Oct 11 06:28:11 fw systemd[1]: Finished Refresh fwupd metadata and update motd.
Oct 11 06:35:26 fw bluetoothd[1046]: src/profile.c:ext_io_disconnected() Unable to get io data for Hands-Free Voice gateway: getpeername: Transport endpoint is not connected (107)
Oct 11 06:35:26 fw dbus-daemon[1081]: [system] Rejected send message, 0 matched rules; type="method_return", sender=":1.31" (uid=1000 pid=1615 comm="/nix/store/y8rr19f18wq3pccz8rr65r0ksc>
Oct 11 06:35:53 fw systemd-logind[1108]: The system will suspend now!
Oct 11 06:35:53 fw systemd[1]: Starting Pre-Sleep Actions...
Oct 11 06:35:53 fw systemd[1]: pre-sleep.service: Deactivated successfully.
Oct 11 06:35:53 fw systemd[1]: Finished Pre-Sleep Actions.
Oct 11 06:35:53 fw systemd[1]: Reached target Sleep.
Oct 11 06:35:53 fw systemd[1]: Starting System Suspend...
Oct 11 06:35:53 fw systemd-sleep[189140]: Successfully froze unit 'user.slice'.
Oct 11 06:35:53 fw systemd-sleep[189140]: Performing sleep operation 'suspend'...
Oct 11 06:35:53 fw kernel: PM: suspend entry (s2idle)
Note that whenever the system has crashed, the log lacks this line, which a successful cycle prints right after “PM: suspend entry (s2idle)”:
Filesystems sync: 0.001 seconds
I didn’t notice the “kernel: nvme 0000:02:00.0: platform quirk: setting simple suspend” message until you pointed it out…
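The comparison between a successful and a crashed cycle could be automated roughly like this. It is a sketch that assumes the missing “Filesystems sync” line is a reliable marker of a crash, which so far is only an observation on my machine; the function name is made up, and the sample lines come from the two logs above.

```python
def find_incomplete_suspends(log_text):
    """Return 'PM: suspend entry' lines that are not followed by a
    'Filesystems sync:' line before the next suspend entry or the end of the log."""
    incomplete = []
    pending = None  # latest suspend-entry line still waiting for its sync line
    for line in log_text.splitlines():
        if "PM: suspend entry" in line:
            if pending is not None:
                incomplete.append(pending)
            pending = line
        elif "Filesystems sync:" in line:
            pending = None
    if pending is not None:
        incomplete.append(pending)
    return incomplete


SAMPLE = """\
Oct 11 06:35:53 fw systemd-sleep[189140]: Performing sleep operation 'suspend'...
Oct 11 06:35:53 fw kernel: PM: suspend entry (s2idle)
Oct 14 07:23:57 fw systemd-sleep[1410582]: Performing sleep operation 'suspend'...
Oct 14 07:23:57 fw kernel: PM: suspend entry (s2idle)
Oct 14 07:23:57 fw kernel: Filesystems sync: 0.001 seconds
Oct 14 15:40:19 fw kernel: Freezing user space processes
"""

if __name__ == "__main__":
    for line in find_incomplete_suspends(SAMPLE):
        print(line)
```

Fed with a journal export spanning several boots, this would list only the suspend attempts that never reached the filesystem sync step.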
Hi @sydney: it looks like there may be different causes…
I had a week without issues; after my last failure to resume, searching ‘journalctl’ for ‘fail’ gives me this:
sudo journalctl -b | grep fail
Oct 17 07:38:00 jb-fw kernel: ACPI: _OSC evaluation for CPUs failed, trying _PDC
Oct 17 07:38:00 jb-fw (udev-worker)[534]: nvme0n1: Process ‘/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/nvme0n1’ failed with exit code 1.
Oct 17 07:38:00 jb-fw (udev-worker)[541]: nvme0n1p2: Process ‘/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/nvme0n1p2’ failed with exit code 1.
Oct 17 07:38:00 jb-fw (udev-worker)[534]: nvme0n1p1: Process ‘/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/nvme0n1p1’ failed with exit code 1.
Oct 17 07:38:00 jb-fw (udev-worker)[522]: nvme0n1p3: Process ‘/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/nvme0n1p3’ failed with exit code 1.
Oct 17 07:38:17 jb-fw systemd[1]: Starting grub-initrd-fallback.service - GRUB failed boot detection…
Oct 17 07:38:17 jb-fw bluetoothd[1310]: profiles/sap/server.c:sap_server_register() Sap driver initialization failed.
Oct 17 07:38:17 jb-fw systemd[1]: Finished grub-initrd-fallback.service - GRUB failed boot detection.
Oct 17 07:38:17 jb-fw gnome-remote-de[1316]: Init TPM credentials failed because Failed to initialize transmission interface context: tcti:IO failure, using GKeyFile as fallback
Oct 17 07:38:17 jb-fw boltd[1593]: [d2733804-901e-domain0 ] udev: failed to determine if uid is stable: unknown NHI PCI id ‘0x1668’
Oct 17 07:38:17 jb-fw boltd[1593]: [d2733804-911e-domain1 ] udev: failed to determine if uid is stable: unknown NHI PCI id ‘0x1669’
Oct 17 07:38:19 jb-fw NetworkManager[1462]: [1729165099.0515] failed to open /run/network/ifstate
Oct 17 07:38:20 jb-fw /usr/libexec/gdm-x-session[1767]: xf86EnableIO: failed to enable I/O ports 0000-03ff (Operation not permitted)
Oct 17 07:38:21 jb-fw /usr/libexec/gdm-x-session[1859]: dbus-daemon[1859]: [session uid=128 pid=1859] Activated service ‘org.freedesktop.systemd1’ failed: Process org.freedesktop.systemd1 exited with status 1
Oct 17 07:38:22 jb-fw /usr/libexec/gdm-x-session[1859]: dbus-daemon[1859]: [session uid=128 pid=1859] Activated service ‘org.freedesktop.systemd1’ failed: Process org.freedesktop.systemd1 exited with status 1
Oct 17 07:38:23 jb-fw /usr/libexec/gdm-x-session[1859]: dbus-daemon[1859]: [session uid=128 pid=1859] Activated service ‘org.freedesktop.systemd1’ failed: Process org.freedesktop.systemd1 exited with status 1
Oct 17 07:38:23 jb-fw systemd[1]: Started update-notifier-download.timer - Download data for packages that failed at package install time.
Oct 17 07:38:39 jb-fw systemd-xdg-autostart-generator[2425]: /home/jb/.config/autostart/slack.desktop: stat() failed, ignoring: No such file or directory
Oct 17 07:38:40 jb-fw /usr/libexec/gdm-x-session[2523]: _XSERVTransSocketUNIXCreateListener: …SocketCreateListener() failed
Oct 17 07:38:40 jb-fw /usr/libexec/gdm-x-session[1767]: (EE) AMDGPU(0): failed to set mode: Permission denied
Oct 17 07:38:40 jb-fw gsd-power[2092]: Release of light sensors failed: GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: Not Authorized: Sensor claim not allowed
Oct 17 07:38:40 jb-fw /usr/libexec/gdm-x-session[2523]: xf86EnableIO: failed to enable I/O ports 0000-03ff (Operation not permitted)
Oct 17 07:43:08 jb-fw systemd[1]: Starting update-notifier-download.service - Download data for packages that failed at package install time…
Oct 17 07:43:08 jb-fw systemd[1]: Finished update-notifier-download.service - Download data for packages that failed at package install time.
Oct 17 07:55:18 jb-fw google-chrome.desktop[4772]: [4765:4794:1017/075518.004892:ERROR:connection_factory_impl.cc(483)] ConnectionHandler failed with net error: -2
Oct 17 07:55:18 jb-fw google-chrome.desktop[4772]: [4765:4794:1017/075518.005519:ERROR:connection_factory_impl.cc(483)] ConnectionHandler failed with net error: -2
Oct 17 07:55:18 jb-fw google-chrome.desktop[4772]: [4812:4812:1017/075518.040037:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 1 times!
Oct 17 08:08:34 jb-fw systemd[1]: Starting grub-initrd-fallback.service - GRUB failed boot detection…
Oct 17 08:08:34 jb-fw systemd[1]: Finished grub-initrd-fallback.service - GRUB failed boot detection.
So, not the NVMe quirk. I have updated to 6.8.0-47-generic (from 6.8.0-45) following Ubuntu’s suggested updates; I’ll let you know if that seems to change anything! I am of course out of my depth here.
After a few lucky months (of admittedly not using the device much), I have experienced this once more, twice in the same day.
Currently on the 6.11.5-200 kernel.
I’ll probably upgrade to Fedora 41 in a week. Will let y’all know if it stops happening as a result.