Ok, I’ve been putting the system to sleep using the gnome extension. On the second system now and will try that and update here.
Nice,
can you please check which suspend mode your system is using? They say the gnome-extension suspends and then hibernates. That means nobody will care if suspend is s2idle in that case because it will hibernate after a couple of minutes. S2idle is kind of working but has terrible battery drain, which doesn’t matter much once you hibernate the system.
That’s why I am saying the gnome extension is great: it helps with configuring hibernation, but it does not solve the original issue with the deep suspend mode, which was working fine in the older BIOS. I wish I could go back to my 3.17 BIOS; I would never touch it again.
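In case it helps with checking the suspend mode: the kernel lists the available modes in /sys/power/mem_sleep and puts the active one in square brackets, e.g. `s2idle [deep]`. Below is a minimal sketch that pulls out the bracketed entry. The helper name `active_choice` is made up for this example.

```shell
# Print the bracketed (active) entry from a sysfs selector line such as
# "s2idle [deep]". Prints nothing if no entry is bracketed.
active_choice() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Only touch the real file when it is readable (it may be absent in a container).
if [ -r /sys/power/mem_sleep ]; then
    echo "active suspend mode: $(active_choice "$(cat /sys/power/mem_sleep)")"
fi
```

If this prints `s2idle`, the system is not using deep sleep even if the extension makes suspend look fine.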
Tested on the second system:
- Suspend from gnome-menu: wakes ok
- Suspend with power button: wakes ok
- Suspend via command sudo systemctl suspend: wakes ok
My logind.conf file has lid switch and idle action set to suspend-then-hibernate, but I don’t believe that that should apply for these cases.
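For comparing settings between machines, here is a rough sketch that prints a few keys from logind.conf. It assumes the file lives at /etc/systemd/logind.conf and it ignores any drop-ins under /etc/systemd/logind.conf.d/, so treat the result as a hint rather than the effective configuration. The helper name `logind_value` is made up for this example.

```shell
# Print the value of a key from a logind.conf-style file.
# Commented-out lines (starting with '#') are skipped; if the key is set
# more than once, the last value in the file is printed.
logind_value() {
    # $1 = key name, $2 = path to the file
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

conf=/etc/systemd/logind.conf
if [ -r "$conf" ]; then
    for key in HandleLidSwitch HandlePowerKey IdleAction; do
        printf '%s=%s\n' "$key" "$(logind_value "$key" "$conf")"
    done
fi
```

An empty value just means the key is not set in that file, so the built-in default applies.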
Do you see anything in the logs when the system goes into the loop, or do they get lost because they have not been written to disk before the loop starts?
For completeness, the two systems have all usb-c or a mix of usb-c cards, sk-hynix drives (gold in one, platinum in the other), one has a mediatek wifi card, the other the intel ax210, one has 64gb ram, the other has 16gb ram, one is on arch, the other manjaro, both are using luks encryption with swap partitions rather than swap files.
What is your /proc/cmdline, and what are your system specs? It would be great if we could figure out why my systems have avoided the issue, so that you and others could go back to making use of deep sleep.
Sorry, took a while as I had to de-brick my laptop again.
This is a Fedora 40 with all updates installed. LUKS encryption is used as well.
My /proc/cmdline is:
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.11.4-201.fc40.x86_64 root=UUID=cb5de268-5529-445a-864a-d8ea07209cc2 ro rootflags=subvol=root rd.luks.uuid=luks-cdb51624-dd59-4f6c-bfab-9811089b2e0f rhgb nvme.noacpi=1 mem_sleep_default=deep acpi_osi=!Windows 2020
I don’t know where this acpi_osi parameter comes from but it is not present on my debian installation on a separate SSD.
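To compare individual parameters between installations without eyeballing the whole line, something like the sketch below might do. The function name `cmdline_param` is made up for this example. It splits on whitespace only, so a quoted value containing a space gets cut at the space.

```shell
# Print the value of every occurrence of NAME=value in a kernel command line.
# Splitting is on whitespace only; quoted values with spaces are not handled.
cmdline_param() (
    # $1 = parameter name, $2 = full command line string
    set -f                      # keep the shell from glob-expanding words
    for word in $2; do
        case $word in
            "$1"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
)

if [ -r /proc/cmdline ]; then
    echo "mem_sleep_default: $(cmdline_param mem_sleep_default "$(cat /proc/cmdline)")"
fi
```

Running it for `acpi_osi` on both the Fedora and the Debian installation would show quickly whether the parameter is really only set on one of them.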
lspci:
00:00.0 Host bridge: Intel Corporation 11th Gen Core Processor Host Bridge/DRAM Registers (rev 01)
00:02.0 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01)
00:04.0 Signal processing controller: Intel Corporation TigerLake-LP Dynamic Tuning Processor Participant (rev 01)
00:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe Controller (rev 01)
00:07.0 PCI bridge: Intel Corporation Tiger Lake-LP Thunderbolt 4 PCI Express Root Port #0 (rev 01)
00:07.1 PCI bridge: Intel Corporation Tiger Lake-LP Thunderbolt 4 PCI Express Root Port #1 (rev 01)
00:07.2 PCI bridge: Intel Corporation Tiger Lake-LP Thunderbolt 4 PCI Express Root Port #2 (rev 01)
00:07.3 PCI bridge: Intel Corporation Tiger Lake-LP Thunderbolt 4 PCI Express Root Port #3 (rev 01)
00:08.0 System peripheral: Intel Corporation GNA Scoring Accelerator module (rev 01)
00:0a.0 Signal processing controller: Intel Corporation Tigerlake Telemetry Aggregator Driver (rev 01)
00:0d.0 USB controller: Intel Corporation Tiger Lake-LP Thunderbolt 4 USB Controller (rev 01)
00:0d.2 USB controller: Intel Corporation Tiger Lake-LP Thunderbolt 4 NHI #0 (rev 01)
00:0d.3 USB controller: Intel Corporation Tiger Lake-LP Thunderbolt 4 NHI #1 (rev 01)
00:12.0 Serial controller: Intel Corporation Tiger Lake-LP Integrated Sensor Hub (rev 20)
00:14.0 USB controller: Intel Corporation Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller (rev 20)
00:14.2 RAM memory: Intel Corporation Tiger Lake-LP Shared SRAM (rev 20)
00:15.0 Serial bus controller: Intel Corporation Tiger Lake-LP Serial IO I2C Controller #0 (rev 20)
00:15.1 Serial bus controller: Intel Corporation Tiger Lake-LP Serial IO I2C Controller #1 (rev 20)
00:15.3 Serial bus controller: Intel Corporation Tiger Lake-LP Serial IO I2C Controller #3 (rev 20)
00:16.0 Communication controller: Intel Corporation Tiger Lake-LP Management Engine Interface (rev 20)
00:1d.0 PCI bridge: Intel Corporation Tiger Lake-LP PCI Express Root Port #10 (rev 20)
00:1f.0 ISA bridge: Intel Corporation Tiger Lake-LP LPC Controller (rev 20)
00:1f.3 Audio device: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller (rev 20)
00:1f.4 SMBus: Intel Corporation Tiger Lake-LP SMBus Controller (rev 20)
00:1f.5 Serial bus controller: Intel Corporation Tiger Lake-LP SPI Controller (rev 20)
01:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. NV1 NVMe SSD SM2263XT (DRAM-less) (rev 03)
aa:00.0 Network controller: Intel Corporation Wi-Fi 6E(802.11ax) AX210/AX1675* 2x2 [Typhoon Peak] (rev 1a)
I am not sure what to look for in the logs. I have pasted the journal over here: https://anonpaste.org/?4d471e0b4d46602a#21vSqTMEtGGKRdQgxw8SmsabbfKeMqv9oWn4Vd274wXo
It says it enters deep at the end, right before the time is set back.
new paste as there is more weird stuff. It looks like it is trying to wake up but can’t and is falling back to sleep
https://anonpaste.org/?f21e67a03810709a#3tGESkvZVv2aDunWogHFpvi7fNgaunei8nS7BecDVpCD
I will take a look at the log. I apologize for the stupid question, but when this hits, the only recourse is a mainboard reset? Holding the power button for 30+ seconds or doing a cold boot (power off, unplug for two minutes, plug back in, power back on) both don’t work? What about adding the kernel parameter sysrq_always_enabled=1 and trying to issue REISUB when the machine does not come back? Still no dice?
Off to peruse the log now…
Thank you!
The full cycle goes like this:
- I boot with deep enabled via grub cmdline
- In Gnome I just press the powerbutton and the system goes to sleep (powerbutton is flashing)
- I then use the powerbutton again to wake it up
- It tries to come out of sleep (screen flashing once and the light on the powerbutton turns on)
- after maybe 5 seconds - the system turns off again as it seems to go back into sleep for 2 seconds (powerbutton flashing again)
- and then it tries to wake up again and stays in this cycle forever. At my first encounter I left it in this state for maybe 10 minutes.
- When it is in that state, the fan starts spinning and is slowly ramping up until it is at full speed constantly
- I can press the powerbutton for 30 seconds and the laptop turns fully off
- I have tried letting it stay off for a few minutes
- When I try to start it again it goes immediately into this sleep cycle
- The only way to get it out of that is to remove all batteries (main and RTC) and start the laptop again
- BIOS is going through the full initial cycle with RAM training and then starts booting normally
- This is all happening on battery - there is nothing connected although I have tried it with power connected as well
I also tried a second SSD and installed a fresh Debian stable which showed exactly the same behaviour.
I am not familiar with this sysrq parameter but will try it out now and I am unfamiliar with this “REISUB” thing you mention. Can you maybe point me to a document explaining what that is?
Thanks
That’s awful. The sysrq is a long shot, all I have managed to use it for is to somewhat blindly force a restart if the machine gets into a semi-hung state. Given what you note above I’m not sure that it will help, but it won’t hurt to try. Basically you hold ctrl-alt, hit the printscreen key, release that key while continuing to hold down ctrl-alt, and type reisub - each of those keys actually does a different thing, but it’s getting beyond my skill level. More here: Keyboard shortcuts - ArchWiki
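For reference, here is what each letter of REISUB asks the kernel to do, written as a small lookup so it can be printed in a terminal. The descriptions are paraphrased from the kernel's SysRq documentation, which gives the combination as Alt+SysRq plus the command key. The function name `sysrq_action` is made up for this example.

```shell
# Describe what a single REISUB key requests from the kernel.
sysrq_action() {
    case $1 in
        r) echo "take the keyboard out of raw mode" ;;
        e) echo "send SIGTERM to all processes except init" ;;
        i) echo "send SIGKILL to all processes except init" ;;
        s) echo "sync all mounted filesystems" ;;
        u) echo "remount all filesystems read-only" ;;
        b) echo "reboot immediately" ;;
        *) echo "not part of REISUB" ;;
    esac
}

for key in r e i s u b; do
    printf '%s: %s\n' "$key" "$(sysrq_action "$key")"
done
```

The order matters: processes get a chance to exit and data is written out before the final reboot.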
Yeah, as you suspected, the parameter didn’t help. I have additionally removed the acpi_osi parameter, but the effect is the same. No dice!
But thank you again! I have learned something new today!
Nothing stands out to me in the log. I can see the machine going into suspend, the only thing after that is when you have to reset the mainboard and reboot. Still digging.
new logs here: https://anonpaste.org/?49ae11a16203c370#D1zhnsBjvR4Pw8eAQvMhghKnqQ8GuwiRMVMXuhXzPsdu
I had a couple more cycles while it was in sleep this time. What I think is weird is that the clock falls back to Oct. 11 2:00
Why that date? and why does this happen before I remove the battery? Usually systems fall back to something like Jan 1st 1970 when the batteries are removed.
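One way to see whether the hardware clock itself has jumped, or only the system clock, might be to compare the two right after a normal boot. This is a rough sketch that assumes the RTC is exposed as /sys/class/rtc/rtc0; the helper name `clock_skew` is made up for this example.

```shell
# Print the signed difference in seconds between two epoch timestamps.
clock_skew() {
    echo $(( $1 - $2 ))
}

# Assumption: the hardware clock is rtc0; adjust if the system has another one.
rtc=/sys/class/rtc/rtc0/since_epoch
if [ -r "$rtc" ]; then
    echo "RTC minus system clock: $(clock_skew "$(cat "$rtc")" "$(date +%s)") s"
fi
```

A difference of a few seconds is nothing to worry about; a difference of days would point at the RTC having been reset.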
Haven’t gotten to the logs yet. testing a few things.
What is the output of the following:
cat /sys/power/pm_test
cat /sys/power/disk
cat /sys/power/state
$ cat /sys/power/pm_test
[none] core processors platform devices freezer
$ cat /sys/power/disk
[platform] shutdown reboot suspend test_resume
$ cat /sys/power/state
freeze mem disk
Ok. I’m currently hacking through this on one of my 11th gen setups, so we can try to step through it together. I do have to step out in a bit, so we may have to pick this back up another day, unfortunately. Basically I am following the steps in “Debugging hibernation and suspend” from the Linux kernel documentation, but instead of echoing disk to /sys/power/state I am echoing mem.
Stop me if I am being too step-by-step and you need me to speed up. I am muddling my way through, mind you.
In a terminal su to root
then
mount -t debugfs none /sys/kernel/debug
cat /sys/kernel/debug/suspend_stats
Then step through echoing freezer|devices|platform|processors|core to pm_test
e.g.
echo freezer > /sys/power/pm_test
echo platform > /sys/power/disk
echo mem > /sys/power/state
When the machine comes back (fingers crossed) I am running
cat /sys/kernel/debug/suspend_stats
On one of the tests (platform or processors), the machine took 5 or more minutes to “come back”, so be patient. If the machine does not come back, note what you had echoed into pm_test and we can see what we can figure out. I have gone through this before when testing hibernation, but not suspend. We may have to go through trying the different options for /sys/power/disk. Again, I am a bit out of my depth here but trying to work through it with you. Hopefully we find something.
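The stepping described above could be wrapped in a small loop, roughly like the sketch below. The variables `POWER_DIR` and `DRY_RUN` and the function name `run_pm_tests` are made up for this example. It defaults to a dry run that only prints what it would do; a real run needs root and a mounted debugfs.

```shell
# Step through the pm_test levels in the order used above.
# POWER_DIR and DRY_RUN are assumptions added for this sketch so the loop
# can be rehearsed without suspending anything.
POWER_DIR=${POWER_DIR:-/sys/power}
DRY_RUN=${DRY_RUN:-1}

run_pm_tests() {
    for level in freezer devices platform processors core; do
        echo "== pm_test level: $level"
        if [ "$DRY_RUN" = 1 ]; then
            echo "would run: echo $level > $POWER_DIR/pm_test"
            echo "would run: echo mem > $POWER_DIR/state"
            continue
        fi
        echo "$level" > "$POWER_DIR/pm_test" || return 1
        sync    # flush pending writes in case the machine does not come back
        echo mem > "$POWER_DIR/state" || return 1
        cat /sys/kernel/debug/suspend_stats
    done
    # Put pm_test back to normal after a real run.
    if [ "$DRY_RUN" != 1 ]; then
        echo none > "$POWER_DIR/pm_test"
    fi
}
```

Calling `run_pm_tests` as-is prints the plan; setting `DRY_RUN=0` as root performs the real test suspends. If the machine hangs, the last level printed is the one to report back.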
Well, that was kind of uneventful. All tests passed!
The result is here: https://anonpaste.org/?60ffb0b2ecdad0a0#CG94MjcnToHtrfYsSr6QNoxyURh6cmYBh5kedqp3FawJ
There are a few open values for /sys/power/state left. I can test them as well, but that has to wait until tomorrow.