Linux FW16 still will not power on after suspend (Oct 2024)

I’m still having the long-standing issue where my FW16 (Batch 6, 7840HS, no GPU) will not power on after suspend. The power button blinks but pressing it has no effect, and the only way to recover is to hold the power button for several seconds until the system powers off.

I posted about this issue back in May when the machine was new(er): Linux FW16 will not power on after suspend. I was hoping new OS updates (I’m running openSUSE Tumbleweed) or BIOS updates (3.03 is still current) would fix it, but alas, I’m still unable to resume from suspend. I’m really getting tired of restarting every time I need to move locations, now multiple times a day.

For the record (all contained in original thread):

  • Suspend/resume works in Windows.
  • I’ve tried multiple Linux distros, all have the same issue. Specifically tried Ubuntu 24.04 LTS as documented.
  • Checked SSD firmware (running WD_BLACK SN850X 4000GB), it’s on the current version.
  • Tried another SSD (a 2gb WD_BLACK)
  • Tried swapping DIMMs (2x32GB), and tried one DIMM at a time in each socket.
  • Tried resetting BIOS to default
  • Tried changing BIOS PCIe Dynamic Link Power Management setting
  • Tried running amd_s2idle.py script (see results in original thread)

I’m hoping there is some new insight or idea. Can anyone confirm they have a working system with a similar config: any FW16 with any Linux OS that suspends and resumes reliably?

Hi,

I read that other thread.
I have a FW16, 64GB RAM, SN850X 4000GB.
Suspend works 100% fine each time on my system, so the only difference will be the kernel version being used, the firmware being used and the configuration you have.
To make any progress, start with a full mainboard reset and leave the laptop powered off overnight. Not set to suspend, actually powered off. (The EC will properly reset then)
Then switch on the next day. (The delay between switch off/on is to force the EC reset)
Suspend wake-up depends on the EC working correctly.
Next post the entire output from “dmesg” to pastebin.com, google drive or similar. Not a small part of it, the entire “dmesg”.
Next try the amd_s2idle.py script and post the output it writes to a log file to pastebin.com, google drive or similar.
We can then have a look, and find out why your system config differs from everyone else’s that works.
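
Before posting logs, it may also help to note which suspend mode the kernel is actually using, since that is one of the things that can differ between two otherwise similar machines. A minimal shell sketch, assuming the standard /sys/power/mem_sleep sysfs file is present; the function name active_sleep_mode is only an illustration, not part of any tool:

```shell
# Print the active suspend mode, i.e. the entry shown in square brackets
# in /sys/power/mem_sleep (for example "[s2idle] deep" prints "s2idle").
# The optional argument is a file to read instead of the real sysfs path.
active_sleep_mode() {
    file="${1:-/sys/power/mem_sleep}"
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$file"
}
```

Running "uname -r", then "active_sleep_mode", then "sudo dmesg > dmesg_full.txt" (the file name is arbitrary) right after boot should give the kernel version, the suspend mode, and the complete dmesg to upload.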

Thanks for the response! Just to verify on the EC reset, is there anything else to it other than powering off overnight? I generally already leave it powered off overnight. Any way to verify it has reset? I’ll work on the dmesg and amd_s2idle, thanks again!

I have a fairly similar configuration (FW16, batch 5, 7840HS, no GPU, 2 NVMes [Samsung 990 PRO, WD SN770M]) and, although the situation seemed to have improved recently (I have spent several months without a hiccup: [RESPONDED] Waking from suspend w/ lid closed - #61 by Xavier_G), my FW16 failed to wake up from suspend this morning.
Like @Jason_Rivard, I am interested in any way to inspect the state of the embedded controller.

As requested: dmesg:

amd_s2idle:

Additional babble: I use VirtualBox on this machine, and amd_s2idle complains about that in the form of tainted kernel warnings. So I uninstalled VirtualBox prior to capturing these outputs. The amd_s2idle run actually worked! Then suspending manually (via the KDE menu) also worked! Then it borked again and refused to wake up. On the next reboot it failed on the first suspend, and then failed again on the next attempt. So I have no idea if this means anything at all.

Thanks again for any insight!

It sounds like virtualbox is causing some suspend problems for you.
I don’t use virtualbox, so I cannot help you there.
I do use virt-manager with kvm/qemu and it works fine with suspend.

Try to ssh into the laptop, and then try suspend cycles. If one fails, see if the ssh session still works. Probably best to use an ethernet cable, rather than wifi, for the test. Less complexity involved.
Also see if the s2idle script managed to write the log file when it fails.
My guess is that it is coming out of suspend, but the amd gpu driver is failing to work so you have no display output.
Anyway, try various tests, to try to discount the GPU as the problem.
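
One way to run that test from the ssh session is sketched below. It writes a timestamped marker before and after each suspend, so even if the display stays dark you can tell afterwards whether the machine came back. This is only a sketch under assumptions: the log file name and the function names mark and suspend_cycles are made up for illustration, and it assumes "systemctl suspend" is how the machine is put to sleep and that you wake it by hand each time.

```shell
# Where the markers go; override by setting LOG before calling the functions.
LOG="${LOG:-$HOME/suspend_markers.log}"

# Append a timestamped marker and flush it to disk,
# so it survives a forced power-off.
mark() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$LOG"
    sync
}

# Run the given number of suspend cycles. Wake the laptop manually each time.
# A cycle with a "suspending" marker but no "resumed" marker never woke up.
suspend_cycles() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        mark "cycle $i: suspending"
        systemctl suspend
        sleep 30    # assumption: enough time to enter suspend and settle after wake
        mark "cycle $i: resumed"
        i=$((i + 1))
    done
}
```

For example, "suspend_cycles 10" from the ssh session. If the ssh session comes back and a "resumed" marker appears while the screen is still black, that would point at the display side rather than the wake-up itself.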

Unfortunately the nice log output you posted did not really help, because it was from a successful s2idle cycle. We need it from a failed s2idle cycle.

At least you have seen it work once or twice, so we know it can work.

I’m pretty sure VirtualBox is not the issue, as it still happens with VirtualBox removed. Sorry I wasn’t very clear in my post. Having VirtualBox installed seems to make it more likely to fail, but it fails with or without VirtualBox. Again, it also fails exactly the same way on Ubuntu 24.10 with no additional services installed or configured post-install.

With VirtualBox still uninstalled, I ran the s2idle script again and set it to do 10 cycles. After cycle 6, it failed. Here’s the log: s2idle_report-2024-10-12-2.txt · GitHub

When it fails, the power button blinks, but pressing it does nothing. It continues to blink and the display does not power on (nor does any other part of the laptop). Because the power button continues to blink, I don’t think it’s getting as far as the GPU driver, but I really don’t know. I’ll try the SSH test next time I have ethernet and a second machine to test with.

Thanks again for your insights!

I am looking for differences between yours and my FW16.
My tests are done without WiFi or Bluetooth.
Try with both of those switched off, e.g. airplane mode.

Can you do “nvme list” and post the output here?
It should list the nvme devices in your laptop and also their firmware versions.
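
If the full table is awkward to paste, the firmware revision can be pulled out on its own. A small sketch, assuming the default plain-text layout of "nvme list" where the firmware revision is the last column of each device row; the function name fw_revs is just for illustration:

```shell
# Read "nvme list" output on stdin and print "<node> <firmware revision>"
# for every device row (rows whose first column starts with /dev/nvme).
fw_revs() {
    awk '$1 ~ /^\/dev\/nvme/ { print $1, $NF }'
}
```

Usage would be something like "sudo nvme list | fw_revs".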

Sure! Here’s the nvme list:
Node          Generic     SN            Model                   Namespace  Usage              Format       FW Rev
/dev/nvme0n1  /dev/ng0n1  23446H800158  WD_BLACK SN850X 4000GB  0x1        4.00 TB / 4.00 TB  512 B + 0 B  624361WD

I’ll do some testing with airplane mode enabled.

Hi. 624361WD is the same firmware as I have, and it works fine with suspend and wake here.
So the problem is elsewhere.

Hi,

Looking at the logs more closely, I think your problems with suspend are hardware fault related. I think the EC hardware is faulty and not functioning the same as my FW16.
Note: I have no connection to Framework, apart from also owning a FW16.
I suggest you raise a problem with FW support and get a hardware replacement.
I suspect a replacement mainboard will help.

Looking at the logs more closely, I think your problems with suspend are hardware fault related.

Can you please share what part(s) of the logs stand out to you?

I am curious to see if there is anything that corresponds to my own (albeit more rare) problems with suspend/resume.

Thanks,
Corey

Another option other than dmesg might be the journalctl logs.

Can you take note of a time right before you suspend your laptop, and once it happens again, restart the laptop and do:

# journalctl --since "14:10:10" > journal_logs.txt

Replace 14:10:10 with the time right before you suspended the system. Upload that output to your git repo.
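
To avoid having to remember the time, it could be saved to a file just before suspending and read back after the reboot. A rough sketch; the file name and the function names save_suspend_time and dump_journal are made up here, and it assumes the journal is stored persistently so entries from before the reboot are still available:

```shell
# File that holds the saved timestamp; override by setting STAMP_FILE.
STAMP_FILE="${STAMP_FILE:-$HOME/last_suspend_time.txt}"

# Run right before suspending: save the current time in a format
# that journalctl --since accepts, and flush it to disk.
save_suspend_time() {
    date '+%Y-%m-%d %H:%M:%S' > "$STAMP_FILE"
    sync
}

# Run after the reboot (as root, like the command above):
# dump everything logged since the saved time.
dump_journal() {
    journalctl --since "$(cat "$STAMP_FILE")" > journal_logs.txt
}
```

So "save_suspend_time" before each suspend, and "dump_journal" once the laptop has been restarted after a failed wake.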

EDIT: The above may fail. In your dmesg logs, there are errors about journal corruption. You can check this with:

# journalctl --verify

Then, remove the corrupted journal files that it reports.

I will add, I had this issue MANY times with Ubuntu starting in 20.04. It affected my Dell and System76 laptops. (I have not tested with openSUSE.)

I am curious - you said you tested on many different distros, have you tested on Fedora?

Also, would you mind sharing an RPM listing from the OS? I think openSUSE uses the same rpm command as the EL-based distros:

# rpm -qa > rpms.txt

My thinking is that a specific app might be causing issues (similar to the VirtualBox theory… btw… use virt-manager, it’s much better :) )