Spontaneous and ungraceful restarts upon closing the lid

Update for this thread.

I installed Fedora 41 KDE (a full install, not a live image) on a USB stick that I am booting from. It has power-profiles-daemon (PPD) set to balanced. I have not encountered a reboot event in a few days of sleep/wake.

A few days later, I noticed there were amd-ucode updates on Arch, so I installed those, and have not experienced another reboot since (fingers crossed). My PPD is also set to balanced here.
I am now running 6.13.7-arch1-1.
From the pacman logs:
[2025-03-19T18:45:23-0400] [ALPM] upgraded amd-ucode (20250311.b69d4b74-2 → 20250311.b69d4b74-3)
[2025-03-19T18:45:23-0400] [ALPM] upgraded linux-firmware-whence (20250311.b69d4b74-2 → 20250311.b69d4b74-3)
[2025-03-19T18:45:24-0400] [ALPM] upgraded linux-firmware (20250311.b69d4b74-2 → 20250311.b69d4b74-3)

I have been daily-driving my Framework again for a few weeks now, and no reboots have happened (fingers crossed).

2 Likes

I get the restart and 0x08000800.

6.15.1-gentoo-x86_64
AMD Ryzen AI 9 HX 370 w/ Radeon 890M
WCN785x wifi
linux-firmware-20250410
bios-version: 03.03

Wow, this bug is persistent. Can you open a ticket with Framework about this? If I can repro I’ll do the same. Please encourage everyone else to do the same.

Might need to look at what changed between my current kernel and this new one. I am not seeing this yet on 6.14.9.

Wait, your error (0x08000800, sync flood) is different from what I was getting (0x00800800, triple fault). Interesting…

Have a look here too: FRWK16 - Random Crash then Reboots - #53 by James3

DEFINITELY open a ticket with Framework.

CC @Matt_Hartley it appears the Ryzen AI 300 series boards also suffer from the Sync Flood issue. Please tell me that Framework engineers are aware of this and actively looking into it? This means every AMD board Framework has released suffers from this problem.

Thanks for tagging me, @Will_Nilges

Which is good, as Fedora is tested and validated to work (barring regressions of course).

Sounds like the latter happened on Arch, and you are no longer seeing the issue on Arch as well?

@Jesse_Darnley can you spearhead making sure both Fedora and Arch (fully updated) are playing nicely? Other details in this thread as well.

Hey Matt, thanks for the quick response :smiley:

Yep, sometime around 6.13.7, one of those upgrades seemed to solve my Triple Fault issue. I have not had this problem since. I am currently running linux 6.14.10-arch1-1, core/linux-firmware 20250508.788aadc8-2, and core/amd-ucode 20250508.788aadc8-2.

However, I got tagged in this post the other day. I asked if they could check the S5_RESET_STATUS so we can be sure if it’s FTR or Triple Fault but they haven’t replied. I am guessing it is Sync Flood because of the version of Linux (6.14.9) they are running. I haven’t seen anybody overtly reporting the Triple Fault 0x00800800 yet, but the Ryzen AI 300 series boards are confirmed to be experiencing the Sync Flood 0x08000800.
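For anyone comparing codes, here is a minimal Python sketch that shows which bits differ between the S5_RESET_STATUS values reported in this thread. The labels are only the ones that posters and the kernel log lines quoted in this thread attach to those values. The names KNOWN_BITS, set_bits and describe are made up for illustration, and the table is not a complete decoding of the register.

```python
# Labels cover only the values reported in this thread. The register has
# more bits than this, and bit 11 (set in every value here) is left unlabeled.
KNOWN_BITS = {
    19: "software wrote 0x6 to reset control register 0xCF9",
    23: "triple fault (as described by posters in this thread)",
    27: "uncorrected error caused a data fabric sync flood event",
}


def set_bits(value):
    """Return the positions of all set bits in a 32-bit value, lowest first."""
    return [bit for bit in range(32) if value & (1 << bit)]


def describe(value):
    """Label each set bit, marking bits this sketch has no label for."""
    return [
        "bit {}: {}".format(bit, KNOWN_BITS.get(bit, "unlabeled here"))
        for bit in set_bits(value)
    ]


if __name__ == "__main__":
    for status in (0x08000800, 0x00800800, 0x00080800):
        print("{:#010x}".format(status), describe(status))
```

The three values share bit 11 and differ only in a single higher bit, which is why they are easy to mix up when reading them quickly.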

I appreciate the attention on this. This Sync Flood issue is particularly nasty because it seems to affect any and all boards on any and all distros/kernels. This thread, as well as the others replied in, and the GitHub issues all have these reports. It seems like it’s hardware/firmware related. I fear that if it is not root caused by Framework engineering, whatever AMD board is released next will also have this issue. I’d suggest this thread be read and watched as the main Sync Flood thread.

There’s a lot of good info from people debugging here, as mentioned by sydney in the other thread: FW16 Freeze then Reboot (FTR) S5_RESET_STATUS = 0x08000800 <- Sync Flood. · Issue #41 · FrameworkComputer/SoftwareFirmwareIssueTracker · GitHub

This thread is about the Triple Fault, which seems to be resolved, but I won’t be certain until I get to 6.15 and use it for a few weeks.

1 Like

Howdy guys,

Just like Matt mentioned, this issue is going to be my first focus of the day. Currently putting a fresh install of Arch on a Framework Laptop 13 Ryzen AI 300 series. Thanks a ton for sharing the GitHub issue and related threads, I’m gonna be poring over those while I work on this.

I do share the sentiment most are sharing that it will still take a considerable amount of time to really narrow down the issue and give our engineers and AMD the information they need to act on this. We really appreciate everybody’s patience in the meantime. I hope we’re able to provide an update soon.

3 Likes

Good news everyone!

I’ve finally found a way to force this crash to happen essentially on demand with my hardware. I’m hoping we’re able to use that as a basis to find a fix for this issue, and with luck I’ll be able to deliver more information on this soon.

4 Likes

@Jesse_Darnley
Please do elaborate.
Which S5_RESET_STATUS are you able to reproduce?
I would like to be able to reproduce it on demand also.

2 Likes

@Jesse_Darnley , any updates on this? This issue has been quite a thorn in our side.

Unfortunately I don’t come bearing the best of news. We’ve been working in-house and with manufacturing partners trying to find consistent and reliable replication methods and haven’t been successful. The method I’d been leveraging to consistently crash my own AI 300 series Framework Laptop 13 was power adapter related and has been fixed with subsequent firmware updates.

We have gotten a couple of reports of the same sort of issue from Windows users, and I’ve been having some back and forths with the firmware team about it. We’re doing everything we reasonably can and narrowing down the variables as much as we can. I still can’t provide an ETA, but work is ongoing.

2 Likes

I have a theory on a possible cause.

After a timeout period, the psu will drop to 5V while in standby.

When powering on, it will switch back to 20V (36V for FW16). During the switch it forces the psu to 2.5 Watts. This switch causes the resume to fail.

I think the solution is for the EC to complete its switch to 20V before attempting to resume.

A good test for this would be:

  1. resume with psu plugged in fails

  2. resume with no psu attached works.

Can anyone seeing the problem try these two tests?

The correct solution though is a hardware change, placing larger capacitors in the power chain on the laptop side so they can store more power to absorb the short power glitches from the psu and cpu.

I have seen the psu glitch down to 2.5W and the CPU glitch up to 450W. These glitches are very short, e.g. 1-10ms.
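For a sense of scale, here is a rough back-of-envelope sketch of the energy in a spike of that size. It assumes the glitch is a flat rectangular pulse at the stated power, which is a simplification, and the function name glitch_energy_joules is made up for illustration. The 450W and 1-10ms figures are the ones quoted above, not new measurements.

```python
def glitch_energy_joules(power_watts, duration_ms):
    """Energy = power x time, with the duration given in milliseconds."""
    return power_watts * duration_ms / 1000


if __name__ == "__main__":
    # Shortest and longest glitch durations mentioned in the post.
    for duration_ms in (1, 10):
        energy = glitch_energy_joules(450, duration_ms)
        print("450 W for {} ms -> {} J".format(duration_ms, energy))
```

Under that assumption the spike works out to somewhere between roughly half a joule and a few joules, which is the amount any extra capacitance would have to help cover.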

1 Like

Hey, I might be having the same, or at least a very similar issue. I don’t get freezes or weird graphical glitches; the first thing I see after opening the lid is a black screen and the laptop starting to boot. I’m using a FW13 AMD Ryzen 7040, currently running Arch Linux with kernel version 6.16.3-zen1-1-zen.

Can you post the output from journalctl --grep 0xCF9, or just journalctl -b0 --grep 0xCF9 at the next boot after this happens, please?

Edit: you may need to run it as superuser; I don’t know your setup.

It just happened, but sorry, neither of those commands returned any output. The logs end at kernel: PM: suspend entry (s2idle)

That’s odd. I don’t know whether the zen-kernels don’t carry this [1] patch, or what’s going on. You can read more about it here [2].

N.B: the value of this register will of course only be available upon reboot, i.e. not after a power cycle.

This is what it’s supposed to look like.

Aug 27 03:28:45 fw kernel: x86/amd: Previous system reset reason [0x00080800]: software wrote 0x6 to reset control register 0xCF9
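If you would rather not rely on a single search string, here is a small Python sketch that pulls the value and the description out of any line of that shape, whatever the reason text says. It assumes the "Previous system reset reason [0x...]: ..." format shown above. The names RESET_LINE and parse_reset_reason are made up for illustration.

```python
import re

# Matches the kernel message format quoted above, with either a journalctl
# or a dmesg style prefix in front of it.
RESET_LINE = re.compile(
    r"Previous system reset reason \[(0x[0-9a-fA-F]+)\]: (.*)"
)


def parse_reset_reason(line):
    """Return (value, description) for a matching line, or None otherwise."""
    match = RESET_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), match.group(2).strip()


if __name__ == "__main__":
    sample = (
        "Aug 27 03:28:45 fw kernel: x86/amd: Previous system reset reason "
        "[0x00080800]: software wrote 0x6 to reset control register 0xCF9"
    )
    print(parse_reset_reason(sample))
```

Matching on "Previous system reset reason" rather than on 0xCF9 also catches reasons whose text never mentions that register, so a search for 0xCF9 alone can come back empty even when a reason was logged.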

[1] Making sure you're not a bot!
[2] https://www.phoronix.com/news/AMD-Report-Previous-Reset-Cause

edit: The value of this register is only printed at the next reboot, after the issue has occurred.

1 Like

For the past two days the issue has been happening much more often (not sure if anything changed), so I did some digging, and this time I think I was able to find the relevant logs. Turns out I needed to run dmesg:

x86/amd: Previous system reset reason [0x08000800]: an uncorrected error caused a data fabric sync flood event

1 Like

I’m still getting this intermittently, but often enough to be mildly irritating. The message I get in dmesg after it’s happened (at least the last time) is:

[ 2.012074] x86/amd: Previous system reset reason [0x00080800]: software wrote 0x6 to reset control register 0xCF9

1 Like

Hi,

Can anyone reproduce the problem with the psu unplugged?
I.e. when you wish to suspend:

  1. unplug psu
  2. suspend laptop
  3. resume laptop
  4. plug psu back in.

I think the problem might be due to a power glitch and removing the psu removes a source of possible glitches.

1 Like