FRWK16 - Random Crash then Reboots

:wave:t2: G’Day lovely people!

Okay - so I’ve been trying to hold back on this for a while but I think it’s time to post - loving the F16 except it isn’t that stable - I get a lot of crash-then-reboots.

I’m tech savvy. I know about drivers. I know about win11 and other OS’s. Bios + Firmwares, etc. I have a strict policy of “docker before native install”… if i can dockerize it, then it’s used in docker. I’m a minimal installer.

So my FW16 has no custom hardware or overclocking etc.

I think it’s related to SLEEP.

I close the lid and the unit goes to sleep. But a large number of times when I open it … it’s “black” for ages … then the reboot happens. And a part of me, dies. Lame … but it’s true.

In 2024 (for the last few days left), this poop still happens.

Now - here’s the kicker. I’d be OKAY with it happening (and everything getting lost, etc) if there was a nice and easy way to diagnose this. To HELP the FW team figure stuff out.

But I really don’t feel like there is.

I’m sure this is a “skill issue” on my side but i feel that if this is happening to me - it’s happening to others.

Please - is there something we can all do together to make this easy to diagnose?

Surely if there’s a number of low level steps, I can help consolidate those into a simple Windows App (i can code) and help FW get to the bottom of this.

SURE! it might be an AMD issue with their drivers. Prolly is. It’s always a driver issue. (Or a DNS issue if u’re still reading this). But at least we can try and come together and help fix this frustration.

I love my FW16. I just want it to never crash.

Help me Obi-Wan Kenobi. You’re my only hope.

-PK-

My Framework (running Windows 10 unfortunately) “crashes” every single time when it tries to sleep or hibernate. It will do the sleep or hibernate properly, but when woken up it cold boots instead.

I believe it has to do with the graphics driver. Even when I install the driver from Framework, it eventually stops working.

BUT FIRST. BIOS 3.05 is out. Do that.

If you know the cause of the crash (which is sleep), just don’t!


ignore that BTD crash, the AMD software crash, and the Windows “crash” (due to battery running out, and failure to resume from hibernation)

I believe at least a small portion of that is due to any virtual machine. Memory protection prohibits either sleep or hibernation, I forgot which one. At least that’s for Windows.

And supporting the old S-states is probably also a no-go thanks to AMD 7040-series chipset.

TIL: there’s a thing called “Reliability Monitor”.

Here’s mine:

i’m dying :frowning:

edit:

  • yes i’m on 3.05

AMD Ryzen 7 7840HS w/ Radeon 780M Graphics 3.80 GHz
32.0 GB (31.3 GB usable)
64-bit operating system, x64-based processor

Windows 11 Pro
Version 24H2
Installed on 26/11/2024
OS build 26100.2605
Windows Feature Experience Pack 1000.26100.36.0

We’re digging into this now and will come back with asks on the types of logs that would be helpful. Thanks for helping us debug it.

One part of this is that for the Windows 10 report, we expect that AMD would consider that an unsupported configuration for the chipset and graphics drivers. The Windows 11 24H2 report is helpful though.

Could you also share SleepStudy reports?

powercfg.exe /SleepStudy
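For anyone who wants to script this step, here is a minimal Python sketch that builds and runs the same command. The `/output` and `/duration` switches are standard `powercfg /sleepstudy` options; the function names, the report path, and the 7-day default are just illustrative choices, and the command needs an elevated prompt on Windows.

```python
import subprocess
import sys
from pathlib import Path


def sleepstudy_command(output: Path, days: int = 7) -> list[str]:
    """Build the powercfg invocation that writes a SleepStudy HTML report."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return [
        "powercfg.exe",
        "/sleepstudy",
        "/output", str(output),   # where the HTML report is written
        "/duration", str(days),   # how many days of history to analyse
    ]


def save_sleepstudy(output: Path, days: int = 7) -> Path:
    """Run powercfg (Windows only, elevated prompt) and return the report path."""
    if sys.platform != "win32":
        raise RuntimeError("powercfg is only available on Windows")
    subprocess.run(sleepstudy_command(output, days), check=True)
    return output


if __name__ == "__main__" and sys.platform == "win32":
    print(save_sleepstudy(Path("sleepstudy-report.html")))
```

The resulting HTML file can then be attached to an issue or shared via an external link.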

Sure! These forums only allow images, so here’s an external link: Sleep Study report · GitHub

That study report is interesting.
The red ones appear due to this device:
AMD Radeon™ RX 7700S (_SB.PCI0.GPP0.SWUS.SWDS.VGA)

So the GPU may be causing problems by not sleeping.
To hazard a guess:

  1. GPU stays on, while one thinks the laptop is sleeping.
  2. By the time you come to wake it up, the battery is dead, so the laptop has powered off.

One way to confirm this: when the FW16 crashes and then reboots, how much battery does it have left?
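To make that check a bit more concrete, here is a small Python sketch that turns two battery readings into a drain rate. The function name and the example numbers are hypothetical, not measurements from this thread; the point is only that a high rate while "asleep" would fit the guess that the GPU stays powered.

```python
def drain_rate_percent_per_hour(before: float, after: float, hours_asleep: float) -> float:
    """Battery percentage lost per hour between closing and opening the lid."""
    if hours_asleep <= 0:
        raise ValueError("hours_asleep must be positive")
    return (before - after) / hours_asleep


# Hypothetical example: lid closed at 80%, opened 8 hours later at 20%.
rate = drain_rate_percent_per_hour(before=80, after=20, hours_asleep=8)
print(f"{rate:.1f} % per hour")
```

Noting the percentage before closing the lid and right after the next wake (or reboot) is enough to fill in the three inputs.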

I usually keep the laptop on mains power.

There have been one or two times I’ve worked in bed and closed the lid when the low-power warning came up. I cannot remember if it crashed/rebooted after I tried to ‘wake it up’ with low battery.

Unless i’m in bed, I would put the machine onto mains power asap when that warning comes up. I think I sometimes do:

  • low battery warning shows
  • close lid
  • put it on mains power
  • come back later…

not sure if this helps. Next crash, i’ll have a look at the battery percentage after the reboot.

I will say, in the meantime, try not to put your computer to sleep

Mine came out completely clean, because I haven’t put my windows install to sleep for a while. (maybe 2+ weeks)

But let’s go (pushes power button)

Well. My 16 is stuck in sleep and cannot wake up. Sleep study showed nothing, either.
Will try with Win11.

This will put Windows in “not connected standby” (it will still be called connected standby, but it will be a deeper sleep state, compared to AC, which doesn’t actually stop the processor).

It’s more that Windows doesn’t know what to do with the device when it wakes it up. As a result it cold boots.
This is almost exclusively a driver/firmware issue. Windows’ flaky Modern Standby does not help, either.

Would you also be able to try:
powercfg /requests

Based on the SleepStudy, there are instances of the system not entering Standby, so something is holding it awake.

powercfg.exe /requests

DISPLAY:
None.

SYSTEM:
None.

AWAYMODE:
None.

EXECUTION:
None.

PERFBOOST:
None.

ACTIVELOCKSCREEN:
None.

Do i need to do it at some specific time? or just after a wakeup?
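Since the output has a regular shape (a category header followed by either `None.` or one or more request lines), it can be checked automatically. The following Python sketch is one way to do that; the function names are made up for illustration, and it assumes the English-language output format shown above.

```python
def parse_power_requests(text: str) -> dict[str, list[str]]:
    """Map each category (DISPLAY, SYSTEM, ...) to its outstanding request lines."""
    requests: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Category headers are upper-case words ending in a colon, e.g. "SYSTEM:".
        if line.endswith(":") and line[:-1].isupper():
            current = line[:-1]
            requests[current] = []
        elif current is not None and line != "None.":
            requests[current].append(line)
    return requests


def blockers(text: str) -> dict[str, list[str]]:
    """Return only the categories that currently hold a power request."""
    return {name: lines for name, lines in parse_power_requests(text).items() if lines}


sample = "DISPLAY:\nNone.\n\nSYSTEM:\nNone.\n\nEXECUTION:\nNone.\n"
print(blockers(sample) or "nothing is holding the system awake right now")
```

An empty result only describes the moment the command was run, so a request that appears just before sleep would not show up in it.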

Huh, I got W11 to lock up. It went to sleep, but didn’t wake up.

On second note, seems like I didn’t install the chipset driver. Will do that.

If you open Device Manager, are there devices with yellow triangle warnings?

No. The drivers are all installed, and they all work.

This is of the most interest. Because clearly they are not.

I remember changing something like “PCIe link state power management” to “none”, but that should not have affected anything.

More interestingly this continued to happen even after I updated the drivers. Previously they showed up with “Microsoft” drivers, now they showed up as AMDs …
ok the driver installer did nothing. I guess it’s Windows 11 kerfuffle?

what is on the PCIe port 8 and port 1? Port 1 seems to be a drive (ok cool, Samsung PM991a, or Western Digital SN 530).

Samsung is attached to PCIe port of bus 0 device 1 function 2
Killer is dev 2 func 2
WDC is dev 2 func 4
dev 8 func 1 is Radeon 780M, audio, and all the USB stuff…
8 func 2 is IPU, 8 func 3 is more USB 3.1, and USB 4.

ok that is all ACPI … stuff. Samsung is on PCI bus 1, Killer on bus 2, WDC bus 3. The GPU and USB 3 is bus 194 (prob chipset). So yeah. This makes no sense.


The symptom is … well, I put it to sleep, then I wake up to find it in the “unlock drive” page in BIOS, meaning it restarted (without cleanly shutting down)
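A restart without a clean shutdown is normally recorded by Windows in the System event log as a Kernel-Power event with ID 41, so the timestamps of these reboots can be pulled out after the fact. Here is a hedged Python sketch that queries them with `wevtutil`; the function names and the default of 10 events are illustrative choices, and the event only says that the reboot happened, not why.

```python
import subprocess
import sys

# XPath filter for "the system rebooted without cleanly shutting down first".
KERNEL_POWER_41 = (
    "*[System[Provider[@Name='Microsoft-Windows-Kernel-Power'] and (EventID=41)]]"
)


def unexpected_reboot_query(max_events: int = 10) -> list[str]:
    """Build a wevtutil command listing the newest Kernel-Power 41 events."""
    if max_events < 1:
        raise ValueError("max_events must be at least 1")
    return [
        "wevtutil", "qe", "System",
        f"/q:{KERNEL_POWER_41}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # human-readable output
    ]


def recent_unexpected_reboots(max_events: int = 10) -> str:
    """Run the query (Windows only) and return wevtutil's text output."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        unexpected_reboot_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__" and sys.platform == "win32":
    print(recent_unexpected_reboots())
```

Matching these timestamps against the sessions in a SleepStudy report would show which sleep sessions ended in a cold boot.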


I can do a clean install once I get back home, for more troubleshooting, I feel like this may be necessary to draw conclusions. I will be back to home (with all the drives and stuff) around Jan 6 or 7.

Thanks for helping debug this. A lot of the team is off for the holidays right now, so we’ll get more traction on this in early Jan.

I think one of the difficulties in trying to diagnose this problem is the lack of information after the fact. No Log messages to help, that sort of thing.
I have an EC CCD (able to see the EC serial console even when the laptop is switched off).
I am assuming the EC is what actually switches off the laptop, so it must be told to switch it off.
Unfortunately, the EC does not yet log a message to its console regarding the cause of the shutdown or the cause of a sleep wakeup.
For example, you see a message saying the EC is being asked to change state (on to off) but you don’t get to see what asked it to do that.

Also, I don’t have a table explaining what each Port80 code means. (Something output on the EC console).
If I get the time, I will be making some changes to my EC code (we have the source code luckily), so that it starts outputting these extra log messages so that we can get closer to the source of the problem and can understand what triggers the shutdown and wake.
My guess is that there is a bug in the ACPI state machine, and when indeterminate states are reached, the safest thing to do is switch off.

I requested this internally and we’ll post it publicly.

@nrp
Re Port80 codes, we would like that a lot. :slight_smile:

And this on my windows 10. Sort of the same, but also slightly different.

It’s the least I could do.

Merry Christmas and happy new year.

excellent work.

Hi Everyone,
It would be great to track this in Issues · FrameworkComputer/SoftwareFirmwareIssueTracker · GitHub
So we can keep updated through the lifecycle of this issue.

Since you have port80 codes, it may be useful for us to see where things are getting stuck. We can see if things are getting stuck in ACPI methods through some port80 logging.

Other general stuff we would like to know is:
Windows version
Driver version for AMD graphics and chipset drivers.
Any installed hardware such as SSD/expansion card devices/attached peripherals, etc.
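To keep reports consistent across people, the items above could be gathered into one small structure before posting. This Python sketch is only a suggestion: the `IssueReport` class and its field names are hypothetical, and the driver versions are left as placeholders because they are not stated in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class IssueReport:
    """The details requested above, in a form that is easy to paste into an issue."""
    windows_version: str
    graphics_driver: str
    chipset_driver: str
    hardware: list[str] = field(default_factory=list)

    def to_text(self) -> str:
        lines = [
            f"Windows version: {self.windows_version}",
            f"AMD graphics driver: {self.graphics_driver}",
            f"AMD chipset driver: {self.chipset_driver}",
            "Installed hardware:",
        ]
        # List each SSD, expansion card, or peripheral on its own line.
        lines += [f"- {item}" for item in self.hardware] or ["- (none listed)"]
        return "\n".join(lines)


report = IssueReport(
    windows_version="Windows 11 Pro 24H2 (OS build 26100.2605)",
    graphics_driver="unknown",   # placeholder, fill in from Device Manager
    chipset_driver="unknown",    # placeholder, fill in from Device Manager
    hardware=["AMD Radeon RX 7700S"],
)
print(report.to_text())
```

Everyone filling in the same fields makes it easier to spot which driver or hardware combinations the crashes have in common.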
