Framework 13 AMD crashing on different OSs

Hey everyone,
I’ve owned a Framework 13 AMD for two months (and have loved it so far…). Two weeks ago it started crashing randomly, multiple times a day, while in use.

Specs:
AMD Ryzen 7 7840U
DDR5-5600 - 64GB (2 x 32GB)
WD_BLACK SN850X NVMe - M.2 2280 - 2TB

OSs:
I have Windows 10, Fedora 40 KDE and Pop!_OS 22.04 installed side by side, and the crashes happen on all of them.

Symptoms:
Generally the OSs work fine, apart from the usual suspend issues on Linux (sometimes it reboots instead of waking up, or shows a black screen after waking up).
The crashes mostly happen while watching videos or gaming. Temperatures are completely fine: below 60°C, with no fans going crazy or similar. Most of the time I run the energy-saving power profiles anyway, even when connected to power.

Crashing on Windows means:

  • flickering screen with artifacts (as with a faulty graphics driver)
  • instant reboot
  • no event log entries for the crash except “Kernel-Power 41”, and no warnings or similar beforehand

Crashing on Linux is consistent between Pop and Fedora and means:

  • screen goes black, sometimes with choppy audio artifacts
  • instant reboot
  • no indicator in journalctl, nothing suspicious in the last log entries before the crashes happen
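
One way to double-check the journal of the boot that ended in a crash is sketched below. It assumes the journal is stored persistently (otherwise earlier boots are not kept), and the function names are made up for illustration. An abrupt reset can also lose the last few entries that were never written to disk, so an empty result does not prove nothing was logged.

```python
import subprocess
from typing import List


def previous_boot_tail_cmd(lines: int = 200, boot_offset: int = -1) -> List[str]:
    """Build a journalctl command for the last `lines` entries of an earlier boot."""
    return [
        "journalctl",
        "-b", str(boot_offset),  # -1 = the boot before the current one
        "-n", str(lines),        # only the last N entries
        "-o", "short-precise",   # timestamps with microsecond precision
        "--no-pager",            # plain output, suitable for capturing
    ]


def previous_boot_tail(lines: int = 200) -> str:
    """Run the command and return its output (or the error text if it failed)."""
    result = subprocess.run(
        previous_boot_tail_cmd(lines),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout or result.stderr
```

Calling `print(previous_boot_tail())` right after a crash-reboot shows what the system managed to log before it went down.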

What I did already:

  • ran memtester multiple times on 50GB of the 64GB RAM (the rest was already allocated); no issues were found
  • installed the latest drivers on all OSs and updated the firmware/chipset drivers and the BIOS; the issue is still present

Any ideas how I can narrow it down? I fear it’s a hardware issue but I’m wondering how I would debug it properly.

Many thanks in advance!

Welcome to the community!

It does sound like a hardware issue, but it may well be a correctable one. I would suggest trying the method outlined here first; if that fails, scroll up a little to see my original post on how to force it. That technique seems to solve a number of hardware-related problems, large and small.

If the problem persists, you’ll likely need to talk to Framework’s support team.

Good luck! :crossed_fingers: And please let us know how/whether that works for you.

Interesting, I have a non-Framework laptop using the same AMD platform, and I’m observing the same kind of problems. RAM is OK according to memtest86+, there is nothing in the logs, just an insta-reboot, vaguely reminiscent of what happens when some kind of overvoltage/overcurrent protection kicks in.

My system is a GPD Win with a 7840HS inside, so it is a different machine, but I’m hoping against all hope that the fix for the Framework could apply to my system, too.

However, I have so far been unable to pin this on any particular graphics usage or CPU load (mine can reboot even when idle).

The only idea I have right now, and it is a finicky one, is to set up continuous monitoring of all available sensors and immediately send the readings over the network, so that hopefully a spike in temperature or voltage can be seen as close to the crash as possible.
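
The monitoring idea could be sketched roughly as follows, assuming a Linux system that exposes its sensors under /sys/class/hwmon. The receiver address, port 9999, the 0.2 s interval and the function names are placeholders, not anything taken from a real setup. Values are sent raw and unscaled (hwmon reports temperatures in millidegrees Celsius and voltages in millivolts).

```python
import json
import socket
import time
from pathlib import Path
from typing import Dict, Optional

HWMON_ROOT = Path("/sys/class/hwmon")  # standard Linux hwmon sysfs location


def read_sensors(root: Path = HWMON_ROOT) -> Dict[str, int]:
    """Return {"<hwmonN>:<chip name>/<input file>": raw value} for every readable input."""
    readings = {}
    for chip in sorted(root.glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else "unknown"
        for sensor in sorted(chip.glob("*_input")):
            try:
                value = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # some inputs fail to read or are not numeric; skip them
            readings[f"{chip.name}:{chip_name}/{sensor.name}"] = value
    return readings


def stream(host: str, port: int, interval: float = 0.2,
           root: Path = HWMON_ROOT, count: Optional[int] = None) -> None:
    """Send one JSON datagram per interval; runs forever unless `count` is given."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sent = 0
        while count is None or sent < count:
            payload = {"t": time.time(), "sensors": read_sensors(root)}
            # UDP is fire-and-forget, so nothing waits on the receiver
            sock.sendto(json.dumps(payload).encode(), (host, port))
            sent += 1
            time.sleep(interval)
```

On the machine that crashes, something like `stream("192.168.1.50", 9999)` (hypothetical address of the logging machine) would run until interrupted; on the other end, any UDP listener that appends the datagrams to a file will do. There is no guarantee that the last sample before a reset actually makes it onto the wire.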

I have tried setting up netconsole for the Linux kernel, but there wasn’t anything useful in the kernel log at the time of the reboot. No warnings, no nothing. So it is very likely a firmware- or hardware-level protection kicking in.
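
For reference, the receiving end of a netconsole setup can be as small as the sketch below. It assumes netconsole’s default target port 6666 and simply prefixes each datagram with the receiver’s wall-clock time, so that kernel messages can be lined up with other logs such as sensor readings. The function names are made up for illustration.

```python
import socket
import sys
import time
from typing import Optional, TextIO


def format_packet(data: bytes, received_at: float) -> str:
    """Prefix one netconsole datagram with the receiver's local time (ms precision)."""
    text = data.decode("utf-8", errors="replace").rstrip("\n")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(received_at))
    millis = int((received_at % 1) * 1000)
    return f"{stamp}.{millis:03d} {text}"


def receive(port: int = 6666, out: TextIO = sys.stdout,
            max_packets: Optional[int] = None, bind_addr: str = "0.0.0.0") -> None:
    """Listen for kernel log datagrams and write them out as they arrive."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_addr, port))
        seen = 0
        while max_packets is None or seen < max_packets:
            data, _sender = sock.recvfrom(65535)
            out.write(format_packet(data, time.time()) + "\n")
            out.flush()  # do not hold lines back in a buffer
            seen += 1
```

Running `receive()` on a second machine and comparing the last timestamp with the time of the reboot at least shows how long the kernel stayed silent before the reset.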