Wondering if there are any new ideas about what might be causing this. I have the same issue that I believe only happens when I am plugged into an Aorus gaming box eGPU.
Hey everyone, quick update from me:
The instant reboot issues were gone for about a week and then returned. I needed to reach out to FW support and meanwhile got my mainboard and CPU replaced.
It was not the greatest customer experience, since it took me weeks to convince them that it was a hardware issue, but eventually they sent me replacement parts. Since replacing the board I have not had a single instant reboot in 1 1/2 months.
I’m running dual boot with Fedora 40 Plasma and Win 11 right now (with other distros/OSs you will not get support at all), and it finally feels like I have a reliable machine in my hands.
I hope you guys get your issues sorted out.
I’ve been dealing with a very similar issue on my Framework 13 AMD and wanted to share my experience so far. I’ve been going in circles with support, and it’s been incredibly frustrating. Here’s a rundown of my specs, symptoms, and only some of the things I’ve tried:
My Setup:
- Processor: AMD Ryzen 7 7840U
- RAM: 64GB DDR5-5600 (tested with both official Framework RAM and third-party modules)
- OS: Ubuntu 24.04 (clean stock install, also had the issue with 22.04)
Symptoms:
- Random Restarts: The laptop will suddenly restart without warning, sometimes under load like video playback and sometimes under minor load like web browsing, with no clear error logs or journalctl entries before the crash. Sometimes these crashes happen after days of normal usage. Sometimes it happens multiple times a day.
- No Prior Issues on Intel: I previously used an Intel mainboard without any stability problems, so this seems specific to the AMD setup.
What I’ve Tried:
- RAM Testing: I’ve done extensive testing with memtest86+, trying multiple configurations:
- Tested both third-party and official Framework RAM.
- Swapped RAM sticks between slots, used single sticks, and ran memtest86+ with each configuration. All tests passed without errors.
- Firmware, BIOS, and Driver Updates: I’ve made sure everything is up-to-date:
- Updated to the latest BIOS version recommended for AMD.
- Installed the latest kernel and AMD-specific drivers for Ubuntu.
- USB-C and Dock Testing:
- Tried multiple USB-C docks and hubs (Anker 555, Dell D6000, Vava, Caldigit TS3 Plus).
- Tested with and without external monitors attached, and even with no peripherals connected.
- Crashes still happen randomly, often while docked, but also at times with no external connections.
- Testing with Different Chargers:
- Switched between a 96W MacBook charger and a 61W model to see if charging affected stability, but the crashes continued on both.
- Miscellaneous Troubleshooting:
- I tried running with the bezel removed to rule out display issues.
- I also tried different USB-C expansion slots, as well as charging from various ports to see if it changed the crash frequency.
Framework Support suggested using a Live USB environment for extended testing to rule out software issues, but since this is my only laptop, it’s not practical to operate from a Live CD for multiple days. This seems like overkill since I’ve already done a fresh install. Given the level of testing I’ve already completed, I’m becoming more convinced this is a hardware or electrical issue with the AMD mainboard.
This issue has caused me to cancel my preorders for the upgraded display and a Framework 16. I can no longer recommend getting an AMD Framework laptop; this is by far the most unstable laptop I’ve ever had.
This sounds identical to what I’m going through. I’ve tried so many configurations of USB peripherals, but it doesn’t seem to make a difference; any type of charging massively increases the chance of a random crash. RAM sticks, SSD, WiFi card and expansion cards have all been ruled out as the cause. I was able to quickly replicate the issue running the Fedora 40 live USB.
Which batch did you get yours from? Seems like there’s a number of these popping up this month, I’m also convinced there’s a hardware/electrical fault in the mainboard and likely from the September batch.
Mine’s from Batch 2 and shipped late October of last year. Framework support has now asked me to provide more logs, current system details, and to reset the BIOS to defaults again. I’ve been dealing with this for about a year, and while the laptop had a lot more issues initially (like display artifacting with hard freeze crashes), some of those were actually resolved with BIOS or firmware updates over time. I put up with the random crashes for a while, hoping they’d eventually be fixed too, but I’ve recently lost work from these sudden restarts, and I’m starting to lose patience.
I’ve been a big supporter of Framework and have already upgraded my mainboard twice, but this AMD mainboard has been really unstable from the start. At this point, I’m also convinced this is a hardware or electrical issue specific to the board, and it’s not something I can troubleshoot much further on my end.
Let’s hope we can get some clarity from Framework soon—please keep me posted on any progress on your end!
Damn, it’s concerning that you have a much older AMD unit with the same problem; I was hoping it might just be a bad batch. I’ll have to seriously consider swapping to Intel in the hopes of improved stability.
It feels like they should swap out the mainboard at this point.
Same(-ish?) thing happening to me. It seems more likely to happen under load, but I’ve also had it happen before the OS has even loaded. I get graphics glitches on the screen and a total freeze, and any audio that was playing keeps repeating what is probably the last buffer sent to the device.
After running memtest86+ for multiple loops, it seems like there are some issues.
Interestingly, it doesn’t fail at the same addresses, nor on nearly every test loop. And it very rarely fails on the first loop. The errors seem to come in batches, and at worst I got over 300 errors.
I am going to try and swap my two RAM sticks around and run the memtest overnight.
Dealing with this issue also,
Support eventually replaced 1 stick of RAM they decided was bad. On the first boot afterwards I had a crash within 20 minutes.
I reached back out and haven’t heard back in a few days, wondering what will be tried next.
Mine is also an earlier batch
Ok, swapped the two RAM sticks around, still get the same addresses for errors, usually in the 0x000700000000-0x0007FFFFFFFF range.
Starting to look like a motherboard issue to me.
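To put the reported error range in perspective, here is a minimal sketch that converts the two physical addresses quoted above into GiB. The helper name `range_in_gib` is made up for illustration, and the sketch only does arithmetic on the addresses; how a physical address maps onto a particular stick or slot depends on the memory controller and is not modeled here.

```python
GIB = 1 << 30  # bytes per GiB


def range_in_gib(first_addr: int, last_addr: int) -> tuple[float, float, float]:
    """Return (start, end, size) in GiB for an inclusive physical address range."""
    start = first_addr / GIB
    end = (last_addr + 1) / GIB  # last_addr is inclusive, so add one byte
    return start, end, end - start


if __name__ == "__main__":
    # Range quoted in the post above.
    start, end, size = range_in_gib(0x000700000000, 0x0007FFFFFFFF)
    print(f"errors fall between {start:g} GiB and {end:g} GiB ({size:g} GiB window)")
```

So the failing addresses sit in the 4 GiB window between 28 GiB and 32 GiB of physical address space, and that window did not move when the sticks were swapped.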
Have you tried cleaning the contacts on the dimm with alcohol? DDR5 (especially as sodimm) is extremely sensitive to clean contacts.
@olenananas
I don’t know how AMD arranges RAM addresses vs RAM chips.
It might be interleaved. I.e. byte 0 on RAM chip1, byte 1 on RAM chip2, byte 2 on RAM chip1, byte 3 on RAM chip2.
I believe ECC is used in the L1, L2, and L3 caches and on the DDR5 RAM chip itself.
I don’t believe ECC is used on the data lines between the CPU and the RAM chip.
In which case, it might be useful to find out how AMD arranges the RAM addresses, then look at the offsets of the mismatching data in the errors and work out from there which RAM chip has the problem. Or, as you said, if the mismatch does not move with the RAM chip, it must be a motherboard problem.
I have always thought the ECC that also covers the data lines between chips is the way to go, but Intel and AMD only use that on Servers as far as I know.
The cause might also be RF interference. Do you have any mobile phones, washing machines, or other electrical items near the laptop when doing the memtest?
I found out how the AMD 7840 maps physical addresses to RAM chips. One RAM chip is channel A, the other is channel B. The least significant bit (LSB) of the 64-bit address selects channel A or B.
So, it is actually as I describe above. Byte 0 → Channel A, Byte 1 → Channel B etc.
Looking at the data from the pics you posted:
1e21 679d 38c6 5c6b < Expected
e121 679d c7c6 5c6b < Found
0741 6ba0 6cf4 4a2c < Expected
f841 6ba0 93f4 4a2c < Found
cc7b c3f8 0b75 4987 < Expected
337b c3f8 f475 4987 < Found
dcb4 2aee 0ad6 c2eb < Expected
23b4 2aee f5d6 c2eb < Found
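The errors are all when LSB=0, and OK when LSB=1. In each pair, the mismatching bytes are the first byte of the first and third groups (for example 1e vs e1 and 38 vs c7 in the first pair).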
So, this points to a fault on one of the DDR5 RAM channels.
So, if you swap the RAM chips, the errors should move from being on LSB=0 to bad on LSB=1.
But, looking at some values further down, the LSB=1 are bad.
E.g.
Via CPU 3 == LSB=0 bad
Via CPU 9 == LSB=1 bad.
There is not a big enough sample size there though.
I therefore suspect a Motherboard change is needed, as it might even be the CPU having the bug, but the CPU is soldered on the MB.
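The comparison of the Expected/Found pairs above can be reproduced with a short sketch. It XORs each pair byte by byte and reports which positions differ and which bits flipped. The function name `differing_bytes` is hypothetical, and the position index is simply the byte's place in the printed hex string (0 = leftmost); which memory address or channel that corresponds to depends on how memtest86+ prints the value and is an assumption the sketch does not settle.

```python
# Expected/Found pairs copied from the post above, with the spaces removed.
PAIRS = [
    ("1e21679d38c65c6b", "e121679dc7c65c6b"),
    ("07416ba06cf44a2c", "f8416ba093f44a2c"),
    ("cc7bc3f80b754987", "337bc3f8f4754987"),
    ("dcb42aee0ad6c2eb", "23b42aeef5d6c2eb"),
]


def differing_bytes(expected_hex: str, found_hex: str) -> list[tuple[int, int]]:
    """Return (position, xor_mask) for every byte that differs.

    position is the index in printed order (0 = leftmost byte);
    xor_mask has a 1 for every bit that flipped in that byte.
    """
    expected = bytes.fromhex(expected_hex)
    found = bytes.fromhex(found_hex)
    return [
        (i, e ^ f)
        for i, (e, f) in enumerate(zip(expected, found))
        if e != f
    ]


if __name__ == "__main__":
    for expected_hex, found_hex in PAIRS:
        diffs = differing_bytes(expected_hex, found_hex)
        summary = ", ".join(f"byte {pos}: xor 0x{mask:02x}" for pos, mask in diffs)
        print(f"{expected_hex} vs {found_hex} -> {summary}")
```

For these four pairs the mismatches land on printed positions 0 and 4 every time, and each bad byte is a full bit inversion of the expected one (XOR mask 0xff). With only four samples that is a hint rather than proof, but a pattern this regular is easier to track across more memtest86+ output than eyeballing the hex.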
I’m still going to run through the memtests with each stick installed on its own in each slot; that should give enough info to see whether this actually is a mobo problem or whether one of the sticks is faulty (which seems a bit unlikely, as I’d expect far more consistent failures in that case).
And frankly, if running electrical items near FW laptop blows up memtest, the laptop (or any laptop where that is enough to cause memory failures) is not fit for market, and I’m pretty sure Framework engineers would agree.
Solid approach.
It’s less that running them disturbs a properly working one and more that it causes a marginal one to fail.
I am really looking forward to lpcamm, ddr5 sodimms are running pretty close to the limit even if they are working, those traces are just too damn long.
I found out how the AMD 7840 maps physical addresses to RAM chips. One RAM chip is channel A, the other is channel B. The least significant bit (LSB) of the 64-bit address selects channel A or B.
Oo, very nice. I can check this with the rest of my results. Thanks a bunch!
It’s less that running them disturbs a properly working one and more that it causes a marginal one to fail.
Yea, this seems fair.
I am really looking forward to lpcamm, ddr5 sodimms are running pretty close to the limit even if they are working, those traces are just too damn long.
I wish consumer CPUs/memory would get ECC by default. Doubt that’s going to happen anytime soon.
Same, especially since there aren’t a whole lot of parts missing. Unfortunately it seems like you can’t have ecc in lpcamm and you can’t have lpddr in camm (that can have ecc) so that sucks. From the 2 I personally would go with lpddr but sucks that you have to choose.
Quick update: Framework support recently asked me to reset the mainboard by pressing the center button 10 times, which I did, but the random restarts are still happening. I also tried removing the internal Wi-Fi adapter and using a USB dongle, but that didn’t help either.
Although I’ve already run memtest86+ overnight twice without any issues, I might try it again given all the recent focus on RAM testing. However, since Framework said they’d escalate this issue, I expected more concrete feedback on the logs I submitted—instead, I’ve only received more general troubleshooting steps.
At this point, I’ve exhausted most options on my end, and everything still suggests a hardware or mainboard fault. Hoping they’ll take the next step soon. I’ll keep everyone posted.
Hi, I think we have a discussion of the same issue on this post. I went through a roughly two-month diagnostic process with Framework support, leading to a mainboard replacement.
The point is that it could be almost anything other than the mainboard, so you have to test everything else before concluding that the mainboard is faulty.
To help, I gave a summary of the steps I went through here.
Hoping this helps you find the cause faster and shortens the support process.