Frequent Freezes & Crashes on AMD 7040 with External SSDs

Hello Framework Community,

It’s been a long two weeks for me. First, my Intel 12th gen motherboard died, and support couldn’t identify the cause. Now, I’m experiencing freezes and crashes when booting into Ubuntu installations from my two SSDs in external enclosures on a borrowed FW 13 AMD 7040.

It seems like the issues didn’t start immediately—maybe after about a week, but I might not have noticed right away.

Currently, at some point after switching the laptop on, I get I/O errors. Then my system either crashes or becomes unresponsive. None of this happened on the Intel 12th gen, and both SSDs work fine from the same enclosures when booted from other PCs or from the NVMe slot on the 7040.

OS release: 24.04.1-Ubuntu 6.11.0-26-generic
Framework product: AMD 7040 Series

Has anyone encountered these issues?

I have already bought a 7040 as a replacement, but I will probably have to return it now.

Best regards,
Alexander

Could it be because my drives are encrypted with LVM2?

Hi Alexander,

I’m running the exact same setup on a daily basis:
Ubuntu 24 on an external SSD (WD P51 Gamedrive),
LVM2+LUKS (as set up by the installer) on a 7840.
So far, everything has been fine for the 1.5 years I’ve been using the system.

Update: Sorry, I forgot that I switched this installation to ZFS for testing, instead of LVM2. So the setup is not fully identical.

Nonetheless:
Can you still access your journal (journalctl) and check it for suspicious errors?
(You can also hook up the drive to a working machine and inspect the journalctl logs there.)

Did you already try different ports on the laptop?

Best,
Frostregen

I’ve found a lot of external NVME → USB enclosures have disconnect issues. The only ones I can say for certain that I haven’t had issues with are these:

The OWC enclosure is expensive but it “just works” and I haven’t had any problems with it. I use the Sabrent enclosure if I need to temporarily throw an NVME drive into an enclosure to do something quickly with it since it’s toolless.

But I’ve gone through many RTL9210B, JMicron, and other enclosures that just seem to have random problems and I’ve never figured out why. My assumption is that a lot of 2280 NVME drives overwhelm the power that the enclosure is able to provide and they just die. But I’ve also had them disconnect when the drive is seemingly idle, so it could be something else.

On another note, if you just want an external drive, and don’t want to bother with your own enclosures, Sabrent Rocket Nano drives work extremely well. I have several of them and haven’t had any issues at all.

P.S. I’ve had these problems with both Framework and non-Framework laptops, so it’s not really Framework specific.

Edit: Note that if you do end up using or switching to a Thunderbolt enclosure for your boot drive, remember to add this to the Linux kernel command line in GRUB (or whatever bootloader you’re using): thunderbolt.host_reset=0. If you don’t, the system won’t be able to find the drive after your initramfs loads…
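For anyone unsure how to add that parameter on Ubuntu with GRUB, here is a rough sketch. The helper name `add_kernel_param` is made up for illustration, and it assumes the usual Ubuntu layout where the kernel command line lives in the `GRUB_CMDLINE_LINUX_DEFAULT="..."` line of `/etc/default/grub`. It also assumes GNU sed. Back up the file before touching it.

```shell
# Hypothetical helper: append a kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT="..." line of a GRUB defaults file,
# unless that line already contains it.
add_kernel_param() {
  grub_file=$1
  param=$2

  # Nothing to do if the parameter is already on the command line.
  if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$grub_file"; then
    return 0
  fi

  # Insert the parameter just before the closing double quote (GNU sed -i).
  sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$grub_file"
}
```

Typical use would be something like `sudo cp /etc/default/grub /etc/default/grub.bak`, then running the function as root against `/etc/default/grub` with `thunderbolt.host_reset=0`, then `sudo update-grub` and a reboot. If you use a different bootloader, the file and the regeneration step will differ.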


Hi Frostregen,

What controller does your external drive have? Mine is: Realtek Semiconductor Corp. RTL9210 M.2 NVME Adapter. Maybe that is the issue?

Will do, however it usually crashes and then doesn’t read or write at all.

Curious, do you know what controller those have?

Nope, works fine everywhere including another FW 13. Just 7040 is cursed.

I think at this point maybe I should try with an AI model. Buying a new enclosure will about cover the price for me anyway.

I tried to find out which controller the drive is using, but without success. I also noticed it is actually called “WD Black P50 Game Drive SSD 500GB”, seems like I misremembered P51/P50 earlier.

The idea was to run it in the 7040 machine until it crashes. Now hopefully some information has been written to the journal.
You can plug it into another machine where it runs fine now (do not boot from it, just hook it up and mount it).
You can then inspect the journalctl like this (adjust the directory to your path):
sudo journalctl -D /mnt/externaldrive/var/log/journal/ -b-0
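If the journal is long, a small filter can help narrow it down to storage-related kernel messages. This is only a sketch: the function name `filter_storage_events` is made up, and the patterns are my guess at typical messages (USB disconnects, I/O errors, xhci and UAS error-handler lines), so adjust them to whatever you actually see.

```shell
# Hypothetical filter: read journal lines on stdin and keep only those
# that look like USB/storage trouble. The pattern list is an assumption,
# not an exhaustive set of kernel messages.
filter_storage_events() {
  grep -Ei 'usb disconnect|i/o error|xhci_hcd|uas_eh'
}
```

You could pipe the command above into it, for example: `sudo journalctl -D /mnt/externaldrive/var/log/journal/ -b-0 --no-pager | filter_storage_events`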

Best,
Frostregen


forgot to answer that.

Yes, I did try, and the result is the same.

As you suggested, I ran the machine until the issue occurred. It worked for 3 hours longer than I expected, but in the end I still got I/O errors. Unfortunately, it crashed while I was away, so I couldn’t see the error messages on the login screen. The last logs of any relevance are:

Jul 03 18:44:08 spirit-laptop kernel: usb 5-1: USB disconnect, device number 2
Jul 03 18:44:08 spirit-laptop kernel: usb 5-1.4: USB disconnect, device number 3

After those, the system kept logging normally for a few more minutes.

This is really unfortunate because I already bought the same model and now will have to return it. But it seems that other AMD mainboards are also affected: AMD Framework and NVMe SSD Enclosure Compatibility Investigation - #97 by Craig_Hesling