It’s been a long two weeks for me. First, my Intel 12th gen motherboard died, and support couldn’t identify the cause. Now, I’m experiencing freezes and crashes when booting into Ubuntu installations from my two SSDs in external enclosures on a borrowed FW 13 AMD 7040.
It seems like the issues didn’t start immediately—maybe after about a week, but I might not have noticed right away.
Currently, at some point after switching the laptop on, I get I/O errors. Then my system either crashes or becomes unresponsive. None of this happened on the Intel 12th gen, and both SSDs work fine from the same enclosures when booted from other PCs or from the NVMe slot on the 7040.
OS release: 24.04.1-Ubuntu 6.11.0-26-generic
Framework product: AMD 7040 Series
Has anyone encountered these issues?
I bought myself a 7040 as a replacement, but I will probably have to return it now.
I’m driving the exact same setup on a daily basis:
Ubuntu24 on external SSD (WD P51 Gamedrive),
LVM2+LUKS (from using the installer) on 7840.
So far, everything has been fine for the 1.5 years I've been using the system.
Update: Sorry, I forgot that I went on to test ZFS with this installation instead of LVM2, so the setup is not fully identical.
Nonetheless:
Can you still read your journal with journalctl and check it for suspicious errors?
(You can also hook up the drive to a working machine and inspect the journal there.)
Did you already try different ports on the laptop?
I’ve found that a lot of external NVMe → USB enclosures have disconnect issues. The only ones I can say for certain that I haven’t had issues with are these:
OWC Express 1M2 Portable NVMe Thunderbolt (USB-C) SSD USB4 - This supports Thunderbolt/USB4 with USB fallback, so it works on both Thunderbolt and regular USB ports. If you plug it in via Thunderbolt, it shows up as a /dev/nvme* disk, and if you plug it into USB, it shows up as a /dev/sd* disk.
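To check which of the two paths an enclosure is actually using, the transport column of lsblk is usually enough. This is a minimal sketch; the function name list_disk_transports is made up for illustration, and the expectation that TRAN reads "nvme" over Thunderbolt/USB4 and "usb" over plain USB is an assumption based on the device names described above.

```shell
# List whole disks (no partitions) with their transport type.
# Assumption: an enclosure attached over Thunderbolt/USB4 reports TRAN "nvme",
# while the same enclosure attached over plain USB reports TRAN "usb".
list_disk_transports() {
  lsblk -d -o NAME,TRAN,SIZE,MODEL
}

# Usage:
#   list_disk_transports
```

If the drive shows up with the "usb" transport on a port you expected to be Thunderbolt/USB4, the enclosure has fallen back to USB mode.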
The OWC enclosure is expensive, but it “just works” and I haven’t had any problems with it. I use the Sabrent enclosure if I need to temporarily throw an NVMe drive into an enclosure to do something quickly with it, since it’s toolless.
But I’ve gone through many RTL9210B, JMicron, and other enclosures that just seem to have random problems, and I’ve never figured out why. My assumption is that a lot of 2280 NVMe drives draw more power than the enclosure is able to provide, and they just die. But I’ve also had them disconnect when the drive is seemingly idle, so it could be something else.
On another note, if you just want an external drive, and don’t want to bother with your own enclosures, Sabrent Rocket Nano drives work extremely well. I have several of them and haven’t had any issues at all.
P.S. I’ve had these problems with both Framework and non-Framework laptops, so it’s not really Framework specific.
Edit: Note that if you do end up using or switching to a Thunderbolt enclosure for your boot drive, remember to add this to the Linux kernel command line in GRUB or whatever you’re using to boot: thunderbolt.host_reset=0. If you don’t, it won’t be able to find the drive after your initramfs loads.
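On Ubuntu with GRUB, one way to make that parameter permanent is the default kernel command line in /etc/default/grub. This is only a sketch: "quiet splash" is the stock Ubuntu value and an assumption here, so keep whatever options your file already contains and just append the new one.

```shell
# /etc/default/grub (fragment)
# "quiet splash" is assumed; keep your existing options and append the parameter.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash thunderbolt.host_reset=0"

# Afterwards, regenerate the GRUB configuration and reboot:
#   sudo update-grub
# After the reboot, confirm the parameter is active:
#   cat /proc/cmdline
```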
I tried to find out which controller the drive is using, but without success. I also noticed it is actually called “WD Black P50 Game Drive SSD 500GB”, seems like I misremembered P51/P50 earlier.
The idea was to run it in the 7040 machine until it crashes. Now hopefully some information has been written to the journal.
You can plug it into another machine where it runs fine now (do not boot from it, just hook it up and mount it).
You can then inspect the journalctl like this (adjust the directory to your path): sudo journalctl -D /mnt/externaldrive/var/log/journal/ -b-0
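The full journal of a boot can be long, so it may help to narrow it down to kernel messages that mention USB or storage. The sketch below assumes the same mount path as above; the function name crash_boot_kernel_log and the grep pattern are made up for illustration and will not catch every relevant message.

```shell
# Show kernel messages from the most recent boot recorded in an offline
# journal directory, filtered for USB and storage-related lines.
# $1: journal directory on the mounted drive (adjust to your path).
crash_boot_kernel_log() {
  journalctl -D "$1" -b -0 -k --no-pager |
    grep -iE 'usb|uas|nvme|i/o error'
}

# Usage (run as root if the journal files are not readable otherwise):
#   crash_boot_kernel_log /mnt/externaldrive/var/log/journal/
# To see which boots the journal contains:
#   journalctl -D /mnt/externaldrive/var/log/journal/ --list-boots
```

If the last boot in the list is not the one that crashed, pick a different offset for -b based on the --list-boots output.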
As you suggested, I ran the machine until the issue occurred. It worked for 3 hours longer than I expected, but I still got I/O errors in the end. Unfortunately, it crashed while I was away, so I couldn’t see the error messages on the login screen. The last logs of any relevance are:
Jul 03 18:44:08 spirit-laptop kernel: usb 5-1: USB disconnect, device number 2
Jul 03 18:44:08 spirit-laptop kernel: usb 5-1.4: USB disconnect, device number 3
After those, the system kept logging fine for a few more minutes.
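In kernel USB naming, 5-1.4 is the device on port 4 of whatever sits at 5-1, so the two lines above look like a hub and a device behind it dropping in the same second. To see whether this happens repeatedly, the disconnects can be counted per device path. This is a rough sketch; the function name usb_disconnects is made up, and it assumes the log lines have the same format as the two shown above.

```shell
# Read kernel log lines on stdin and print "<usb path> <count>" for every
# device that logged a "USB disconnect" message.
usb_disconnects() {
  awk '/USB disconnect/ {
         for (i = 1; i < NF; i++)
           if ($i == "usb") {        # the path follows the word "usb"
             dev = $(i + 1)
             sub(/:$/, "", dev)      # strip the trailing colon
             count[dev]++
           }
       }
       END { for (d in count) print d, count[d] }' | LC_ALL=C sort
}

# Usage:
#   journalctl -D /mnt/externaldrive/var/log/journal/ -k --no-pager | usb_disconnects
```

A path that shows up many times across boots would point at that port or the enclosure attached to it.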