Will this work if the USB drive is the boot drive?
It should work for boot drives as well.
I have some new actual evidence.
After my machine ran itself a bit hot (pipewire went haywire, still not sure why), I shut down, waited a few seconds, and then rebooted, resulting in the errors below and a drop into emergency mode. So, it found the drive, booted it, but then lost it again. Below is the photograph I managed to take of the screen at the time.
After shutting down again, and waiting a couple of minutes for it to cool down, I was able to boot normally.
We recommend using this as a media or backup drive, not as a boot drive.
PSA - this is to be used as a media drive or a backup drive (TimeShift, Deja Dup, or Rsync from the CLI). Don’t use this as an install drive, please. It’s USB and not likely to be a good time.
Except for the disconnect issues, speed has actually not been a problem at all. It ought to be a perfectly fine boot drive, honestly…except that the USB is not reliable!
And…here’s the thing.
I need an external boot drive. There are reasons, and I don’t need to go into them. It’s a requirement for something I’m doing.
Now, I can just get an actual USB external drive–there are some high-quality portable SSDs in the world, now, that would probably serve adequately.
But if the USB is not going to be reliable, then that’s not a solution either.
I would also point out that the advertising copy for the module in the marketplace explicitly says that it is usable as a boot drive.
This is one of the reasons I prefer to use it for data storage.
I hear that; I have actually had use cases where I was in the same boat. There are things we can try, but running an OS off it is purely as-is.
This needs to be corrected as with Windows, it’s not officially supported: Windows 11 Won't Reboot but can Shutdown - #7 by TheTwistgibber
It’s doable for Linux, but it’s best seen as your mileage may vary.
All of that said, your errors appear to be bad sectors and we can attack this way:
- Understand that if you don’t have the data on this drive backed up, now is the time, because there is always a risk of data loss. Boot a live USB, attach a secondary USB storage device, and back up Home.
- Run sudo umount /dev/sdb1, then sudo fsck -y /dev/sdb1 (or 2, or whichever partition is applicable). fsck checks for errors, and -y corrects them.
This process will be slow, very slow.
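As a rough sketch of the unmount-then-check step above: the device name /dev/sdb1 is only an assumption (confirm yours with lsblk), and the DRY_RUN switch and the helper names run and repair_partition are hypothetical, added so the commands can be previewed before anything touches the disk.

```shell
#!/bin/sh
# Hypothetical device name; confirm yours with `lsblk` before running anything.
PART="${PART:-/dev/sdb1}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repair_partition() {
    # fsck should not be run on a mounted filesystem, so unmount first.
    run sudo umount "$1"
    # -y answers "yes" to every repair prompt.
    run sudo fsck -y "$1"
}

repair_partition "$PART"
```

Setting DRY_RUN=0 runs the commands for real. Do that from a live USB, and only after the backup is done.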
Faster approach:
- Boot to a live USB, open gparted, and create a brand new partition on the device.
- Reinstall fresh (although I am not a fan of trusting USB to be flawless for OS usage).
There are no bad sectors here. The bad-sector errors were because the USB-disconnect was kicking in after it tried to boot. It later booted cleanly. I’m using ext4 as my filesystem, which has been nicely resilient in terms of journaled recovery when I have in-flight issues.
Also, as I’ve pointed out, the device sometimes is invisible to the BIOS, not just the operating system.
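One way to tell a dropped USB link apart from genuinely bad sectors is to look for disconnect messages in the kernel log around the time of the I/O errors. The sketch below is a minimal filter, not a diagnosis. The function name usb_disconnects is hypothetical, and reading the previous boot's log assumes the journal is persistent.

```shell
#!/bin/sh
# Keep only kernel log lines that report a USB device dropping off the bus.
usb_disconnects() {
    grep -i 'usb disconnect'
}

# Kernel messages from the previous boot (needs a persistent journal):
#   journalctl -k -b -1 | usb_disconnects
# Current boot, following new messages as they arrive:
#   sudo dmesg --follow | usb_disconnects
```

If disconnect lines show up right before the filesystem errors, that points at the USB link rather than at the flash itself.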
This may be a bad card. Please reach out to support for help with this if it’s not showing in BIOS. Please indicate that it’s not showing up reliably.
Except…USB needs to be reliable, right? I mean, on a device where all the ports are really USB-C under the hood, USB needs to be pretty much gold-plated perfection.
Look, I wanna be clear here. I’m not trying to be combative. I love Framework, and I love what it’s doing. I want to shout Framework’s name to the heavens and tell all my friends to buy at least one. Except that if the USB is flaky, then I really can’t recommend it to anyone but hobbyists who don’t mind occasional flakiness.
Sharing my experiences across Linux USB booting in general, not merely the laptop. Booting an OS from USB is doable, but not recommended on any computer in my personal experience (edit) outside of live booting a distro. I’ve done it across a spectrum of operating systems and computers. It’s flaky and, while doable, is not recommended by me based on those experiences.
So hopefully that clears that up. That’s my recommendation.
Moving forward, in your case, you will want to reach out to support for an expansion card replacement.
You definitely will want to have the card replaced.
I have to disagree here; you must be doing something wrong. I’ve run various distros, including Arch, Debian/Ubuntu, and NixOS, off of external USB drives for years without issue, and these are actual installations, not live images.
For instance, I normally run Arch and don’t like having 32-bit libraries installed, and I also don’t like how some games litter my filesystem with files and directories in inappropriate places, like directly in my home folder. So, I solved this problem by using an external drive for a “gaming” installation, which has Steam, Lutris, and other things setup just right, and is used only to play games from Steam, GOG, and some one-offs.
This not only solves my 32-bit and litter file problems (as I don’t care where it puts files in this installation…), but I can also move this installation from machine to machine by just plugging it in and booting off of it. If I feel like playing on my laptop, I can do that. Do I want just a little more performance? I can plug in my eGPU. Or do I want to go “balls-to-the-wall”? Well, then I can boot it up on a more powerful machine and use that instead.
I’ll run a machine for hours that way without issue, and haven’t had an issue with disconnects.
And this is also using generic NVMe-to-USB adapters of dubious quality found on Amazon, with SSDs from various manufacturers including Samsung and SK Hynix, but also sometimes Inland and other lower-tier manufacturers.
You should be able to boot and run a Linux distro stably off a USB stick; if you can’t, there’s something wrong with either your computer or your USB stick (or possibly even your cable).
I also have not had any problems over the years running Linux off of a USB drive. What I have run into is needing to remember to exempt the drive from autosuspend power rules. My first suspicion in these situations is that power management is for some reason suspending the drive (i.e. TLP autosuspends it), but another potential cause is insufficient power. I am very interested in what the total power draw is on the laptop while stress testing in an OS on the expansion card drive. Since the 11th gen processors max out at 60w and the Framework power supply is 60w, I suspect there may be a situation where this causes the errors seen. I had very similar errors show up while testing underpowered docks on my 12th gen Framework. These errors did not show immediately, but only cropped up after multiple tests with a variety of peripherals connected. Afterwards they would persist until I either removed and reseated the expansion cards, or disconnected the battery in the BIOS and then held the power button down for 30 seconds after having plugged the power delivery back in. I do attribute some of this to firmware that needs improvement, but I think the main culprit is power spikes of short enough duration that the switch from power supply to battery and back occurs frequently enough to make the firmware lose or miss a power event, so that it gets scrambled, for lack of a better word, and locked in the wrong persistent state.
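To check whether autosuspend is in play, the kernel exposes each USB device's runtime power setting under sysfs. The sketch below lists vendor:product IDs next to that setting. The function name list_usb_power and the USB_SYSFS override are hypothetical and exist only so the listing can be tried against a fake directory tree.

```shell
#!/bin/sh
# List each USB device with its runtime power-management setting.
# "auto" means the kernel may autosuspend the device; "on" keeps it awake.
USB_SYSFS="${USB_SYSFS:-/sys/bus/usb/devices}"

list_usb_power() {
    for dev in "$1"/*; do
        # Skip entries (such as interfaces) that lack these attributes.
        [ -r "$dev/power/control" ] || continue
        [ -r "$dev/idVendor" ] || continue
        [ -r "$dev/idProduct" ] || continue
        printf '%s:%s %s\n' \
            "$(cat "$dev/idVendor")" "$(cat "$dev/idProduct")" \
            "$(cat "$dev/power/control")"
    done
}

list_usb_power "$USB_SYSFS"
```

If the boot drive shows auto, writing on to that device's power/control file as root keeps it awake until the next reboot or until a power manager changes it back.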
Using the drive as a boot drive is intended behavior. However, this problem occurs even when used as a light duty drive for storing Word documents to access from both operating systems on the internal SSD in dual-boot configuration.
This is interesting. How would we exclude the drive from TLP selective suspend? I’ve had issues with getting a full disable on selective suspend to take hold in the past.
You could add it to the USB_DENYLIST parameter: TLP/tlp.conf.in at main · linrunner/TLP · GitHub
@Be_Far, @ryanpetris got it right. TLP had a number of changes from version 1.4 to 1.5, and in this case what used to be USB_BLACKLIST became USB_DENYLIST. Another thing to check when using TLP is that no other service is trying to do a similar thing. The most common culprit in most distros is power-profiles-daemon. You can either mask this service in slightly older distros or uninstall it in most modern distros, though masking it still works. TLP does everything power-profiles-daemon does along with a long list of additional features; it just does not have a simple drop-down list and generally needs to be configured by editing files, preferably /etc/tlp.d/00-nameofhostorspecialbitofconfig.conf, where 00 can be replaced with any number from 00-99, like any other file placed in a /etc/application-servicename.d directory. This way, when TLP updates, your config is not overwritten.
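A drop-in along those lines might look like the sketch below. The file name 50-usb-boot-drive.conf and the ID 1234:abcd are placeholders, not the real values for the expansion card; substitute the vendor:product pair that lsusb reports for your drive.

```shell
# Hypothetical file: /etc/tlp.d/50-usb-boot-drive.conf
# Keep USB autosuspend enabled in general...
USB_AUTOSUSPEND=1
# ...but exclude the boot drive. 1234:abcd is a placeholder ID; use the
# vendor:product pair from `lsusb`. Separate multiple IDs with spaces.
# On TLP versions before 1.5 the parameter was named USB_BLACKLIST.
USB_DENYLIST="1234:abcd"
```

After saving the file, sudo tlp start should reapply the settings. If power-profiles-daemon is still installed, sudo systemctl mask power-profiles-daemon.service keeps it from competing with TLP.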
This is likely to improve your experience, but I believe that for a flawless experience it needs to be coupled with the expansion card having the heatsink in it, like it currently does, and with a power supply that provides sufficient power under ALL circumstances, which a 60w supply does not. In fact, every 65w rated charger I tested also failed in this respect. Since the next grouping up was 85w for some hubs, I tried those as well, and they also fell short, providing only 65w maximum. All 96w docks I tested successfully provided sufficient power to my 12th gen (maxing out at 75w of draw, i.e. the laptop got whatever it wanted power-wise). 100w PD chargers (I am using an Anker 737 charger with 100w PD) should work fine as well.
Since I don’t have an 11th gen, I could not test directly, and I have not found a confirmed PL2 rating for the 11th gen processors in the Framework. With that figure in hand you should be able to arrive at your own conclusions regarding PD and 11th gen Frameworks. My suspicion is of course that it is just falling short of what it really wants and needs when using the standard Framework charger under heavy load, whether that load is sustained or very short in duration.
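A rough way to see whether the charger is keeping up is to watch the battery while plugged in and under load: if it reports Discharging, the supply is presumably not covering the draw. The sketch below reads the kernel's power_supply attributes. It only shows the battery's charge or discharge rate, not total system draw, and it assumes the battery exposes either power_now or both current_now and voltage_now. The function name battery_watts and the PS_ROOT override are hypothetical.

```shell
#!/bin/sh
# Report each battery's status and charge/discharge rate in watts.
PS_ROOT="${PS_ROOT:-/sys/class/power_supply}"

battery_watts() {
    for bat in "$1"/BAT*; do
        [ -r "$bat/status" ] || continue
        if [ -r "$bat/power_now" ]; then
            # power_now is reported in microwatts.
            uw=$(cat "$bat/power_now")
        elif [ -r "$bat/current_now" ] && [ -r "$bat/voltage_now" ]; then
            # current_now is in microamps, voltage_now in microvolts.
            ua=$(cat "$bat/current_now")
            uv=$(cat "$bat/voltage_now")
            ua=${ua#-}   # some drivers report discharge as a negative value
            uw=$(( ua / 1000 * uv / 1000 ))
        else
            continue
        fi
        uw=${uw#-}
        # Convert microwatts to tenths of a watt for one decimal place.
        tenths=$(( uw / 100000 ))
        printf '%s %s %s.%s W\n' "$(basename "$bat")" "$(cat "$bat/status")" \
            "$(( tenths / 10 ))" "$(( tenths % 10 ))"
    done
}

battery_watts "$PS_ROOT"
```

Running it in a loop during a stress test, on the 60w brick and then on a higher-rated supply, would give a side-by-side comparison.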
@nadb Thank you. That’s comprehensive, especially the power-delivery piece. It would explain why I am more likely to see the issue when I’m “out and about” with the laptop and running off the Framework 60w brick, or the battery!
At home, I’m running off a CalDigit TS3 dock that was intended to power an Intel-era Core-i7 MacBook Pro, so power delivery is not the problem there, or at least, it really shouldn’t be.