[RESPONDED] After a hard reboot, I'm getting "ALERT! /dev/mapper/[hostname]--vg-root does not exist."

Hi all, a few days ago I received a new Framework laptop (13 inch AMD Ryzen 7 7840U). I installed Debian 12.5 bookworm. Major props to Framework, this all went shockingly well, and just about everything worked out of the box (wifi, camera, mic, fingerprint reader, …). I installed all my apps and set all my settings.

Today, I tried to do a Google Meet video call, and noticed that it wasn’t using the mic on my headset. So I switched to Zoom. I had previously installed Zoom, tried out a video call on it, and verified that the camera and (built-in) mic worked. This time, I started Zoom, and then plugged in my headset – and then the whole computer froze. So after a bit, I did a hard reboot.

When it came back, it showed this prompt:

Gave up waiting for root file system device. Common problems:
- Boot args (cat /proc/cmd line)
 - Check rootdelay= (did the system wait long enough?)
- Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/mapper/lascaux--vg-root does not exist. Dropping to a shell!


BusyBox v1.35.0 (Debian 1:1.35.0-4+b3) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) _

Doing cat /proc/cmd line gives:

BOOT_IMAGE=/vmlinuz-6.1.0-18-amd64 root=/dev/mapper/lascaux--vg-root ro quiet

I did try adding rootdelay=60 to that, and rebooting did have a 60-second wait time in it, but it then dropped into the same prompt. Doing ls /dev/mapper shows control, lascaux--vg-root and lascaux--vg-swap_1. Running fsck /dev/mapper/lascaux--vg-root -y says No such file or directory.

I’m assuming this was somehow caused by the hard reboot. What else can I try or check for diagnostics? It may be related to this post, but it’s not clear to me how.

Thanks all!

Well, a meta-issue for the forum: I’m getting a 403 error when I try to post this question, and it only posted a fraction of what I wrote.

proc/cmd line (minus the space) triggers some malware detection in the forum software. Try adding that space to get around the 403.

@Alex_Altair did you try booting into a live USB environment as @Ulmondil suggested in the thread you linked? To reiterate what they said, BusyBox offers a very spartan CLI environment which may slow or impede recovery.

A live USB of a distro you’re familiar with will offer a rich set of diagnostic and recovery tools, access to Internet and software repositories and a GUI if you’re not confident doing stuff from the shell prompt.

If you need help setting up a USB live booter, ask in this thread and I (or many others) will be able to help.

Dino

Thanks for mentioning it; I haven’t booted off a live USB yet because I don’t have a plan for what to do after that. I don’t know enough about partition stuff to understand what I should be looking at to confirm or disconfirm any hypotheses about what’s gone wrong.

That’s OK. Maybe the community can enhance your skill set and in the process figure out what’s broken and how to get it fixed.

Let’s deal with the figurative elephant first: is your Framework laptop your daily income-earning driver or are there other time pressures that require you to get it fixed ASAP? I ask because forum-assisted fix-ups can be quite effective. Their nature however leads to lumpy progress and the hiatuses between steps can be frustrating, particularly if you’re under time pressure.

To add some substance to this post… possible reasons for the error, based on what you’ve told us:

  • NVMe mobo interface failure
  • NVMe SSD failure
  • LVM2 metadata corruption

The ‘partition stuff’ sits on top of these things. To determine which, if any, is a factor, booting a live environment and capturing the output of a couple of commands should provide some clues.

Paste the output of lsblk and/or inxi --disk. You won’t need sudo for either of these commands.
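If the lsblk output gets long (live images tend to add a lot of loop devices), the LVM volumes can be picked out with a small filter. This is only a sketch: list_lvm_volumes is a made-up helper name, and it assumes lsblk is run with -rno NAME,TYPE so that each line holds just a device name and its type.

```shell
# Hypothetical helper: read `lsblk -rno NAME,TYPE` output on stdin
# and print only the names of devices whose type is "lvm".
list_lvm_volumes() {
    awk '$2 == "lvm" { print $1 }'
}

# Intended use from the live environment (no sudo needed):
#   lsblk -rno NAME,TYPE | list_lvm_volumes
```

If the volumes you expect are missing from that list, sudo pvs, sudo vgs and sudo lvs (from the lvm2 package) report what LVM itself can see.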

BTW you can safely ignore the unhelpful ‘common problems’ in the original error message. In my experience they’re not at all common. You’ve installed a vanilla, mature mainstream distro that has successfully booted at least a few times. Those messages are relevant for DIY kernel-makers and people grafting various components together to run on boutique platforms. Arguably if you’re capable of doing those things the error messages are equally meaningless.

Dino
[edit: corrected borked formatting]

Welcome to the community!

Everything Dino suggested is correct and recommended, especially the latter part, which is spot on. I have nothing to add to the advice given; it’s a good place to start.

No worries about time pressure @truffaldino! I’m hoping to migrate onto the Framework as my main, but I haven’t yet. Even if I had to do a fresh reinstall, I don’t think there’s too much I would lose on the current drive. Largely I would like to understand why this happened, to make sure it doesn’t just happen again.

Here’s the output from said commands:

ubuntu@ubuntu:~$ lsblk
NAME                MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0                 7:0    0   2.7G  1 loop /rofs
loop1                 7:1    0  74.2M  1 loop /snap/core22/1122
loop2                 7:2    0     4K  1 loop /snap/bare/5
loop3                 7:3    0 266.6M  1 loop /snap/firefox/3836
loop4                 7:4    0   497M  1 loop /snap/gnome-42-2204/141
loop5                 7:5    0  91.7M  1 loop /snap/gtk-common-themes/1535
loop6                 7:6    0  12.3M  1 loop /snap/snap-store/959
loop7                 7:7    0  40.4M  1 loop /snap/snapd/20671
loop8                 7:8    0   452K  1 loop /snap/snapd-desktop-integration/83
sda                   8:0    1 115.5G  0 disk 
├─sda1                8:1    1   4.7G  0 part /cdrom
├─sda2                8:2    1   4.9M  0 part 
├─sda3                8:3    1   300K  0 part 
└─sda4                8:4    1 110.8G  0 part /var/crash
                                              /var/log
nvme0n1             259:0    0 931.5G  0 disk 
├─nvme0n1p1         259:1    0   512M  0 part 
├─nvme0n1p2         259:2    0   488M  0 part 
└─nvme0n1p3         259:3    0 930.5G  0 part 
  ├─lascaux--vg-root
  │                 252:0    0 929.6G  0 lvm  
  └─lascaux--vg-swap_1
                    252:1    0   976M  0 lvm  

ubuntu@ubuntu:~$ inxi --disk
Drives:
  Local Storage: total: 1.02 TiB used: 1008 KiB (0.0%)
  ID-1: /dev/nvme0n1 vendor: Western Digital model: WD BLACK SN770 1TB
    size: 931.51 GiB
  ID-2: /dev/sda vendor: Memorex model: USB Flash Drive size: 115.46 GiB
    type: USB

Well. I… “fixed” it?

[Going into excessive narrative detail, mostly for the fun of it.]

After booting from the live Ubuntu USB and running those commands, I went back to some of the other threads from people having similar problems, to see if there were more ways I could get information. I remembered that when I had run fsck from the initramfs prompt, it didn’t find the partition, so it didn’t do anything. But now from the live USB I could see the partition. So maybe I should run it again?

I read a bit about fsck, and it seemed… unclear whether it was always “safe” to run. I didn’t actually want to do anything to the system yet, I just wanted to, y’know, check the file system. So I ran fsck /dev/mapper/lascaux--vg-root, looked at each thing it mentioned, and then said “no” to each fix. But, unsurprisingly, there was a point where there were a couple dozen messages about similar-sounding things, and I decided to quit out of fsck. Except, I wasn’t sure that was “safe”. So I googled it. At some point I opened up another terminal tab (still on the live USB) and ran man fsck, only to get an error that said something like bash: /usr/bin/fsck: Input/output error.

Um… that’s not good. Then it turned out that all commands gave this error. Then the terminal crashed. Then I tried to power down the live USB. It did not. It just, did nothing. This was the case for several minutes, so eventually I held down the power button.

After the laptop was fully off, I removed the USB stick and booted back up. Now, if I had sane expectations for computers, I would expect that it would boot into the same situation that it did earlier, which I started out this post with. Because surely running a command called “file system check” and saying “no” to all the repairs wouldn’t modify my system, right?

It booted into an initramfs prompt, except this time it had a message that said something like: /dev/mapper/lascaux--vg-root is messed up, and you’ll need to manually run fsck on it. Well, at least we can even see lascaux--vg-root at this point (which we couldn’t before). So I guess that’s progress. At this point I figure, sure, we’ll just run fsck /dev/mapper/lascaux--vg-root -y right here. What’s the worst that could happen?

So I did, and it declared that a bunch of stuff got fixed, and then booted me into my original Debian install, where everything appears to work just fine.

I guess I’ll report back if similar-seeming problems happen again soon. If anyone knows wtf just happened, I’d enjoy getting a better understanding.
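For anyone landing here later: fsck can be asked to check without changing anything. With the ext-family checker (e2fsck), the -n option opens the filesystem read-only and answers “no” to every question, so there is nothing to decline by hand. The sketch below assumes an ext4 root filesystem, which is the Debian installer default but is not confirmed anywhere in this thread. describe_fsck_status and check_readonly are hypothetical helper names. The first one decodes fsck’s exit status, which fsck(8) documents as a bit mask.

```shell
# Hypothetical helper: turn an fsck exit status (a bit mask, see fsck(8))
# into a readable, "; "-separated summary.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    [ $((status & 1)) -ne 0 ]   && out="$out; errors corrected"
    [ $((status & 2)) -ne 0 ]   && out="$out; system should be rebooted"
    [ $((status & 4)) -ne 0 ]   && out="$out; errors left uncorrected"
    [ $((status & 8)) -ne 0 ]   && out="$out; operational error"
    [ $((status & 16)) -ne 0 ]  && out="$out; usage or syntax error"
    [ $((status & 32)) -ne 0 ]  && out="$out; checking canceled by user"
    [ $((status & 128)) -ne 0 ] && out="$out; shared-library error"
    echo "${out#; }"   # strip the leading separator
}

# Hypothetical wrapper: check a device without modifying it, then
# summarise the result. Assumes the device holds an ext2/3/4 filesystem.
check_readonly() {
    fsck -n "$1"
    describe_fsck_status $?
}

# Intended use (device name taken from this thread):
#   sudo sh -c '. ./fsck-helpers.sh; check_readonly /dev/mapper/lascaux--vg-root'
```

The file name fsck-helpers.sh is just a placeholder for wherever the two functions are saved. A status that includes “errors left uncorrected” is the expected outcome when a read-only check finds problems; it means a repair run is still needed, not that the check itself failed.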

That’s awesome news @Alex_Altair.

Your experience with Ubuntu Live (especially /usr/bin/fsck: Input/output error) hints at the wake-from-sleep problem that afflicts some WD SSDs. This might also account for the original LVM2 and filesystem metadata corruption that caused your boot failures.

I encourage you to have a look at the (long) thread I’ve quoted below.

The contribution I quoted above is my attempt at cutting to the chase. The thread documents a journey. It might be worth reading. Firmware updates are not without risk and I’d like you to be fully informed.
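Before deciding anything about a firmware update, it helps to know which firmware the drive is running right now. One way, assuming the smartmontools package is installed, is to read it from smartctl’s identity output. The helper name firmware_version below is made up for illustration, and the device path is the one from the lsblk output earlier in this thread.

```shell
# Hypothetical helper: read `smartctl -i` output on stdin and print
# only the value of the "Firmware Version:" line.
firmware_version() {
    sed -n 's/^Firmware Version:[[:space:]]*//p'
}

# Intended use:
#   sudo smartctl -i /dev/nvme0n1 | firmware_version
```

Compare the result against whatever versions the linked thread discusses; this snippet only reports the version and makes no judgement about it.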

Firmware aside, you can bolster your leap of faith that fsck did no damage. Run this command line.

$ sudo find /lost+found/ | wc -l
1

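The expected result is as shown: find lists the /lost+found directory itself, so a count of 1 means fsck left no orphaned files there. Post back here if the output is anything other than 1.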

Dino