In light of the Linux 5.19.12 boot issue, I thought I would share how I’ve set up my (Arch) Linux machine to be durable and recoverable.
Disk layout
I have two partitions: the /efi partition and my LUKS-encrypted partition. I made the /efi partition 1 GB so that I wouldn’t run into size issues later.
sudo parted -l
Model: SHPP41-2000GM (nvme)
Disk /dev/nvme0n1: 2000GB
Sector size (logical/physical): 4096B/4096B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name   Flags
 1      1049kB  1074MB  1073MB  fat32        boot   boot, esp
 2      1074MB  2000GB  1999GB               luksy
LVM
I previously used btrfs for snapshots but, due to performance issues, switched to LVM instead. Besides snapshots, LVM makes it much easier to create, resize, or otherwise modify volumes.
So, I have a root and a home logical volume.
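To illustrate the "create, resize" point, here is a minimal sketch using the standard LVM tools. The volume group name vg0, the volume name scratch, and the sizes are placeholders, not my actual layout. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: "vg0", "scratch", and the sizes are hypothetical placeholders.
# With DRY_RUN set to a non-empty value, each command is printed, not executed.

create_scratch() {
    # Create a new 50G logical volume named "scratch" in volume group vg0.
    ${DRY_RUN:+echo} lvcreate --name scratch --size 50G vg0
}

grow_home() {
    # Grow the home volume by 20G; --resizefs grows the filesystem with it.
    ${DRY_RUN:+echo} lvextend --resizefs --size +20G vg0/home
}
```

Running `DRY_RUN=1; grow_home` in a shell shows the lvextend command that would be executed, which is a cheap way to double-check names before touching real volumes.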
Full disk backup with dd
Every so often (~monthly), I boot my laptop into the archiso image from a USB drive (using Ventoy).
I then make a full-disk copy of my laptop to an external drive with a command like:
dd if=/dev/nvme0n1 of=/dev/sdc bs=16M conv=fsync status=progress
Benefits:
- Simple technology, not much to go wrong (assuming you don’t clone the wrong disk)
- Backups are encrypted since my main partition is encrypted
- Verification is easy: I can just boot off the external hard drive and verify the backup works
- Restoration is straightforward: dd the data back to the laptop
Cons:
- Takes a long time
- Manual process of rebooting, typing commands into archiso
- Doesn’t help if the house burns down
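Since cloning the wrong disk is the main risk, a small wrapper that copies and then compares the result can help. The following is a sketch, not the exact procedure I run; clone_and_verify is a hypothetical helper name, and it assumes GNU dd, cmp, and stat as shipped on archiso.

```shell
#!/bin/sh
# Sketch only: clone_and_verify is a hypothetical helper, not an existing tool.
# Assumes GNU coreutils (dd, stat) and GNU cmp.

clone_and_verify() {
    src=$1
    dst=$2
    # conv=fsync makes dd flush the data to the target before it exits.
    dd if="$src" of="$dst" bs=16M conv=fsync status=none || return 1
    # Size of the source in bytes: blockdev for block devices, stat for plain files.
    size=$(blockdev --getsize64 "$src" 2>/dev/null || stat -c %s "$src")
    # The target may be larger than the source, so compare only the first $size bytes.
    cmp -n "$size" "$src" "$dst"
}

# Example (destructive, double-check the device names with lsblk first):
#   clone_and_verify /dev/nvme0n1 /dev/sdc
```

The comparison reads both disks again, so it roughly doubles the total time, but a zero exit status means the copy matched byte for byte.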
Cloud backups (restic + Backblaze)
I also sync my /home directory to Backblaze with restic on an hourly basis.
Benefits:
- Encrypted backups
- Only need to upload what changes
- Can retrieve multiple versions of what’s changed
- Old snapshots can be reaped automatically
Cons:
- Takes a bit of fiddling to set up yourself
- Not a full system backup, just my user files that I care about
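As a rough idea of what the fiddling involves, here is a minimal sketch of an hourly job. The bucket name, repository path, password file location, and retention counts are all placeholders rather than my real settings. The environment variable names are the ones restic reads; B2_ACCOUNT_ID and B2_ACCOUNT_KEY must also be set for the Backblaze backend.

```shell
#!/bin/sh
# Sketch only: bucket, path, password file, and retention counts are hypothetical.
# B2_ACCOUNT_ID and B2_ACCOUNT_KEY must also be exported for the b2 backend.
export RESTIC_REPOSITORY="${RESTIC_REPOSITORY:-b2:my-backup-bucket:laptop}"
export RESTIC_PASSWORD_FILE="${RESTIC_PASSWORD_FILE:-/etc/restic/password}"

backup_home() {
    # Upload only the data that changed since the last snapshot.
    ${DRY_RUN:+echo} restic backup /home --exclude-caches
    # Reap old snapshots according to a retention policy, then free the space.
    ${DRY_RUN:+echo} restic forget --keep-hourly 24 --keep-daily 7 \
        --keep-weekly 4 --keep-monthly 6 --prune
}
```

A systemd timer or cron entry that calls backup_home once an hour would cover the scheduling; with DRY_RUN set, the function only prints the two restic commands.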
Snapshots
Updates break things - sad fact of life on Linux. One of the easiest ways to cope is to take before/after snapshots and revert if things go poorly. There are all sorts of tools out there (timeshift, snapper, etc.), but I ended up rolling my own. I wanted to not think about things, and to be able to restore easily without needing to carry around a USB stick all the time.
Bootloader snapshots with systemd-boot-lifeboat
When the computer first turns on, the UEFI firmware runs the initial code that starts the boot process. For Linux, that typically includes files on the /efi partition, managed by a bootloader (grub, systemd-boot, refind, etc.).
The important bit is that every time you update your kernel (and sometimes in between), this initial environment is updated. So, if you need to roll back a kernel update, you need to roll back both the root volume and your /efi partition.
I created systemd-boot-lifeboat to keep the last 3 bootloaders around on the system.
For example, my current boot entry, as well as the entries from the previous two times the bootloader changed, are stored automatically:
ls /efi/loader/entries
arch-fallback.conf arch_signed.conf lifeboat_1664922628_arch_signed.conf lifeboat_1664922766_arch_signed.conf
(Note that this only works if your bootloader is systemd-boot, not e.g. grub)
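The naming scheme in that listing (lifeboat_<timestamp>_<entry>.conf) suggests how the rotation can work. The sketch below illustrates the idea only and is not the actual systemd-boot-lifeboat code. The function names are hypothetical, and a real snapshot also has to preserve the kernel and initramfs files that the entry points to.

```shell
#!/bin/sh
# Sketch only: illustrates timestamped copies plus pruning, not the real tool.
# snapshot_entry and prune_lifeboats are hypothetical names.

snapshot_entry() {
    # Copy a boot entry next to itself as lifeboat_<epoch>_<original name>.
    entry=$1
    cp -- "$entry" "$(dirname "$entry")/lifeboat_$(date +%s)_$(basename "$entry")"
}

prune_lifeboats() {
    # Keep only the newest $2 lifeboat entries in directory $1.
    dir=$1
    keep=$2
    # Globs expand in sorted order; with equal-width timestamps, oldest comes first.
    set -- "$dir"/lifeboat_*.conf
    [ -e "$1" ] || return 0   # the glob matched nothing
    while [ "$#" -gt "$keep" ]; do
        rm -- "$1"
        shift
    done
}
```

For example, `prune_lifeboats /efi/loader/entries 2` would leave the two most recent lifeboat entries in place and never touch entries without the lifeboat_ prefix.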
Root snapshots with lvm-autosnap
I also created lvm-autosnap to handle snapshotting LVM volumes. The idea is that on every boot, early in the boot process (before the root volume is even mounted), the code automatically takes a snapshot.
If the machine fails to reach the desktop more than a few times in a row, the computer will automatically offer to restore a previous snapshot. Basically, as long as the bootloader is in a good spot, this allows you to restore the system to a working state. (And if the bootloader is bad, that’s what I have systemd-boot-lifeboat for.)
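For reference, the manual equivalent with the plain LVM tools looks roughly like this. It is a sketch of the underlying mechanism, not how lvm-autosnap itself is implemented, and the volume group vg0, the snapshot name root_pre_update, and the 10G size are placeholders.

```shell
#!/bin/sh
# Sketch only: "vg0", "root_pre_update", and 10G are hypothetical placeholders.
# With DRY_RUN set to a non-empty value, each command is printed, not executed.

snapshot_root() {
    # Reserve 10G of copy-on-write space to record changes made to vg0/root.
    ${DRY_RUN:+echo} lvcreate --snapshot --name root_pre_update --size 10G vg0/root
}

rollback_root() {
    # Merge the snapshot back into its origin. If the origin is in use,
    # LVM defers the merge until the volume is next activated.
    ${DRY_RUN:+echo} lvconvert --merge vg0/root_pre_update
}
```

Taking the snapshot before an update and merging it after a bad one gives the same before/after safety net by hand; the snapshot must be large enough to hold every block that changes in between.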
Conclusion
I have multiple copies of my data that are securely stored (e.g. encrypted) and that address different failure modes:
“Oh no I deleted my document by accident” → restore from the cloud
“Installing this package messed up my system” → restore from a snapshot
“Something really bad happened” → restore to an exact working copy from ~2 weeks ago
My suggestion is to figure out what failure cases you’re worried about, and then design your backup strategy around those cases.