Perfectly reasonable stance to take
One other tip I learned a long time ago and have been using ever since: Install the OS to one partition, create a second partition to hold your real /home, create a third partition for a second “stable” OS, and add an optional swap partition if you want one. Then, in each OS partition’s /home/, symlink the important folders to their counterparts on the home partition. So, for example, /home/ghostlegion/Music would symlink to /media/home/ghostlegion/Music on the home partition. And make sure to automount /media/home using /etc/fstab entries. This allows you to reinstall an OS without blowing up your home directory. Having the second stable OS copy means you can fix the bleeding edge OS partition if you screw something up.
The trick is knowing what can be safely symlinked in from both OS’s… Some programs will complain about version mismatches, some config files really shouldn’t be shared. But just basic documents, music, etc are definitely no problem.
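A minimal sketch of the symlink step, assuming the shared home partition is already mounted at /media/home through an fstab entry like the one in the comment. The function name `link_shared_dirs`, the UUID placeholder, the ext4 filesystem type, and the folder list are all made up for illustration.

```bash
#!/usr/bin/env bash
# Assumed /etc/fstab entry for the shared home partition (UUID is a placeholder):
#   UUID=<uuid-of-home-partition>  /media/home  ext4  defaults  0  2

# link_shared_dirs SHARED_HOME OS_HOME DIR...
# For each DIR, make OS_HOME/DIR a symlink pointing at SHARED_HOME/DIR.
# A real (non-symlink) file or directory already in OS_HOME is left alone,
# so nothing gets clobbered by accident.
link_shared_dirs() {
    local shared=$1 os_home=$2 dir
    shift 2
    for dir in "$@"; do
        # Make sure the target exists on the shared partition.
        mkdir -p "$shared/$dir"
        if [ -e "$os_home/$dir" ] && [ ! -L "$os_home/$dir" ]; then
            echo "skipping $dir: $os_home/$dir exists and is not a symlink" >&2
            continue
        fi
        # -n replaces an existing symlink instead of descending into it.
        ln -sfn "$shared/$dir" "$os_home/$dir"
    done
}

# Example, run once in each installed OS:
#   link_shared_dirs /media/home/ghostlegion /home/ghostlegion Music Documents Pictures
```

Only plain data folders are listed in the example on purpose; dotfiles and per-version config are the things most likely to clash between the two installs.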
I also highly recommend replacing grub with rEFInd for your bootloader, especially in a two OS setup…
i used to do that. personally i think it’s useful if one has only one computer. if one has access to a NAS / backups and an easy way to restore a system, having more partitions is more of a hassle than anything else. the way i do it is symlinking the dotfiles i really need to a folder that i keep in sync across multiple devices (besides of course documents etc.).
besides, if both the root and home partitions are encrypted (which, i mean, yes of course) it easily adds complexity. it is still doable of course, but it gets more annoying.
I’ve done a variation of this where I keep a copy of the live ISO on a dedicated partition and then add an entry to the bootloader for booting the aforementioned ISO. As a side-bonus, it keeps disk space down as well.
It also makes it easy to bug-test things on real hardware in a clean, vanilla configuration, and you’ve also then got the ISO readily on-hand for use in VMs.
Of course, if you’re actually compiling your own installation then this method isn’t exactly practical.
As far as I know, ext4 doesn’t have a native snapshot feature in the same sense that ZFS does. ext4 can use LVM snapshots, but those operate at a lower (block) level, not at the filesystem level. ZFS snapshots are native to the filesystem itself, so you can do things like snapshotting a live filesystem (without freezing the filesystem).
btrfs however does have this support. I’ve always shied away from it due to the rapid development and lack of good recovery tooling when I checked it out. The general saying around it was along the lines of “it worked perfectly until the day it didn’t and it ate my data”. Of course, everyone should have backups, etc etc. But it’s still a hassle.
I will say, none of these filesystems provide self-healing abilities with a single disk, mostly because it would be somewhat pointless (if a drive is exhibiting read errors, it’s likely to only get worse, and parity data can only do so much). At that point I’d consider stability, speed, and tooling, which have traditionally led me to use ext4 or XFS (depending on the device, see here for some more info) for most of my systems, and ZFS for my NAS and some of my workstations.
ZFS does like to eat RAM when configured to, but that is largely a performance optimization: it caches data in memory to reduce the number of disk reads needed to return it. When there’s memory pressure or not much memory in the first place, it’ll simply hit the disk instead of cache. Simple as that (as you’ve noted). I think the bigger strength in ZFS’ snapshots is that they can be sent and synced between devices. So, for example, setting up backups to a NAS can be as simple as creating a snapshot on your machine, then sending it via ssh like this:
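# Name of the most recent snapshot already on the backup server
LATEST=$(ssh backup-server zfs list -H -t snapshot -o name -s creation -r backups/workstation | tail -1)
NOW=$(date +"%Y%m%d_%H%M%S")
# Take a new recursive snapshot locally
zfs snapshot -r "zpool/home@backup-$NOW"
# Send only the changes since the snapshot the server already has.
# The server reports the name under backups/workstation, so keep just the part after "@".
zfs send -R -i "@${LATEST##*@}" "zpool/home@backup-$NOW" | ssh backup-server zfs recv -du backups/workstation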
Critical to note here is that this is an incremental backup; no need to send unchanged data. I don’t know that doing something like this is possible with ext4/LVM, at least not without specialized tooling to send block-level incremental diffs.
If this is something that appeals to you, btrfs and ZFS are going to be your best bet. ZFS also has support for filesystem-level encryption and btrfs doesn’t, which you may or may not care about. The biggest win there is interoperability with BSD systems and the fact that you can selectively encrypt subvolumes.
All this to say: there are good reasons to choose ZFS beyond the traditional parity stuff.
In what way is this different or perhaps better than LUKS? Is it more performant? Or is it because ZFS is aware of encryption and other filesystems are unaware?
How is this better or worse than filesystem level? Again is it related to performance or does it enable greater selectivity in what gets backed up?
It’s different than LUKS in that it’s a native property of the filesystem itself, which allows for doing things like encrypting subvolumes. The performance characteristics vary depending on your workload, really, since ZFS and ext4 also behave differently as filesystems. Worth noting that ext4 apparently does have support for encryption natively through fscrypt. You can do neat stuff like encrypt per-user using a user password, which I think is particularly nice.
As for LVM vs filesystem snapshots, the main difference is that LVM snapshots are basically disk images, which means backing them up incrementally is a bit of a pain and requires special tooling. The other (more noticeable, imo) thing is that you can’t browse the snapshots in the same way ZFS and btrfs allow you to. They both let you represent snapshots as a directory on the filesystem, so you can browse them in a Time Machine style directory. This can be useful for grabbing files off an old snapshot in a jiffy… Conversely, those snapshots are limited to the filesystem, so you can’t do something like snapshot multiple filesystems simultaneously (if you have multiple volumes in the LVM volume group).
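To make the "grab a file off an old snapshot" bit concrete, here is a rough sketch. It only relies on the snapshot showing up as an ordinary directory, so it is plain `cp` under the hood. The function name `restore_from_snapshot` and the example paths in the comments are assumptions for illustration, not anything specific to either filesystem's tooling.

```bash
#!/usr/bin/env bash
# restore_from_snapshot SNAPSHOT_DIR RELATIVE_PATH LIVE_DIR
# Copies one file out of a snapshot tree back into the live tree.
# SNAPSHOT_DIR is wherever the snapshot is visible as a directory, e.g.
#   ZFS:   <mountpoint>/.zfs/snapshot/<snapshot-name>
#   btrfs: wherever the snapshot subvolume was created, e.g. /home/.snapshots/<name>
# (both example locations are assumptions about your layout)
restore_from_snapshot() {
    local snap=$1 rel=$2 live=$3
    if [ ! -f "$snap/$rel" ]; then
        echo "not in snapshot: $snap/$rel" >&2
        return 1
    fi
    mkdir -p "$(dirname "$live/$rel")"
    # Keep the current version, if any, next to the restored one.
    if [ -e "$live/$rel" ]; then
        mv "$live/$rel" "$live/$rel.before-restore"
    fi
    # -p preserves mode and timestamps from the snapshot copy.
    cp -p "$snap/$rel" "$live/$rel"
}

# Example:
#   restore_from_snapshot /home/.zfs/snapshot/backup-20240101_120000 \
#       ghostlegion/Documents/notes.txt /home
```

With an LVM snapshot you would first have to mount the snapshot volume somewhere before a copy like this is possible, which is the extra friction mentioned above.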
There are good reasons to use either or. I’d suggest you try some benchmarking yourself with various configurations to figure out what’s best for you. For what it’s worth, I opted for LUKS + LVM + ext4, as I wanted to set up TPM-based unlocking and support encrypted swap, so LUKS + LVM made the most sense. I want to use a vanilla kernel, so that restricts me to ext4, XFS, and btrfs.
Pretty much all of the features I’ve mentioned can be set up using any of these methods (even the integrity checking can be done at the LVM level using dm-integrity). It’s really about what works best for your use case and how easy it is to set up for your distro.
@reanimus Ok, follow-up question time. It seems to me that it should be possible to use LUKS for encryption and follow this guide on setting up FDE, TPM, and Secure Boot while using BTRFS for snapshots/backup. If BTRFS can detect data errors, even if it can’t correct them, I should be able to navigate to the backup and restore the file to an uncorrupted state. Does this sound right to you?
If the backup is not on the same disk where the failure occurred, yes.
Given the way snapshots are implemented, if a given file hasn’t changed, the snapshot doesn’t store a separate copy of it. That is to say, if the file that got corrupted is a file that hasn’t changed in a while, the snapshot’s copy will also be corrupt. If it’s a file that changes often, the snapshot will point to a different file and may be recoverable.
That being said, errors are usually due to one of two things: bit rot or drive failure. Bit rot is a one-off but incredibly rare in drives that are regularly used, whereas drive failure generally means there are going to be more errors on the way. If btrfs is indicating an error, it almost always means the drive is going to fail soon and needs to be replaced.
this is a crucial point. i just want to stress again that relying on the filesystem for “data integrity” on a laptop is a terrible idea. the priority should be: 1. backups 2. (many other things e.g. don’t get it stolen, don’t leave it in places, don’t sit on it or put it in an oven, etc.) 3. use a filesystem with snapshots.
@marco As I stated in the first few posts, I’ll have an eGPU dock that also has 3.5" drive bay so I’ll be backing up regularly to an external drive. I’m not really that ignorant.
Honestly, the bigger utility of local filesystem snapshots is to protect you from user/software error, rather than hard drive failure. With snapshots of the root filesystem in place, you can apply software updates or try config changes without worrying about how to undo it; simply try your update/changes, and if it breaks, roll back to the snapshot.
Similarly with home snapshots, you can retrieve files you’ve accidentally deleted/corrupted.
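A rough sketch of that try-it-and-roll-back pattern, using ZFS as the example since `zfs snapshot` and `zfs rollback` are the relevant commands there. The function name `try_with_snapshot`, the `ZFS_CMD` override, and the `pre-change-` naming are all made up for illustration, and a non-zero exit status is only a crude stand-in for "it broke"; btrfs would need different commands.

```bash
#!/usr/bin/env bash
# try_with_snapshot DATASET COMMAND [ARGS...]
# Snapshot DATASET, run COMMAND, and roll back to the snapshot if COMMAND fails.
# Set ZFS_CMD="echo zfs" to preview the zfs calls without touching anything.
ZFS_CMD=${ZFS_CMD:-zfs}

try_with_snapshot() {
    local dataset=$1 snap
    shift
    snap="$dataset@pre-change-$(date +%Y%m%d_%H%M%S)"
    # ZFS_CMD is deliberately unquoted so "echo zfs" splits into two words.
    $ZFS_CMD snapshot "$snap" || return 1
    if "$@"; then
        echo "change succeeded; snapshot kept as $snap"
    else
        echo "change failed; rolling back to $snap" >&2
        # Rolling back to the most recent snapshot discards everything written since.
        $ZFS_CMD rollback "$snap"
        return 1
    fi
}

# Example (dataset name and script are placeholders):
#   try_with_snapshot zpool/home ./apply-my-config-change.sh
```

Rolling back throws away everything written to that dataset after the snapshot, not just the failed change, so this is best kept to datasets where that is acceptable.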
@GhostLegion it’s not about ignorance. like @reanimus is saying, snapshots serve a very different purpose from backups
To be fair, I think it’s also common to conflate the two if only because Time Machine is one of the most well-known and widely deployed backup solutions and it offers both snapshotting and backup functionality. But yes, the snapshotting and the backup serve different purposes. Snapshots protect you from user error, software bugs, and general hardware failure (e.g. power loss, CPU reset, etc). Backups protect you from loss of storage (e.g. hard drive dies, virus eats your drive, or you just plain lose it).
Since you and @reanimus had to explain the difference, I suppose it is. Consider me more educated now. I’ll try to avoid conflating the terms in the future. I want protection from ransomware as well as the benefits previously mentioned.
Snapshots should (in most cases) provide ransomware protection, assuming the ransomware doesn’t take the effort of wiping snapshots, as it would be similar to rolling back a config/corruption change.
Backups provide protection in either scenario, as you could simply wipe the drive and restore from a backup, though some ransomware may be programmed to wipe network mounts and the like as well.
i mostly meant FS-based snapshots as opposed to the generic concept of snapshots. my point is about the false sense of security that btrfs might give over backups. yes, ransomware and the like are probably a good reason to use btrfs, but they don’t affect a lot of people, while a lot of people experience hardware failures or thefts.
Has there been a successful multi-system ransomware attack in the Linux ecosystem? Last time I looked there had not been, but maybe someone knows differently…
Do you mean something like LockBit Linux-ESXi?
Or do you mean publicly known attack incidents?
EDIT: @GhostLegion don’t listen to me–I’m a nincompoop.
On top of what others have pointed out, I just wanted to mention that AFAIK the integrity-checking powers of BTRFS/ZFS rely on these filesystems having direct access to the bare metal of the drive. In other words, adding a LUKS layer of encryption between the filesystem and the drive effectively neuters the bitrot detection.
Nonetheless, I’m using LVM+LUKS+Btrfs with encrypted root/boot/swap and hibernation. Btrfs snapshotting is a lot more flexible and elegant than LVM snapshots, IMO.