[Guide] Zero-swap hibernation partitions in Linux via `systemd`

Preface/“What & Why?”

A “zero-swap” hibernation partition, as I’m calling it (is there an established name for this?), is a swap partition dedicated to hibernate & resume services.

I think some distinction is merited since, by default, any sufficiently large and activated (swapon <swap-partition>) swap space permits hibernation. Even setting the kernel parameter vm.swappiness = 0 doesn’t prevent the system from using the swap partition as extra RAM; it just tells the system to avoid doing so until it is running out of free memory.

That brings us to an example situation for why this might be interesting:

Suppose you’re doing some video encoding/transcoding in the background (or something else that exhausts memory, maybe you’re running a couple of VMs) and running a browser with a bunch of tabs as you’re reading into something, maybe typing something up too. Your memory usage is near its limit, and you start another program. Suddenly, you see the time, get a call, something alerts you, and you’ve got to leave. You close your laptop (which you’ve set up to trigger hibernation, essentially freezing everything in place exactly as it was so you can get back to it later) and toss it in your bag on your way out.

That last program you started before departure caused a spillover from main memory into the swap space you meant to use for hibernation, meaning the total amount of memory in use exceeds that available for hibernation, and the hibernation trigger fails. When you return to work, your battery is dead, and your session is lost. If the swap were somehow reserved for hibernation, that last program would’ve failed to start (of course that depends on the specifics of the system, but you get the idea), and hibernation would’ve succeeded, preserving your session, and letting you know you were pushing the memory envelope.

While not the most common scenario, I’ve gone through it myself on a not-so-great day. So I did a little research and developed a pretty simple solution that I’ll lay out here in case it interests others.

Notice for your time and mental health

This guide assumes you already have a swap partition set up that you can hibernate to.
It doesn’t cover the setup for basic hibernation.
It covers how to restrict the usage of that swap partition to just the hibernate/resume functionality.


The 4 Modifications

Only 4 modifications to the systemd configuration are needed to make this work, plus one optional kernel setting (step 0).

0. vm.swappiness (optional)

At the bottom of the file /etc/sysctl.conf, add the line

vm.swappiness=0

I recommend this for the basic setup described here. You can have it both ways (a regular swap partition or file plus a dedicated hibernation partition), but that’s a bit more convoluted, and this guide is aimed at people who feel they have enough main memory and just want to ensure hibernation works when they need it to.


1. Disable the swap partition during system initialization

Create a new systemd service file in the directory /etc/systemd/system. I’ll refer to it as /etc/systemd/system/hibernation--swapoff_during_initialization.service. In it, put:

[Unit]
Description=swapoff hibernation partition on system initialization
Before=multi-user.target

[Service]
Type=oneshot
ExecStart=/sbin/swapoff -U <hibernation_partition_UUID>

[Install]
RequiredBy=multi-user.target

You can get the UUID of the hibernation partition with the command:

blkid /dev/path/to/partition  # This is the command, below is the UUID
/dev/path/to/partition: UUID="01234567-89ab-cdef-0000-000000000000" TYPE="swap"
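
For scripting, the UUID can be pulled out of that output instead of copied by hand. The following is only a minimal sketch: the function name extract_uuid is made up for illustration, and it assumes the blkid output has the same KEY="value" shape as the line above.

```shell
# Extract the UUID value from blkid-style output read on stdin.
# Matching on the preceding whitespace keeps PARTUUID="..." from being picked up.
extract_uuid() {
  sed -n 's/.*[[:space:]]UUID="\([^"]*\)".*/\1/p'
}

# Example with the placeholder line from this guide:
printf '%s\n' '/dev/path/to/partition: UUID="01234567-89ab-cdef-0000-000000000000" TYPE="swap"' | extract_uuid
```

On a real system you would pipe blkid /dev/path/to/partition into it. blkid can also print the bare value itself with blkid -s UUID -o value /dev/path/to/partition.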

So in this case, the file would read:

[Unit]
Description=swapoff hibernation partition on system initialization
Before=multi-user.target

[Service]
Type=oneshot
ExecStart=/sbin/swapoff -U 01234567-89ab-cdef-0000-000000000000

[Install]
RequiredBy=multi-user.target

Then enable the service:

sudo systemctl enable hibernation--swapoff_during_initialization.service

2. Bypass the login daemon’s hibernation attempt filter

By default, systemd-logind will check to see that there is enough active swap space to perform hibernation before allowing a hibernation attempt.

Normally, this is a good thing, but it prevents us from trying to hibernate with “zero” swap space, which is what we’re trying to do. We want the system to believe that there is NO swap space available, EXCEPT when we try to hibernate. Otherwise, it would use the partition like any other swap memory.

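To bypass this check in systemd-logind, we need to modify its service file at
/etc/systemd/system/systemd-logind.service
(a unit file at this path takes precedence over the packaged one), and add, in the [Service] section, the line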

Environment=SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1

(Thanks to @arvidjaar for pointing out this “big hammer” at Failed to hibernate system via logind: Not enough swap space for hibernation · Issue #15354 · systemd/systemd · GitHub)
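
As an alternative to editing the unit file directly, the same variable can be set through a drop-in, which leaves the packaged unit untouched. This is a sketch rather than something from the original setup: the function name write_logind_dropin and the file name hibernation-bypass.conf are hypothetical, and only the systemd-logind.service.d drop-in directory is standard systemd behaviour.

```shell
# Write a drop-in that sets the bypass variable for systemd-logind.
# $1: drop-in directory (defaults to the usual override location under /etc).
# The file name hibernation-bypass.conf is an arbitrary, hypothetical choice.
write_logind_dropin() {
  dir="${1:-/etc/systemd/system/systemd-logind.service.d}"
  mkdir -p "$dir" || return 1
  printf '%s\n' \
    '[Service]' \
    'Environment=SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1' \
    > "$dir/hibernation-bypass.conf"
}

# Usage (as root): write_logind_dropin && systemctl daemon-reload
```

The variable only takes effect once systemd-logind is started again, which the reboot at the end of this guide takes care of.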


3. Re-enable swap on the hibernation partition just prior to hibernation

Create another new systemd service file in the directory /etc/systemd/system. I’ll call this one /etc/systemd/system/hibernation--swapon_before_sleep.service:

[Unit]
Description=swapon hibernation partition before system sleep
Before=sleep.target

[Service]
Type=oneshot
ExecStart=/sbin/swapon -U 01234567-89ab-cdef-0000-000000000000

[Install]
RequiredBy=sleep.target

Then enable it:

sudo systemctl enable hibernation--swapon_before_sleep.service
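
The two service files differ only in the command, the target, and the UUID, so they can be generated from one template to avoid typing the UUID twice. This is a sketch under that assumption: render_swap_unit is a hypothetical helper, and the shortened Description line is mine, not the one used in the files above.

```shell
# Print a oneshot unit that runs swapon or swapoff for one UUID.
# $1: "swapon" or "swapoff"; $2: target to order before; $3: swap UUID.
render_swap_unit() {
  cat <<EOF
[Unit]
Description=$1 hibernation partition
Before=$2

[Service]
Type=oneshot
ExecStart=/sbin/$1 -U $3

[Install]
RequiredBy=$2
EOF
}

# Example: print the pre-sleep unit for the placeholder UUID from this guide.
render_swap_unit swapon sleep.target 01234567-89ab-cdef-0000-000000000000
```

Redirect the output into the corresponding file under /etc/systemd/system (as root) and enable it as shown above.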

4. Re-disable swap on the hibernation partition on resume

As mentioned in the GitHub issue linked above (Failed to hibernate system via logind: Not enough swap space for hibernation · Issue #15354 · systemd/systemd · GitHub), there isn’t a wakeup.target, and After=sleep.target is non-functional, similar to After=shutdown.target: after the system shuts down, there’s nothing left to do and no memory of what should be done, since the system is fresh. A similar state of affairs seems to hold for sleep.target, so instead we turn to /usr/lib/systemd/system-sleep, a directory for hooks that run before and after sleep (which includes hibernate/resume).

In my testing, the pre-sleep hook here didn’t trigger until after hibernation entry progressed further along, meaning it doesn’t work effectively as the pre-hibernate swapon trigger. That’s why we made the dedicated systemd service file, and there doesn’t appear to be a convenient way to handle resume via service files, so here we are.

Anyway, this hook script will just handle re-disabling swap when the system resumes. In /usr/lib/systemd/system-sleep/hibernation_resume (which must be executable, e.g. via chmod +x, or it won’t be run), put:

#!/bin/sh
# systemd-sleep passes "pre" or "post" as the first argument.

case "$1" in
  post) /sbin/swapoff -U 01234567-89ab-cdef-0000-000000000000 ;;
esac

Reboot and test

With the modifications made and new services enabled, it’s time to reboot and test the system:

reboot

Once you log back in and get to a shell, run

free -h

You should see in the Swap row 0B for total, used, and free.
In my case (I have 32GB installed, 1GB is reserved):

free -h
               total        used        free      shared  buff/cache   available
Mem:            31Gi       2.7Gi        26Gi       431Mi       2.6Gi        28Gi
Swap:             0B          0B          0B
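
The same check can be scripted if you’d rather not eyeball the table. A minimal sketch, assuming the usual /proc/meminfo layout with a "SwapTotal:" line; the function name swap_total_kb is made up for illustration.

```shell
# Print the SwapTotal value (in kB) from meminfo-formatted text on stdin.
swap_total_kb() {
  awk '$1 == "SwapTotal:" { print $2 }'
}

# On a live system: swap_total_kb < /proc/meminfo
# A result of 0 matches the all-zero Swap row shown by free -h.
```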

You can also check the lsblk command output to see that your partition doesn’t specify [SWAP] as a mount point. Here’s mine both before and after the reboot for example:

BEFORE:

NAME                                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1                                259:0    0 465.8G  0 disk  
├─nvme0n1p1                            259:1    0   512M  0 part  /boot/efi
├─nvme0n1p2                            259:2    0   512M  0 part  /boot
└─nvme0n1p3                            259:3    0 464.8G  0 part  
  └─nvme0n1p3_crypt                    254:0    0 464.7G  0 crypt 
    ├─lvm2_vg-hibernation_swap_space   254:1    0    32G  0 lvm   [SWAP]
    └─lvm2_vg-root_file_system         254:2    0 432.7G  0 lvm   /

AFTER:

NAME                                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1                                259:0    0 465.8G  0 disk  
├─nvme0n1p1                            259:1    0   512M  0 part  /boot/efi
├─nvme0n1p2                            259:2    0   512M  0 part  /boot
└─nvme0n1p3                            259:3    0 464.8G  0 part  
  └─nvme0n1p3_crypt                    254:0    0 464.7G  0 crypt 
    ├─lvm2_vg-hibernation_swap_space   254:1    0    32G  0 lvm   
    └─lvm2_vg-root_file_system         254:2    0 432.7G  0 lvm   /

This is to verify the 1st modification, disabling swap during initialization.


Next, do the same after initiating hibernate.

sudo systemctl hibernate

Normally this will write the contents of RAM to your non-volatile storage and power off the machine, unless you have an unusual configuration.

Assuming it powered off, power it back on, go through the boot menu, and log in as applicable. If your screen and windows look the same as before running hibernate, that’s a good sign everything is working.

Re-run the free -h and lsblk commands. You should see the same thing as before:
0B in the Swap row for total, used, and free, and that [SWAP] isn’t listed as a mount point for the swap partition. If so, then everything went well.

If you have other methods of initiating hibernate set up, like pressing a power button or closing your laptop lid, definitely test those out too and make sure they work.

A good final test for successful hibernation in general, not just a “zero-swap” system, is to start a YouTube video, move to the half-way point in the timeline, then initiate hibernate and power back on. If you’re back at the same place in the video, your setup is working well.

– 0hana

Why not use a swap file instead? It allows you to resize it a lot more easily, and the encryption part is handled together with the main fs.

Plus, your concern of swap being too full to hibernate can be remedied much more easily by using a slightly bigger swap file/partition. By the time you’ve used enough of a swap file 1.5x the size of memory to not be able to hibernate anymore, your system has bogged down to the point of unusability anyway.

@Adrian_Joachim
Regarding a swap file vs. a swap partition, I feel they are functionally the same in this example. Indeed, resizing is easier with a file, but if it’s for a dedicated hibernation partition, the need to resize is probably infrequent at worst, and I figured those familiar with swap files could probably translate the guide to their use case, as all they really need (off the top of my head) is the UUID if it is contiguous.

That said, depending on how the swap file is created, and the state of the NVM device relative to the file system, it is my understanding that its existence may be fragmented across the device, slowing access time vs. a partition which is effectively contiguous. I can’t remember if that’s the case—do you happen to know?

Regarding handling encryption with the main file system, that’s already handled in the current scheme as the swap partition is an lvm subdivision of the encrypted partition, so there shouldn’t be a difference, no?

On allocating a bigger swap, we kind of circle back to the initial reason for swap’s existence: not enough RAM. It’s trying to stretch a limited resource. Swap space is certainly cheaper and easier to adjust after buying your parts than just getting more RAM, but if you’re really running out of RAM all the time, a bigger swap isn’t a great solution. As you said, it’ll bog the system down to the point of unusability, and we still have the hibernate issue: if the sum of in-use swap and RAM exceeds the total swap space, hibernate will fail to trigger.
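
That condition can be written down as a quick back-of-the-envelope check. This is a deliberately pessimistic sketch: it ignores the compression of the hibernation image and any kernel-side limits on image size, and the function name hibernation_fits is hypothetical.

```shell
# Succeed (exit status 0) if used RAM plus used swap fits in total swap.
# All three arguments are in kB; image compression is ignored.
hibernation_fits() {
  used_ram_kb=$1
  used_swap_kb=$2
  swap_total_kb=$3
  [ $((used_ram_kb + used_swap_kb)) -le "$swap_total_kb" ]
}

# Example: 30 GiB of RAM in use plus 4 GiB already swapped vs. a 32 GiB swap.
if hibernation_fits 31457280 4194304 33554432; then
  echo "fits"
else
  echo "does not fit"
fi
```

With the swap kept off until hibernation, the used-swap term is always 0, so the check reduces to used RAM versus the size of the partition.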

I think it’s more a matter of priorities. As mentioned toward the beginning of the admittedly long :sweat_smile: exposition, I’ve had a situation where I lost my session. I’m not hurting for RAM on my machine (32GB), but I will use it. What matters more to me is keeping the session intact when I’m interrupted and need to pause my work.

I’ve seen questions about “theoretically” guaranteeing something like hibernate before, and until it was actually a problem for me, I didn’t think much of it. vm.swappiness=0 seemed like enough. But then it happened and it was really frustrating, so I tried to come up with a solution.

I think though, for most people, if you want the simplest, most straightforward solution to an infrequent oom issue, then yes, exactly what you said—just making a bigger swap file is much easier.

This is a pretty niche solution.
But I enjoyed figuring it out :slightly_smiling_face:

I’d also like to point out that, with a solution like this, you can do it with either 2 swap partitions, a swap partition and a swap file, or 2 swap files, and guarantee that one of them works for hibernation of the combined sum of RAM and swap memory.

Only the position of the start matters; the rest is handled by the file system. The start can move in some very rare cases though.

In your case yes, but LVM isn’t really a standard use case, and with a swap partition it’s easy to end up with unencrypted swap, which is really bad (it’s pretty much the only argument for using no swap imo).

And there is the fundamental misunderstanding about swap. It isn’t about not having enough memory but about using it more efficiently. Even huge systems with terabytes of memory still use swap. In normal use only unused stuff goes into swap, which is good, so that RAM can actually be used for useful stuff or caching. Once you run out of RAM, actively used stuff has to be evicted into swap, and that’s when stuff bogs down. A swappiness of 0 isn’t really recommended, but setting it somewhat low helps keep the drive idle if you have one that takes a long time to get back to sleep.

There are multiple talks about it by people that understand that bit better than me but the big point is swap isn’t just “emergency ram”.

Having fun is definitely a good justification for doing stuff like this.

That is a good point though

The start moving relative to a main partition is a concern for reliability, but yeah, that’s pretty rare. If I understand it correctly, this is when the file-system decides to move things around, but the file entries that reference a block location informing where to hibernate to are not updated (and how would the user know anyway if they didn’t manually check for it?), thus leading to a headache :melting_face:

That completely slipped my mind :flushed:
I’m so used to setting up encryption (including swap) I didn’t even think of it. Yes, especially for a beginner using an installer program who may not realize, I think you’re right, it would be better for them to use a swap file on the main encrypted partition for hibernation–less chance of serious error: it would either be encrypted hibernation or no hibernation at all, the most sane default for an encrypted machine. Very good point :slightly_smiling_face:

You’re absolutely right, swap is not just “emergency ram” :smile: I am familiar with the memory hierarchy and virtual memory pagers.

But humor me for a moment–
suppose you have a choice between a laptop with 32GB of RAM and 32GB of swap,
or a laptop with 64 GB of RAM and 0 swap, but both can hibernate using a secondary 64GB reserved-for-hibernation partition/file scheme like we’ve discussed. All else is equal.
Which would you pick?

Afaik a defragmentation (which you should not do on an SSD anyway, and you really should not have an HDD as the main drive you hibernate to anyway XD) is one of the only ways that would cause this. If you were really worried about this, you could update it before suspending, the same way you turn on your swap.

If you are doing a setup this involved, you can probably figure it out. Otherwise I would not recommend anything other than an oversized encrypted swap partition.

Well obviously the 64gb of ram, though I would pick 64gb of ram and 32gb of swap (or maybe 80 if I could afford it) over that XD. Once you fill up the 64gb with chrome tabs you’ll be glad the never touched stuff of some background programs is in swap so you can have some extra tabs before it bogs down (or oom crashes XD).

Good to know :slight_smile:

True. I meant that more to emphasize that unless you were looking for it, (and I think most people wouldn’t), if or when it did happen, it would cause a massive headache–it would be very frustrating. But, as you said, barring a defragmentation or similar event, it shouldn’t happen. I guess the notion of a swap partition (vs. swap file) makes a lot more sense on HDDs, where occasional defragmentation is more likely :thinking:

I hadn’t thought of it that way before: the system bogging down is a pretty visceral way to convey that memory limits are being reached–no need to look at a memory graph or free command, you feel it immediately, and the slow down might even buy you a little time to adapt to the situation :thinking:

That alone makes having some swap a little more appealing on a spec’d out system (say 80GB+ of RAM) :laughing: especially when combined with the peace of mind from some dedicated hibernation scheme

That’s it, you’ll notice and you can actually recover from it, unlike the OOM killer just Thanos-snapping random programs with unsaved progress away.

Imo the only good reason not to use swap is on an unencrypted system, because swapping to an unencrypted disk can put stuff on the disk that really, really should not be on there. (Or you are on a medium with ridiculously low write endurance, but then you should not hibernate either XD)

Given that the hibernated stuff gets compressed, even 80GB for 64GB is massive overkill but should definitely be enough for some pretty extreme edge cases.