FW16 and Linux hugepages

Which Linux distro are you using?
Ubuntu 24.04.2
Which release version?
24.04.2
Which kernel are you using?
6.13.0

Which BIOS version are you using?
03.05

Which Framework Laptop 16 model are you using? (AMD Ryzen™ 7040 Series)
FW16 AMD

Hi,

This is just a word of warning when experimenting with Linux huge pages.
I have 64GB RAM in a FW16.
Setting the following causes Linux to fail to boot the next time you restart:
/etc/sysctl.d/10-hugepages.conf:
vm.nr_hugepages = 102400

Recovering from this problem is difficult, which is why I am raising it as a warning to others.
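For anyone who wants to check what the kernel actually allocated on a running system, something like this works (the numbers you see will obviously differ from mine):

grep -i huge /proc/meminfo
sysctl vm.nr_hugepages

HugePages_Total and Hugepagesize show how many pages were really reserved and how big each one is.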

That seems like a massive number. What website was telling you to do that and not test it before making it permanent?

I can’t imagine it being hard to fix, though…just boot a live USB, mount the drive, and remove that file.
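Roughly something like this from the live session (the device name is just an example, adjust it to whatever your root partition actually is):

sudo mount /dev/nvme0n1p2 /mnt
sudo rm /mnt/etc/sysctl.d/10-hugepages.conf
sudo umount /mnt

Then reboot into the installed system as normal.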


Yes, it is a huge number.
The problem is that on a running system, when one tries to set the hugepages to 102400, a smaller value of 27973 is actually set instead of the 102400 one asked for. It does this silently, without reporting any errors, so it goes unnoticed.
So one might happily think one has managed to set 102400 and it works OK.
But one then sets 102400 as the permanent value and it fails to reboot.
The failed reboot is actually just a very, very slow boot, so it’s not immediately apparent why it is failing to boot.
The recovery is difficult because it’s not obvious which setting caused the problem.
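To see the silent clamp in action (27973 is just what I got on my machine, it will vary):

sudo sysctl -w vm.nr_hugepages=102400
vm.nr_hugepages = 102400
cat /proc/sys/vm/nr_hugepages
27973

sysctl happily echoes back the value you asked for, but reading it back shows what the kernel actually allocated. No error or warning anywhere, which is exactly how it goes unnoticed.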

You asked it to reserve 102400 huge pages, which at the default 2 MB hugepage size works out to about 200 GB, when you only have 64 GB of RAM. This simply is not possible. At best, the kernel reserves what it can and squeezes everything else out into swap. At worst, it will fail to boot and lock up or drop to an emergency shell.
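The arithmetic, assuming the default 2 MiB hugepage size on x86-64 (grep Hugepagesize /proc/meminfo will confirm what yours is):

102400 pages x 2 MiB/page = 204800 MiB ≈ 200 GiB requested
64 GiB installed = 65536 MiB

If you actually wanted, say, 16 GiB of huge pages, the value to set would be 16 GiB / 2 MiB = 8192.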

Huge pages can only be created from contiguous/unfragmented memory (kinda the whole point). So, when run after boot is complete with -w, sysctl will allocate as many huge pages as the kernel can find free, contiguous 2 MB chunks for at that moment. In your example it only managed 27,973 pages, roughly 55 GB at 2 MB apiece, before running out of memory it could grab. Multiple trials will yield seemingly random values for the same reason; the sooner after boot you manually run sysctl -w vm.nr_hugepages=102400, the more unfragmented memory it will find and the more huge pages it will be able to reserve. However, you will likely also notice a drop in performance for anything that can’t explicitly take advantage of the huge pages you just created (there is HugeTLBfs, but I digress), and you have now left only around 9-10 GB of your 64 GB [in your example] for your other programs to utilize, increasing the chance of swap usage and slower performance.
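If you want to watch the fragmentation side of this yourself, something like the following works (purely illustrative, the numbers change from run to run and machine to machine):

cat /proc/buddyinfo
sudo sysctl -w vm.nr_hugepages=102400
grep HugePages_Total /proc/meminfo

/proc/buddyinfo lists the free blocks per order, and a 2 MB huge page needs an order-9 block (512 contiguous 4 KB pages). Run the three commands right after a reboot and again after the machine has been busy for a while and you will get very different totals.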

However, during the initial boot process the picture is different: the sysctl.d settings are applied very early, when almost all of the RAM is still free and unfragmented, so the kernel gets much further towards granting your 200 GB request and reserves close to all 64 GB. That leaves next to nothing for the rest of the boot, so it should eventually fail out to an emergency shell after a while, having been unable to complete a normal boot with that setting in place. Unless you happen to have a large swap file/partition everything else can creep into, at which point you will get a super slow boot…eventually.

I have a hunch about where you got that 102400 value from, btw, as it was originally discussed with regard to running large Oracle DBs over on their forum. :wink: Unless you are running something that relies on large datasets and could benefit from a little less fragmentation in RAM from a performance standpoint (like a huge Oracle DB or an LLM running on the APU), you probably don’t need to go messing with huge pages in the first place. I’ve only seen them yield any real-world performance benefit on servers/hypervisors with a TB+ of RAM.

I hope that clears things up!
