AMD Strix Halo Llama.cpp Installation Guide for Fedora 42

Germain_Perez · November 10, 2025, 12:51am

Hi. Thanks for the guide! I got lost in the forest on my first try but got Qwen3 Q8 XL running on my 64GB desktop on the second. I wrote down my steps (some of which are unique to my setup) and want to share here:

Install from USB boot loader, select in BIOS. See Framework website for Fedora 43 install ( Fedora 43 Installation on the Framework Desktop - Framework Guides )
From ( linux-docs/framework-desktop/Fedora-all.md at main · FrameworkComputer/linux-docs · GitHub ) : $ sudo dnf upgrade (then reboot)
Install llama.cpp on Fedora ( AMD Strix Halo Llama.cpp Installation Guide for Fedora 42 )
1. $ sudo grubby --update-kernel=ALL --args='amd_iommu=off amdgpu.gttsize=49152 ttm.pages_limit=12288000' (for 64GB RAM; for other sizes, search Google for the ttm.pages_limit calculation)
  1. Verify: $ sudo grubby --info=ALL | grep args
  2. Reboot
  3. Verify after reboot: $ cat /proc/cmdline
2. Leave the BIOS setting for iGPU allocated memory at its default, the 512MB (0.5 GB) minimum
3. Check that toolbox is installed: $ toolbox --version
4. Add user to GPU groups:
  1. $ sudo usermod -aG video $USER
  2. $ sudo usermod -aG render $USER
5. Choose and create a toolbox:
  1. Create some boxed backends, e.g.:
    1. $ toolbox create llama-rocm-6.4.4-rocwmma --image docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-6.4.4-rocwmma -- --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined
    2. $ toolbox create llama-vulkan-radv --image docker.io/kyuz0/amd-strix-halo-toolboxes:vulkan-radv -- --device /dev/dri --group-add video --security-opt seccomp=unconfined
6. Enter the toolbox: $ toolbox enter llama-rocm-6.4.4-rocwmma
  1. Inside the toolbox, verify: $ llama-cli --list-devices
  2. Type 'exit' to leave the toolbox
7. Download a model:
  1. Create a models dir: $ mkdir -p ~/Development/ai/models
  2. Install pip: $ sudo dnf install -y python3-pip
  3. Install huggingface-cli: $ pip install --user "huggingface_hub[hf_transfer]"
  4. Make sure ~/.local/bin is in your PATH:
    1. $ echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
    2. $ source ~/.bashrc
  5. Actually download the model, e.g.:
    1. $ HF_HUB_ENABLE_HF_TRANSFER=0 huggingface-cli download unsloth/Qwen3-30B-A3B-GGUF Qwen3-30B-A3B-UD-Q8_K_XL.gguf --local-dir ~/Development/ai/models/qwen3-30B-A3B-Q8_K_XL/
8. Run the model
  1. $ toolbox enter llama-rocm-6.4.4-rocwmma
  2. $ llama-cli --no-mmap -ngl 999 -m ~/Development/ai/models/qwen3-30B-A3B-Q8_K_XL/Qwen3-30B-A3B-UD-Q8_K_XL.gguf
  3. Type 'exit' to leave the toolbox when done
9. Should you want to return memory allocations to their defaults (e.g. to play games or use other memory-intensive apps):
  1. $ sudo grubby --update-kernel=ALL --remove-args='amd_iommu=off amdgpu.gttsize ttm.pages_limit'
  2. Then reboot. To go back to using AI models, run the grubby --args command from step 1 again and reboot. You can go back and forth.
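
On the ttm.pages_limit calc mentioned in step 1: the value is a count of memory pages, while amdgpu.gttsize is in MiB. Below is a rough sketch of the arithmetic, assuming 4 KiB pages (check with `getconf PAGESIZE`). The function names are made up for illustration and are not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from any package): convert between a GTT size
# in MiB and a page count. Assumes 4096-byte pages unless a second
# argument overrides it; verify with: getconf PAGESIZE

mib_to_pages() {
    local mib=$1
    local page_bytes=${2:-4096}
    echo $(( mib * 1024 * 1024 / page_bytes ))
}

pages_to_mib() {
    local pages=$1
    local page_bytes=${2:-4096}
    echo $(( pages * page_bytes / 1024 / 1024 ))
}

mib_to_pages 49152      # 48 GiB in MiB -> prints 12582912
pages_to_mib 12288000   # value used in step 1 -> prints 48000
```

So the 12288000 used in step 1 covers 48000 MiB, a little under the 49152 MiB given to amdgpu.gttsize. An exact 48 GiB would be 12582912 pages. Either way, adjust both numbers together for a different RAM size.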

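The "verify after reboot" check in step 1 can also be scripted rather than eyeballed. This is only a sketch: `missing_kernel_args` is a hypothetical helper name, and the argument values are the ones from step 1 for 64GB, so swap in your own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each expected kernel argument that is NOT
# present in the given kernel command line string (one per line).
missing_kernel_args() {
    local cmdline=$1; shift
    local arg
    for arg in "$@"; do
        # Pad with spaces so only whole arguments match.
        case " $cmdline " in
            *" $arg "*) ;;        # present, nothing to report
            *) echo "$arg" ;;     # absent
        esac
    done
}

# Check the running system (only meaningful after the reboot in step 1).
if [ -r /proc/cmdline ]; then
    missing=$(missing_kernel_args "$(cat /proc/cmdline)" \
        amd_iommu=off amdgpu.gttsize=49152 ttm.pages_limit=12288000)
    if [ -z "$missing" ]; then
        echo "all kernel arguments present"
    else
        echo "missing: $missing"
    fi
fi
```

After step 9 (removing the args) the same script should list all three as missing, which is a quick way to tell which mode the machine is booted in.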