Llama.cpp/vLLM Toolboxes for LLM inference on Strix Halo

Lars_Urban · September 28, 2025, 7:53am

Thank you very much @kyuz0 !

I'm missing something, because I can't see the full memory.
The setup was done with:

sudo grubby --update-kernel=ALL --args='amd_iommu=off amdgpu.gttsize=131072 ttm.pages_limit=33554432'

I also reserved just 512 MB of dedicated VRAM in the BIOS.
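
For reference, the two kernel arguments above can be converted to GiB with a quick calculation. This is only a sketch, and it rests on assumptions not stated in the post: that amdgpu.gttsize is given in MiB, that ttm.pages_limit counts pages, and that the page size is 4 KiB (the usual x86_64 default).

```python
# Assumptions (not from the post): amdgpu.gttsize is in MiB,
# ttm.pages_limit is a page count, and pages are 4 KiB.
PAGE_SIZE_BYTES = 4096
MIB = 1024 ** 2
GIB = 1024 ** 3

gttsize_mib = 131072      # amdgpu.gttsize from the grubby command
pages_limit = 33554432    # ttm.pages_limit from the grubby command

gtt_gib = gttsize_mib * MIB / GIB
ttm_gib = pages_limit * PAGE_SIZE_BYTES / GIB

print(f"amdgpu.gttsize  -> {gtt_gib:.0f} GiB")
print(f"ttm.pages_limit -> {ttm_gib:.0f} GiB")
```

Under those assumptions both arguments ask for the same limit, so they agree with each other.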

But running:

llama-cli --list-devices

shows only 86GB:

maintenance@fedora:~$ toolbox enter llama-vulkan-radv
⬢ [maintenance@toolbx ~]$ llama-cli --list-devices
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
Available devices:
Vulkan0: Radeon 8060S Graphics (RADV GFX1151) (87722 MiB, 86599 MiB free)
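
To put the reported number next to the configured one, the device line can be parsed and compared against the 131072 set via amdgpu.gttsize. This is a rough sketch: the regular expression is written only for the line format shown above, and treating gttsize as MiB is an assumption.

```python
import re

# Device line copied from the llama-cli output above.
line = "Vulkan0: Radeon 8060S Graphics (RADV GFX1151) (87722 MiB, 86599 MiB free)"

# Pull out the total and free sizes; the pattern matches only this line format.
match = re.search(r"\((\d+) MiB, (\d+) MiB free\)", line)
if match is None:
    raise ValueError("unexpected device line format")
total_mib, free_mib = map(int, match.groups())

configured_mib = 131072  # amdgpu.gttsize, assumed to be in MiB

total_gib = total_mib / 1024
share = total_mib / configured_mib

print(f"reported total: {total_mib} MiB ({total_gib:.2f} GiB)")
print(f"reported free:  {free_mib} MiB")
print(f"share of configured gttsize: {share:.1%}")
```

The reported total comes out at roughly two thirds of the configured value. The calculation only quantifies the gap; it does not explain where the rest went.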

amdgpu_top:

So a fair bit is missing …
I see the same amount in LM Studio (84 GB, for example),

which means it has nothing to do with the toolbox setup or with LM Studio.
Is this Fedora 42 specific? O.o

KR Lars
