Maxed out Minimax M2.5 on Framework Desktop

Hi, got it loaded :smiley:
Minimax M2.5 on Windows with Llama.cpp Vulkan edition

c:/llama.cpp.vk/llama-server.exe --host 0.0.0.0 --port 8123 --model .lmstudio\models\Unsloth\MiniMax-M2.5\MiniMax-M2.5-UD-Q3_K_XL-00001-of-00004.gguf -c 16384 --keep 1024 --no-mmap --flash-attn on --cache-type-k q4_0 --cache-type-v q4_0 -fit on --context-shift
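For readability, here is the same invocation rewritten with Unix-style line continuations (run it as a single line on Windows cmd); flag descriptions below are my reading of llama.cpp's server options and may vary between builds:

```shell
c:/llama.cpp.vk/llama-server.exe \
  --host 0.0.0.0 --port 8123 \
  --model ".lmstudio\models\Unsloth\MiniMax-M2.5\MiniMax-M2.5-UD-Q3_K_XL-00001-of-00004.gguf" \
  -c 16384 \
  --keep 1024 \
  --no-mmap \
  --flash-attn on \
  --cache-type-k q4_0 \
  --cache-type-v q4_0 \
  -fit on \
  --context-shift
```

As I understand the flags: `-c 16384` sets the 16k context window, `--keep 1024` preserves the first 1024 tokens when the context shifts, `--no-mmap` loads the model fully into memory instead of memory-mapping it, `--cache-type-k`/`--cache-type-v q4_0` quantize the KV cache to save VRAM, and `--context-shift` drops old tokens instead of erroring when the window fills. `-fit` is a newer option, so check `--help` on your build.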

It's the Unsloth UD-Q3_K_XL version with 16k context, and I got 22 t/s in Open WebUI (Python version on Windows). VRAM usage is 99 GB and RAM usage is 25 GB.
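If you want to sanity-check the t/s number outside Open WebUI: llama-server's `/completion` responses include a `timings` object, and generation speed is just generated tokens over generation time. A minimal sketch (the field names match llama.cpp's response format as I know it; verify against your build, and the sample numbers are made up for illustration):

```python
def tokens_per_second(timings: dict) -> float:
    """Compute generation speed from a llama-server-style 'timings' object."""
    return timings["predicted_n"] / (timings["predicted_ms"] / 1000.0)

# Hypothetical timings: 352 tokens generated in 16 seconds.
sample = {"predicted_n": 352, "predicted_ms": 16000.0}
print(round(tokens_per_second(sample), 1))  # 22.0
```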

How does it perform and what use cases do you have for it? Thanks!

It performs significantly better than I expected… but since it's only Q3, you have to expect some errors or even hallucinations. Generation also slows down considerably once the context reaches around 42k… I think 16k is realistic, though… it'll be enough for "technical discussions," but for coding I'll definitely use Qwen3 Coder Next at q6_k_m… I actually just wanted to test the maximum possible on the Strix Halo.
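One reason long contexts hurt here: the KV cache grows linearly with context length, even when quantized to q4_0. A back-of-envelope estimator (the model dimensions below are placeholders for illustration, NOT MiniMax M2.5's actual config; q4_0's size comes from its 18-byte block holding 32 values):

```python
def kv_cache_bytes(n_layer: int, n_ctx: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: float) -> float:
    """Rough KV-cache size: one K and one V tensor per layer."""
    return 2 * n_layer * n_ctx * n_kv_heads * head_dim * bytes_per_elem

F16 = 2.0        # 2 bytes per element
Q4_0 = 18 / 32   # q4_0 packs 32 values into an 18-byte block

# Hypothetical model dimensions, chosen only to show the scaling:
dims = dict(n_layer=60, n_kv_heads=8, head_dim=128)

for ctx in (16_384, 42_000):
    f16 = kv_cache_bytes(n_ctx=ctx, bytes_per_elem=F16, **dims)
    q4 = kv_cache_bytes(n_ctx=ctx, bytes_per_elem=Q4_0, **dims)
    print(f"{ctx:>6} ctx: f16 {f16 / 2**30:.2f} GiB, q4_0 {q4 / 2**30:.2f} GiB")
```

The point is the ratio, not the absolute numbers: q4_0 cuts the cache to roughly 28% of f16, and going from 16k to 42k context still multiplies it by ~2.6x.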

Did you have to manually set the VRAM allocation for this to work? So far I've had no luck getting it to load because llama.cpp keeps maxing out at 64 GB of VRAM.

Yes. Normally I have it at 64 GB; for this I had to set it to 96 GB. I was also using the ROCm version of the llama.cpp binaries before and have now downloaded the Vulkan release… it took some time to get it running :wink: