Just want to share how to get Ollama running on the iGPU.
Used container
The script uses a container provided by the linked repository. I had quite a similar script to the one from the reddit post, but mine used version 6.4.1 instead of the latest, so don't just copy mine.
Original reddit post
I struggled to get ollama with GPU support working on my new FW13 with the AMD Ryzen AI 9 HX 370, and then stumbled over the reddit post mentioned above.
Missing permissions on bluefin
I tried to get it to work on Bluefin with podman, but the container did not have access to the /dev/dri folder, so I had to add one more device permission.
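For anyone running into the same thing, here is a minimal sketch of the flags I mean, using the stock ollama/ollama:rocm image purely for illustration (the container from the linked repository needs the same device flags):

# pass both GPU device nodes into the container: /dev/kfd (ROCm compute) and /dev/dri (render nodes)
# --group-add keep-groups keeps the host's render/video group membership in the rootless container
# --security-opt label=disable stops SELinux from blocking the device nodes on Bluefin
podman run -d --name ollama \
  --device /dev/kfd \
  --device /dev/dri \
  --group-add keep-groups \
  --security-opt label=disable \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  docker.io/ollama/ollama:rocm

In a quadlet the same thing should be two AddDevice= lines plus SecurityLabelDisable=true and PodmanArgs=--group-add keep-groups, if I read the docs right.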
Yep this works for me on Bazzite. I’m the one who posted the quadlet suggestion in the reddit thread too.
Now, if we can just find a container or something to get stable diffusion and text-to-video (or image-to-video) working as well, these “AI” 300 series chips will actually do “AI” (or at least LLMs)
I am not yet familiar with stable diffusion et al., but that would definitely be nice!
How is your experience using the GPU so far? Actually, it seems like my CPU is faster for inference. I am using gemma3:4b and get around 21 t/s on the CPU, while the GPU only reaches around 13 t/s. Each is utilized at around 70% when used on its own.
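For reference, a quick way to get comparable numbers, assuming a recent stock ollama install:

# prints prompt eval rate and eval rate (t/s) after the answer
ollama run gemma3:4b --verbose "Explain what an iGPU is in one paragraph."
# shows whether the loaded model sits on the GPU, the CPU, or is split between both
ollama ps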
Btw., in my BIOS I have set the iGPU memory allocation to medium, which is around 16 GB for me.
Not sure if that is relevant.
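If you want to double-check what the driver actually sees, a hedged suggestion using the standard ROCm tools (not something from the guide above):

# total and used VRAM as seen by ROCm; should match the BIOS carve-out (about 16 GB here)
rocm-smi --showmeminfo vram
# GTT, i.e. how much regular system memory the iGPU can additionally borrow
rocm-smi --showmeminfo gtt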
Haven’t benchmarked it on CPU vs GPU, but the performance was acceptable for me running through that podman container. Like you, I also have my iGPU memory set to medium, and with my 64 GB of system memory it allocated 16 GB of it as dedicated VRAM.
It did spit out a gibberish image when I downloaded one of the models from the in-app model browser, but it ran on the GPU the whole time, so that’s an improvement. It might have been due to my render settings, though. It’s possible that if I tweaked things I’d get a good output.
If other folks want to experiment with this and see if they can get it to output a non-gibberish image using the GPU that’d be cool, since I won’t have time to explore more for a few days at least.
No problems running on Bazzite with a Fedora distrobox for ROCm. I suspect distrobox is going to be the easiest way to do this without needing to fiddle with anything further.
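A minimal sketch of that setup, with the box name and package list being my own guesses (ROCm package names vary a bit between Fedora releases):

# create and enter a Fedora box; distrobox shares /dev/dri and /dev/kfd with the host by default
distrobox create --name rocm-box --image registry.fedoraproject.org/fedora-toolbox:42
distrobox enter rocm-box
# inside the box: install the basic ROCm userspace tools and check that the iGPU shows up
sudo dnf install -y rocminfo rocm-smi
rocminfo | grep -i gfx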
I tried both approaches, but both times, when I run the tutorial model in ComfyUI, its process is killed with:
0%| | 0/20 [00:00<?, ?it/s]
:0:rocdevice.cpp :2993: 6067000492 us:
Callback: Queue 0x7f57f0600000 aborting with error :
HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f
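From what I understand, HSA_STATUS_ERROR_INVALID_ISA means the ROCm build being used has no kernels for this iGPU’s gfx target. A first thing to check (untested on my side; the override value is the one mentioned for Lemonade further down):

# see which gfx target the ROCm runtime reports for the iGPU
rocminfo | grep -i gfx
# make ROCm treat the iGPU as a supported gfx11 target before launching ComfyUI
export HSA_OVERRIDE_GFX_VERSION=11.0.0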
Did you try the tutorial model? (The first one that is visible when opening ComfyUI for the first time.)
I am wondering if it has anything to do with the model I had to download. I am new to stable diffusion models and not sure yet what is recommended.
Lemonade SDK
Btw., I also tried the Lemonade SDK via an Ubuntu toolbox, and it runs smoothly. Unfortunately I was not able to benchmark t/s via Open WebUI, but I had the feeling the LLM was a tiny bit faster.
I set up an ubuntu-toolbox and followed the instructions on this site (Linux llama.cpp):
In my toolbox terminal I installed Miniforge as described, used the commands on the website, and afterwards started the server via:
lemonade-server-dev serve
(If that does not work, consider exporting this variable: export HSA_OVERRIDE_GFX_VERSION=11.0.0)
Then I could access it via localhost:8000, I think, and downloaded a model. It is also easy to integrate into Open WebUI via the OpenAI API connector.
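Condensed from memory, the whole thing was roughly this (the toolbox image name is my assumption, and the actual Miniforge/Lemonade install commands are the ones from the linked page):

# create and enter an Ubuntu toolbox on the immutable host
toolbox create --image quay.io/toolbx/ubuntu-toolbox:24.04 ubuntu-toolbox
toolbox enter ubuntu-toolbox
# inside: install Miniforge and Lemonade as described on the linked page, then
export HSA_OVERRIDE_GFX_VERSION=11.0.0   # only if the server does not pick up the iGPU
lemonade-server-dev serve                # afterwards reachable at localhost:8000 (per above)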
I am just not sure if it utilizes the NPU, because I am missing a tool to monitor it.
OK, to get rid of my issue with stable diffusion I have to use the argument --force-fp32, since float16 does not seem to work. When running in the toolbox I have to set it via:
python main.py --force-fp32
I guess for the podman version it would be via this line:
-e CLI_ARGS="--force-fp32" \
but I haven't tried that yet.
I just wanted to say thank you for your post. I used your container to get Ollama running. I have Framework Desktops on order, but I got this running on a non-Framework machine (also a Strix Halo 395). I’d ideally like to get Ollama (or vLLM) running across 4 or 6 Strix Halo machines, but that’s another problem.