Ollama with GPU on Linux (Framework 13, AMD Ryzen HX 370)

Hi,

just want to share how to get Ollama running on the iGPU.

Used container

The script uses the container image from the linked repository. I originally had a script quite similar to the one in the Reddit post, but it used version 6.4.1 of the image instead of the latest one; don't do that.
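That is, pull the latest tag (image name taken from the run command below):

podman pull ghcr.io/rjmalagon/ollama-linux-amd-apu:latest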

Original Reddit post

I struggled to get Ollama running with GPU support on my new FW13 with the AMD Ryzen HX 370, until I stumbled over a Reddit post describing how to run Ollama on AMD APUs.

Missing permissions on Bluefin

I tried to get it to work on Bluefin with Podman, but the container did not have access to the /dev/dri folder. I had to add one more option:

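# HSA_OVERRIDE_GFX_VERSION=11.0.0 makes ROCm treat the iGPU as gfx1100,
# a target the ROCm builds actually support.
# /dev/kfd and /dev/dri are the devices that expose the GPU to the container.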
podman run --name ollama \
  -v /var/home/martind/OllamaModels/:/root/.ollama \
  -e OLLAMA_FLASH_ATTENTION=true \
  -e HSA_OVERRIDE_GFX_VERSION="11.0.0" \
  -e OLLAMA_KV_CACHE_TYPE="q8_0" \
  -e OLLAMA_DEBUG=0 \
  --device /dev/kfd \
  --device /dev/dri \
  --security-opt label=type:container_runtime_t \
  -p 127.0.0.1:11434:11434 \
  ghcr.io/rjmalagon/ollama-linux-amd-apu:latest \
  serve
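Once the container is up, a quick smoke test against it (the model name is just an example, any small model works):

podman exec -it ollama ollama pull gemma3:4b
podman exec -it ollama ollama run gemma3:4b "Say hello"

While it responds, something like radeontop or your system monitor's GPU graph should show the iGPU busy.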

Compared to the Reddit post, I had to add this: --security-opt label=type:container_runtime_t

([Fedora Silverblue] Ollama with ROCM - failed to check permission on /dev/kfd: open /dev/kfd: invalid argument - #10 by garrett - Fedora Discussion)
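If you hit the same "failed to check permission on /dev/kfd" error, the SELinux denial behind it should be visible in the audit log (assuming auditd is running, as it is on stock Fedora):

sudo ausearch -m avc -ts recent | grep kfd

The blunter alternative is --security-opt label=disable (used in the ComfyUI command further down), which turns off SELinux confinement for that container entirely; label=type:container_runtime_t keeps confinement but uses a type that is allowed to open /dev/kfd.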

Which Linux distro are you using? Fedora 41 with bluefin-dx:gts

Which kernel are you using? 6.14.6

Which BIOS version are you using? 3.0.3

Which Framework Laptop 13 model are you using? AMD Ryzen™ AI 300 Series

Yep, this works for me on Bazzite. I'm the one who posted the quadlet suggestion in the Reddit thread too (a rough version is sketched below).

Now, if we can just find a container or something to get Stable Diffusion and text-to-video (or image-to-video) working as well, these "AI" 300 series chips will actually do "AI" (or at least LLMs).
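For anyone who'd rather run it as a quadlet, here is a rough translation of the podman command above into a unit file (a sketch, untested here; the volume uses the %h home specifier, so adapt it to wherever your models live):

# ~/.config/containers/systemd/ollama.container
[Unit]
Description=Ollama on the AMD iGPU via ROCm

[Container]
ContainerName=ollama
Image=ghcr.io/rjmalagon/ollama-linux-amd-apu:latest
Exec=serve
Environment=OLLAMA_FLASH_ATTENTION=true
Environment=HSA_OVERRIDE_GFX_VERSION=11.0.0
Environment=OLLAMA_KV_CACHE_TYPE=q8_0
AddDevice=/dev/kfd
AddDevice=/dev/dri
SecurityLabelType=container_runtime_t
PublishPort=127.0.0.1:11434:11434
Volume=%h/OllamaModels:/root/.ollama

[Install]
WantedBy=default.target

After a systemctl --user daemon-reload, it should come up with systemctl --user start ollama.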


What about NPU support?


I am not yet familiar with Stable Diffusion et al., but that would definitely be nice!

How is your experience using the GPU so far? Actually, it seems like my CPU is faster for inference: I am using gemma3:4b and get around 21 t/s on the CPU, while the GPU only reaches around 13 t/s. Both are utilized around 70% when used individually.

Btw, in my BIOS I have set the iGPU memory allocation to Medium, which comes out to around 16 GB for me.
Not sure if that is relevant.
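In case anyone wants to reproduce numbers like these: ollama run --verbose prints timing stats including the eval rate in tokens per second, and GPU offload can be disabled per request via the API's num_gpu option (model and prompt are just examples):

# GPU (default); the eval rate is printed after the response
podman exec -it ollama ollama run gemma3:4b --verbose "Summarize what ROCm is."

# CPU-only comparison: offload zero layers to the GPU for this request
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "gemma3:4b",
  "prompt": "Summarize what ROCm is.",
  "options": { "num_gpu": 0 }
}'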

Haven't benchmarked it CPU vs GPU, but performance was acceptable for me running through that podman container. Like you, I also have my iGPU memory set to Medium, and with my 64 GB of system memory, it allocated 16 GB of it as dedicated VRAM.
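If you want to check that carve-out from Linux rather than the BIOS, the amdgpu driver reports it in sysfs (the card index may differ per system):

# Dedicated VRAM as seen by the amdgpu driver, in bytes
cat /sys/class/drm/card*/device/mem_info_vram_total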

I DID get ComfyUI to run and go through the GPU with this container: ComfyUI-Docker/rocm/README.adoc at main · YanWenKun/ComfyUI-Docker · GitHub.

After building it according to the instructions in the repo, I ran it with:

podman run -it --rm \
  --name comfyui-rocm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add=video --ipc=host --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  --security-opt label=disable \
  -p 8188:8188 \
  -v "$(pwd)"/storage:/root \
  -e CLI_ARGS="" \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.0 \
  yanwk/comfyui-boot:rocm

It did spit out a gibberish image when I downloaded one of the models from the in-app model browser, but it ran through the GPU the whole time, so that's an improvement. It might have been due to my render settings, though; it's possible that with some tweaking I'd get a good output.

If other folks want to experiment with this and see if they can get it to output a non-gibberish image using the GPU, that'd be cool, since I won't have time to explore more for a few days at least.
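Before a long render, it may also be worth confirming the container actually sees the iGPU. Assuming the ROCm tools are present in the image (they usually are in ROCm-based images), from a second terminal:

podman exec comfyui-rocm rocminfo | grep -i gfx
podman exec comfyui-rocm rocm-smi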


No problems running on Bazzite and using a Fedora distrobox for ROCm. I'd suggest that distrobox is going to be the easiest way to do this without needing to fiddle with anything further.
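For reference, the distrobox route might look roughly like this (a sketch; the box name and package set are assumptions, and Fedora's ROCm packaging can differ between releases):

# Create and enter a Fedora box; distrobox shares host devices like /dev/dri and /dev/kfd
distrobox create --name rocm-box --image fedora:41
distrobox enter rocm-box

# Inside the box: install the ROCm tools and check that the iGPU shows up
sudo dnf install rocminfo rocm-smi
rocminfo | grep -i gfx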