[GUIDE] Use NPU (XDNA2) with Arch Linux and FastFlowLM!

GreyXor · February 25, 2026, 10:10am

Good News, Everyone!
We can now use the NPU for LLM inference with FastFlowLM. Here’s my little guide to get it working.

To use the NPU on Arch Linux you need some patches that will soon be merged into linux, xrt-plugin-amdxdna, and xrt.

You need to:

1. Build the linux-git kernel package with patches
~~2. Build xrt and xrt-plugin-amdxdna with patches~~
~~3. Compile FastFlowLM or use my aur package~~

1. Build linux-git with patches
Using the linux-git AUR package, you have to change the source URL:

$_srcname::git+https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux

becomes

$_srcname::git+https://gitlab.freedesktop.org/drm/misc/kernel/#branch=drm-misc-fixes

Then makepkg -si
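
The URL change above could be scripted roughly as follows. This is only a sketch: `switch_kernel_source` is a hypothetical helper name, and it assumes the PKGBUILD still contains the torvalds URL exactly as quoted above and that GNU sed (`sed -i`) is available, as it is on Arch.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): rewrite the kernel source URL
# in a linux-git PKGBUILD so it points at the drm-misc-fixes branch.
switch_kernel_source() {
    pkgbuild="$1"
    old='git+https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux'
    new='git+https://gitlab.freedesktop.org/drm/misc/kernel/#branch=drm-misc-fixes'
    # '|' is used as the sed delimiter so the slashes in the URLs need no escaping.
    # The bracket expression also swallows any suffix (e.g. an existing #branch=...)
    # up to the closing quote, space, or parenthesis.
    sed -i "s|${old}[^\"' )]*|${new}|" "$pkgbuild"
}

# Possible usage (needs network access, so it is left commented out):
#   git clone https://aur.archlinux.org/linux-git.git && cd linux-git
#   switch_kernel_source PKGBUILD
#   makepkg -si
```

Checking the result with `git diff` before running makepkg is a good idea, since the PKGBUILD may have changed since this guide was written.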

~~2. Build xrt and xrt-plugin-amdxdna with patches~~

~~ https://gitlab.archlinux.org/superm1/xrt/-/tree/update-version~~
~~https://gitlab.archlinux.org/superm1/xrt-plugin-amdxdna/-/tree/update-version~~

~~On each repo you have to execute makepkg -si~~

~~3. Compile FastFlowLM~~
~~Follow this link to build FastFlowLM~~
~~You can also use my AUR package https://aur.archlinux.org/packages/fastflowlm-git~~

You may also need to edit /etc/systemd/system.conf and change #DefaultLimitMEMLOCK to DefaultLimitMEMLOCK=infinity
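
If you prefer not to edit the file by hand, the MEMLOCK change could look something like this. It is a minimal sketch: `set_memlock_infinity` is a hypothetical helper name, and it assumes the file already contains a (possibly commented-out) `DefaultLimitMEMLOCK=` line and that GNU sed is available.

```shell
#!/bin/sh
# Hypothetical helper: uncomment DefaultLimitMEMLOCK and set it to infinity
# in the given systemd configuration file. Other lines are left untouched.
set_memlock_infinity() {
    conf="$1"
    # Matches both "#DefaultLimitMEMLOCK=..." and "DefaultLimitMEMLOCK=...".
    sed -i 's|^#\{0,1\}DefaultLimitMEMLOCK=.*|DefaultLimitMEMLOCK=infinity|' "$conf"
}

# Possible usage: work on a copy, review it, then install it as root.
#   cp /etc/systemd/system.conf /tmp/system.conf
#   set_memlock_infinity /tmp/system.conf
#   sudo install -m 644 /tmp/system.conf /etc/systemd/system.conf
```

The new limit only applies after a reboot. `ulimit -l` shows the limit of the current shell.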

Happy inference!

Special thanks to @Mario_Limonciello

SneakyPM · February 25, 2026, 1:06pm

Very interesting. Can any GGUF model from HuggingFace be used?

How does it compare to the iGPU performance-wise, for tg and pp?

Can we run iGPU+NPU, with the NPU for prompt processing and the iGPU for text generation?

GreyXor · February 25, 2026, 1:12pm

1. No, models need to be adapted to work with the NPU. The list of available models is here: Models · FastFlowLM. They are also making tools to automate the conversion of GGUF to an NPU-compatible format.
2. There are some benchmarks here: Benchmarks · FastFlowLM
3. Not with FastFlowLM, which is NPU-only, at least for now.

Shiroudan · February 28, 2026, 6:17pm

I suppose this will not work on the AMD 7040 mainboards considering the repo seems to require XDNA2?

Thank you for the detailed guide however!

George_Sofianos · February 28, 2026, 6:50pm

For people that don’t want to build the Linux kernel, I’ve pushed a dkms package that should work with kernel 6.18+. Here is me testing the NPU in Lemonade
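
The 6.18+ requirement mentioned above can be checked before installing the dkms package. This is a rough sketch: `kernel_at_least` is a hypothetical helper name, and it relies on `sort -V` from GNU coreutils for the version comparison.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the kernel version is at least the
# required one. The second argument defaults to the running kernel.
kernel_at_least() {
    required="$1"
    current="${2:-$(uname -r)}"
    # sort -V orders version strings; if the required version sorts first
    # (or is equal), the current kernel is new enough.
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Possible usage:
#   if kernel_at_least 6.18; then echo "kernel is new enough"; fi
```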

John_Scott1 · March 1, 2026, 7:08am

This is awesome stuff guys, thanks for sharing. For Ubuntu, AMD now have a PPA for the xdna2 DKMS package. There is some more info at FastFlowLM/docs/linux-getting-started.md at main · FastFlowLM/FastFlowLM · GitHub - that doc refers to a .deb package for fastflowlm which doesn’t exist yet but I’m sure it’s only a matter of time.

I was able to build FastFlowLM from source and after a small amount of fiddling around (had to make a symlink from my build directory to src/xclbins) now have gpt-oss-20b running at 18 tok/s via the NPU, even with my CPU in power-saving mode. Very cool.

Guest383 · March 10, 2026, 4:08am

Thank you. I had to install your amdxdna-dkms package to get it to work on CachyOS because Cachy’s firmware was newer than the driver’s version or something. PSA: Do not set amd_iommu=off as a kernel boot parameter if you want the NPU to work.
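
A quick way to check for the boot parameter mentioned above might look like this. It is a small sketch, and `cmdline_disables_amd_iommu` is a hypothetical helper name. It only looks for the exact `amd_iommu=off` token.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given kernel command line
# contains the exact token amd_iommu=off.
cmdline_disables_amd_iommu() {
    # Padding with spaces lets the pattern match whole tokens only,
    # including at the start and end of the string.
    case " $1 " in
        *" amd_iommu=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage on a running system:
#   if cmdline_disables_amd_iommu "$(cat /proc/cmdline)"; then
#       echo "amd_iommu=off is set; remove it if you want to use the NPU"
#   fi
```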

Caleb_Maclennan · March 22, 2026, 7:28am

I think you can scratch off step 3 from the list too. This seems to work out of the box with the fastflowlm version now packaged in the stable Arch repos: 0.9.35. I got it to work without the -git package.

For reference I also didn’t use the linux-git kernel, I use the linux-mainline AUR package that is 7.0.0-rc4.

GreyXor · March 22, 2026, 10:02am

Sure thing, thanks!
I will also update it completely when Linux 7.0 is released (~2026-04-19).
