I was wondering, are there any applications available that make use of the NPU on Linux?
And not the kind aimed at developers/tinkerers, like running LLMs on it, but something more focused on general users, e.g. OCR, noise reduction, camera processing, presence detection: something that is useful in an everyday workflow and would benefit from running on lower-power dedicated hardware.
From what I could find, there are only people trying to run LLMs on it for some reason, and one project using it for TTS (which is what I am asking for).
P.S.: we really need a “linux” tag to be available for general discussions
I know that lemonade is able to run AI models locally on the NPU. Here is a Phoronix article on it: https://www.phoronix.com/news/AMD-Ryzen-AI-NPUs-Linux-LLMs
Seems like the one really useful feature, STT, is not yet implemented on the NPU for Linux. But thank you for sharing.
P.S.: why is everyone trying to cram an LLM into an NPU? Isn’t it massively underpowered for anything but a novelty?
I actually tried it out. It was pretty decent for a local AI (I have zero frame of reference lol). I had to install lemonade-server on Arch, then follow its instructions to set up fastflowlm and xrt. Then, since Linux kernel 7 isn’t out, I had to get a newer amdxdna kernel driver from the amdxrna-dkms AUR package. After that I could use the built-in NPU.
To be honest, I think NPUs are kinda stupid. But if you’re gonna use local AI, it’s pretty goated. Of course, you can always just use ollama-vulkan.
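For anyone following the same steps, a quick way to check that the driver part worked is to look for the loaded module and the accel device node. This is only a minimal Python sketch, assuming the driver shows up as /sys/module/amdxdna and the NPU as a /dev/accel/accel* node. The function name npu_driver_status is made up for illustration, and an accel node could also belong to some other accelerator driver.

```python
from pathlib import Path


def npu_driver_status(sys_root="/sys", dev_root="/dev"):
    """Report whether the amdxdna module is loaded and which accel nodes exist.

    Hypothetical helper: the two root paths are parameters only so the
    check can be tried against a fake directory tree.
    """
    # A loaded kernel module normally appears as a directory under /sys/module.
    module_loaded = (Path(sys_root) / "module" / "amdxdna").is_dir()

    # Assumption: the NPU is exposed as /dev/accel/accelN.
    accel_dir = Path(dev_root) / "accel"
    if accel_dir.is_dir():
        accel_nodes = sorted(p.name for p in accel_dir.glob("accel*"))
    else:
        accel_nodes = []

    return {"module_loaded": module_loaded, "accel_nodes": accel_nodes}


if __name__ == "__main__":
    status = npu_driver_status()
    print("amdxdna module loaded:", status["module_loaded"])
    print("accel device nodes:", status["accel_nodes"] or "none found")
```

If the module is loaded but no accel node shows up (or the other way round), that at least narrows down whether the problem is the driver or something further up the stack.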
Also noticed that lemonade only supports XDNA2 for now, so tough luck for my gen 1 no matter the OS.
Yeah, I understand that NPUs are very underpowered, but I was thinking of things like an RNNoise analogue that runs on the NPU, or a model that uses the webcam to mimic the IR presence sensors in the new OLED monitors and turn the screen off or lock it, or macOS-style built-in webcam background removal, etc.
None of these applications require high performance, but they benefit from being always on, and on a laptop that means being very power efficient. That’s why I don’t really see the CPU/GPU as a good fit here: the power overhead is too high.
But maybe I am missing a point somewhere.