A Home Assistant Whisper / Piper docker-compose designed for use with Strix Halo

I had trouble finding ROCm setups (especially for recent ROCm, i.e. 7+) for Whisper and/or Piper. So I put together a little setup in this repository. Hope it helps others looking to run the HA voice stack locally on their new Framework Desktop (Mainboards)!
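For context, the part that usually trips people up is the GPU passthrough: ROCm containers need the kernel compute and render devices mapped in. A minimal sketch of what such a service entry can look like (the image name, port, and `HSA_OVERRIDE_GFX_VERSION` value are illustrative assumptions, not taken from the repository):

```yaml
# Hypothetical sketch of a Wyoming Whisper service with ROCm GPU access.
services:
  whisper:
    image: my-wyoming-whisper-rocm:latest   # placeholder image name
    ports:
      - "10300:10300"                       # default Wyoming Whisper port
    devices:
      - /dev/kfd                            # ROCm compute interface
      - /dev/dri                            # GPU render nodes
    group_add:
      - video
      - render
    environment:
      # Strix Halo is gfx1151; some ROCm builds need an override to match
      # a supported target. This value is an assumption - check your build.
      - HSA_OVERRIDE_GFX_VERSION=11.5.1
```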


Just added support for both Qwen3-TTS and Pocket TTS, each wrapped in Wyoming!

Unfortunately, Qwen3-TTS does not generate fast enough to be practical for voice assistant use. It does work, though, specifically with the Qwen3-TTS-12Hz-1.7B-VoiceDesign model, which allows designing the voice via a prompt at startup. An audio snippet for a ~10-word sentence took anywhere from 30 s to 1 min to generate.

I think this is mostly down to missing flash-attn in the container. I wasn’t able to get that to build with either ROCm 7.1.1 or 7.2.

Pocket TTS, on the other hand, works a treat! It is a good bit better than Piper at generating natural-sounding speech. You can play around with the available voices on their demo page here.

I made Kokoro-FastAPI work on Strix Halo, if you’re interested in TTS: Add permutation that support running on ROCm gfx1151 devices such as Strix Halo by projects-land · Pull Request #431 · remsky/Kokoro-FastAPI · GitHub

(At the time I got it working, ROCm 7.2 hadn’t been released yet, so it uses the nightly build. It should now work with stable ROCm 7.2 as well.)

Nice one! Any idea if these Strix Halo boxes can generate audio at ~real-time rates via Kokoro?

It is fast even on CPU. On GPU it is very fast. Give it a go.

I actually had a Kokoro-FastAPI ROCm fork running already, but it turns out it was using the CPU still :joy:.

I got your patch running with some minor tweaks (commented in the PR) and wow - it’s definitely usable for real-time voice assistant use-cases now! :heart:
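For anyone wanting to sanity-check real-time performance on their own box: Kokoro-FastAPI exposes an OpenAI-compatible speech endpoint, so a rough timing test can be done with curl (the port and voice name below are the project’s defaults at time of writing; adjust to your setup):

```shell
# Rough latency test against a locally running Kokoro-FastAPI instance.
time curl -s http://localhost:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
        "model": "kokoro",
        "input": "Testing real-time speech generation on Strix Halo.",
        "voice": "af_heart"
      }' \
  -o out.mp3
# If the wall-clock time is well under the clip's duration, you're real-time.
```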

What would you recommend (in terms of where to start the learning process) to be able to implement this? I am a quick learner but relatively new to running the HA voice stack locally…

thanks!

Hmm, best to learn the basics of Docker and Linux sysadmin stuff first. But if you’re reasonably well versed in those - just dive in! Try to run some models; Ollama and llama.cpp are good starting points.
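To make “just dive in” a bit more concrete, a first session might look like this (the install script and build commands are the upstream defaults; the model tag and GGUF path are just examples):

```shell
# Install Ollama via the official install script, then run a first model.
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.2   # pulls the model on first run, then drops into a chat

# Or build llama.cpp from source and chat with a local GGUF model:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build -j
./build/bin/llama-cli -m /path/to/model.gguf -p "Hello"
```

Once you’re comfortable running models this way, the Wyoming-wrapped services above are just more containers to compose.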