Qwen Image/Edit and WAN 2.2 for Image/Video Generation

kyuz0 · September 4, 2025, 2:23pm

For people looking to run Image and Video models on Strix Halo, I’ve been working on some containers/toolboxes for Qwen Image and WAN 2.2:

I have a video that documents all of this:

Let me know how you get along!

kyuz0 · September 10, 2025, 4:53pm

For people using this, get the latest version. It’s much faster than what was shown in the videos and way more stable.

Nieles · September 10, 2025, 5:34pm

What makes the new version faster? And how much faster are we talking about?

If you could share how to set this up without toolbox, that would be great! Specifically ROCm and/or other installs that are not straightforward.

kyuz0 · September 10, 2025, 9:37pm

It’s quite a bit faster, by 1/3 at least. Basically, I enabled tiling for the variational autoencoder (VAE), and that reduces that step from minutes to seconds.
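
For anyone unfamiliar with tiling, here is a minimal sketch of the general idea, not the toolbox's actual code: the image is split into fixed-size, overlapping tiles that are processed one at a time, so the largest chunk handled at once is a single tile rather than the full frame. The function names, tile size and overlap below are assumptions chosen for illustration; a real VAE implementation also blends the overlapping borders, which is omitted here.

```python
def tile_boxes(height, width, tile, overlap):
    """Return (top, left, bottom, right) boxes that cover a height x width grid."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    stride = tile - overlap

    def starts(size):
        # Step by `stride`; the last tile is shifted back so it ends at the edge.
        if size <= tile:
            return [0]
        out = list(range(0, size - tile, stride))
        out.append(size - tile)
        return out

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


def peak_tile_area(boxes):
    """Largest number of pixels handled at once when tiles are processed one by one."""
    return max((b - t) * (r - l) for t, l, b, r in boxes)


if __name__ == "__main__":
    boxes = tile_boxes(1024, 1024, tile=256, overlap=32)
    print(len(boxes), "tiles, peak area", peak_tile_area(boxes), "of", 1024 * 1024)
```

The trade-off is more, smaller passes in exchange for a much smaller working set per pass.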

Toolbox is just Docker containers, but if you want to do it without containers, just check the Dockerfiles in the repo: they are basically step-by-step instructions for setting up ROCm/PyTorch.
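
If it helps to read a Dockerfile as a checklist, a small sketch like the following pulls out the shell-form `RUN` commands in order. It is only an illustration: the sample text is a made-up placeholder, not the contents of the repo's Dockerfiles, and the parser deliberately ignores JSON-form `RUN [...]` and heredocs.

```python
def run_steps(dockerfile_text):
    """Return the shell command of each shell-form RUN, joining continued lines."""
    steps, current = [], None
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if current is None:
            if not line.upper().startswith("RUN "):
                continue  # not a RUN instruction
            current = line[4:].strip()
        else:
            if not line or line.startswith("#"):
                continue  # blank lines and comments inside a continued RUN
            current += " " + line
        if current.endswith("\\"):
            # Trailing backslash: the command continues on the next line.
            current = current[:-1].rstrip()
        else:
            steps.append(current)
            current = None
    if current is not None:
        steps.append(current)  # file ended in the middle of a continuation
    return steps


# Placeholder content for illustration only.
SAMPLE = """\
FROM example-base:latest
# install build tools
RUN dnf install -y python3 \\
    git
ENV FOO=bar
RUN python3 -m venv /opt/venv
"""

if __name__ == "__main__":
    for number, step in enumerate(run_steps(SAMPLE), start=1):
        print(f"{number}. {step}")
```

You still need to carry over the `ENV` and `COPY` lines by hand, since those set up state that later `RUN` steps may rely on.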

You can expect further performance improvements once the ROCm PyTorch wheels are fixed to include AOTriton. I tested this manually, and on Qwen Image they are almost 3x faster.

Right now in my toolbox you can enable Triton by setting a variable. I monkey-patched it, but that will make stuff unstable. But when it doesn’t crash… it’s beautifully fast!
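
For readers who have not seen the pattern, this is roughly what an opt-in, environment-gated monkey patch looks like in Python. Everything here is hypothetical: the variable name `EXAMPLE_ENABLE_FAST_PATH`, the `kernels` namespace and both functions are placeholders, not the toolbox's real variable or the code it patches.

```python
import os
import types

# Stand-in for a library module; in a real setup this would be an imported package.
kernels = types.SimpleNamespace()


def stable_compute(values):
    """Default path (placeholder logic: normalise a list of numbers)."""
    total = sum(values)
    return [v / total for v in values]


def experimental_compute(values):
    """Placeholder for a faster but less stable implementation."""
    inverse = 1.0 / sum(values)
    return [v * inverse for v in values]


kernels.compute = stable_compute


def apply_patch(env=os.environ):
    """Swap in the experimental path only when the opt-in variable is set to "1"."""
    if env.get("EXAMPLE_ENABLE_FAST_PATH") == "1":  # hypothetical variable name
        kernels.compute = experimental_compute
        return True
    return False


if __name__ == "__main__":
    patched = apply_patch()
    print("experimental path enabled:", patched)
    print(kernels.compute([1.0, 3.0]))
```

Keeping the stable path as the default means a crash in the experimental one can be ruled out by unsetting the variable and rerunning.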

Nieles · September 11, 2025, 4:31am

Great, I’ll have a look at the Dockerfiles and use them to set up my system. Thanks for doing all the prep work.
