Qwen Image/Edit and WAN 2.2 for Image/Video Generation

For people looking to run Image and Video models on Strix Halo, I’ve been working on some containers/toolboxes for Qwen Image and WAN 2.2:

I have a video that documents all of this:

Let me know how you get along!


If you're using this, get the latest version. It's much faster than what was shown in the videos and far more stable.

What makes the new version faster? And how much faster are we talking about?

If you could share how to set this up without the toolbox, that would be great! Specifically ROCm and any other installs that aren't straightforward.


It’s quite a bit faster, by at least a third. Basically, I enabled tiling for the variational autoencoder (VAE), which cuts that step from minutes to seconds.
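
To give a rough idea of why tiling helps: instead of pushing the whole latent through the VAE in one go, you split it into overlapping tiles and process them one at a time, so the peak working set is bounded by the tile size rather than the full image. The snippet below is only a sketch of that general idea in plain Python, not the code from the toolbox; `tile_boxes`, the tile size and the overlap are made-up names and values for illustration.

```python
def tile_boxes(height, width, tile=64, overlap=8):
    """Return (top, left, bottom, right) boxes that together cover a
    height x width grid. Hypothetical helper, for illustration only."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    stride = tile - overlap  # neighbouring tiles share `overlap` cells

    def starts(size):
        # A single tile is enough if the axis fits inside one tile.
        if size <= tile:
            return [0]
        pos = list(range(0, size - tile, stride))
        pos.append(size - tile)  # last tile sits flush with the far edge
        return pos

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


# Example: a 160x160 grid is covered by 3x3 tiles of 64x64 each.
boxes = tile_boxes(160, 160, tile=64, overlap=8)
full_cells = 160 * 160
peak_cells = max((b - t) * (r - l) for t, l, b, r in boxes)
print(len(boxes), full_cells, peak_cells)
```

A real tiled decode would also blend the overlapping borders to hide seams; that part is left out here.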

Toolbox is just Docker containers, but if you want to do it without containers, check the Dockerfiles in the repo: they are basically step-by-step instructions for setting up ROCm and PyTorch.

You can expect further performance improvements once the ROCm PyTorch wheels are fixed to include AOTriton. I tested this manually, and on Qwen Image they are almost 3x faster.

Right now in my toolbox you can enable Triton by setting a variable (I monkey-patched it), but that makes things unstable. When it doesn’t crash, though… it’s beautifully fast!

Great, I’ll have a look at the Dockerfiles and use them to set up my system. Thanks for doing all the prep work :slight_smile: