Qwen Image/Edit and WAN 2.2 for Image/Video Generation

For people looking to run Image and Video models on Strix Halo, I’ve been working on some containers/toolboxes for Qwen Image and WAN 2.2:

I have a video that documents all of this:

Let me know how you get along!

For people using this, get the latest version. It’s much faster than what was shown in the videos and way more stable.

What makes the new version faster? And how much faster are we talking about?

If you could share how to set this up without the toolbox, that would be great! Specifically ROCm and/or any other installs that aren’t straightforward.

It’s quite a bit faster, by at least a third. Basically, I enabled tiling for the variational autoencoder (VAE), which cuts that step from minutes to seconds.
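A rough sketch of why tiling helps, not the toolbox’s actual code: the decoder only ever sees one small patch at a time, so its working set is bounded by the tile size rather than the full image. Here `decode_patch` is a hypothetical stand-in for the VAE decoder. It is pointwise, so the tiled result matches the full one exactly; a real decoder also needs overlapping, blended tiles to hide seams, which this skips.

```python
def decode_patch(patch):
    """Hypothetical stand-in for a VAE decoder: doubles every value."""
    return [[2 * v for v in row] for row in patch]


def tiled_decode(latent, tile, decode=decode_patch):
    """Decode a 2D grid tile by tile and stitch the results together.

    Returns (output, largest_tile_area) so the bounded working set is visible.
    """
    height, width = len(latent), len(latent[0])
    out = [[None] * width for _ in range(height)]
    largest = 0
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Cut out one tile; edge tiles may be smaller than tile x tile.
            patch = [row[left:left + tile] for row in latent[top:top + tile]]
            largest = max(largest, len(patch) * len(patch[0]))
            decoded = decode(patch)
            # Write the decoded tile back into its place in the output.
            for dy, row in enumerate(decoded):
                out[top + dy][left:left + len(row)] = row
    return out, largest


latent = [[x + 10 * y for x in range(6)] for y in range(5)]
tiled, largest = tiled_decode(latent, tile=4)
full = decode_patch(latent)
```

With a 5x6 input and `tile=4`, no single decode call touches more than 16 values, while the stitched output equals the one-shot result.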

Toolbox is just Docker containers, but if you want to do it without containers, just check the Dockerfiles in the repo: they are basically step-by-step instructions on how to set up ROCm/PyTorch.

You can expect further performance improvements once the ROCm PyTorch wheels are fixed to include AOTriton. I tested manually, and on Qwen Image they are almost 3x faster.

Right now in my toolbox you can enable Triton by setting a variable (I monkey-patched it), but that will make things unstable. When it doesn’t crash, though… it’s beautifully fast!
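For readers unfamiliar with the pattern, here is a minimal sketch of an environment-variable-gated monkey patch. Everything in it is hypothetical: the flag name `USE_FAST_ATTENTION`, the `attention_lib` stand-in module, and both functions are made up for illustration and are not the toolbox’s real variable or code.

```python
import os
import types

# Hypothetical stand-in for a library module exposing an attention function.
attention_lib = types.SimpleNamespace(
    attention=lambda q: f"default({q})"
)


def fast_attention(q):
    """Hypothetical faster but less stable replacement."""
    return f"fast({q})"


def maybe_patch(env=None, flag="USE_FAST_ATTENTION"):
    """Swap in the fast path only when the (hypothetical) flag is set to '1'."""
    env = os.environ if env is None else env
    if env.get(flag, "0") == "1":
        # The monkey patch: rebind the library attribute at runtime.
        attention_lib.attention = fast_attention
        return True
    return False
```

Leaving the flag unset keeps the default path, which is why an opt-in variable is a reasonable way to ship something that is fast but still unstable.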

Great, I’ll have a look at the Dockerfiles and use them to set up my system. Thanks for doing all the groundwork :slight_smile:

Works perfectly, thank you for your work! =)

Also want to leave a quick thanks here. Your instructions are clear and very helpful! Can’t believe how well image generation works on this little framework desktop machine.

Thanks for dropping a comment :folded_hands:

Hey, in the last few days qwen studio stopped working for me and I can’t figure out why. When I send a prompt through the web UI, the process gets stuck here:

Loading Qwen-Image-Edit model for image editing…
Loading pipeline components…: 0%| | 0/6 [00:00<?, ?it/s]torch_dtype is deprecated! Use dtype instead!

Because I thought I might have messed something up while developing and fiddling with my system, I completely reinstalled Fedora and the toolbox from scratch, only to get the same result. I’m using Fedora 43 with the latest updates. I recently downgraded the kernel from 0.0.3.4 to 0.0.3.3 because of the slow boot bug, but I got the issue with both kernel versions.

I have seen that some instructions concerning the kernel and other topics were posted in late November on the GitHub repository of this toolbox, but I couldn’t make sense of them; they are beyond my understanding.

Is anybody else having problems with the toolbox lately?

Interesting that you can’t get the image generation to work. I’ve found the vibe voice toolbox getting stuck at the same point, while loading the model.

Maybe it’s something wider than just the image generation toolbox.

In case it is of interest to anyone (for regular Linux tinkerers this is common knowledge):

After some further reading I realised that, first of all, I had mixed up BIOS and kernel updates. Just as a side note.

Furthermore, from what I understand, the pipeline problem comes from changes introduced by AMD and ROCm patches. It has already been fixed in Linux kernel versions that are available but not yet part of a stable Fedora build. After reading for a while about how to manually change my Linux kernel, I decided to just wait. At least I did not brick anything, and I can be optimistic that it will work again in the future if I keep everything up to date. It’s something!
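If you would rather check than guess whether your running kernel is new enough, a small sketch like the one below can compare it against a minimum version. The thread does not name the kernel version that contains the fix, so `minimum` is a placeholder you have to supply yourself, and the release string in the docstring only illustrates the format.

```python
import platform
import re


def kernel_tuple(release):
    """Extract (major, minor, patch) from a release string.

    Example format (illustrative only): '6.17.9-300.fc43.x86_64'.
    A missing patch number is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def has_fix(minimum, release=None):
    """Return True if the kernel release is at least `minimum`.

    `minimum` is a (major, minor, patch) tuple chosen by the caller;
    by default the running kernel's release string is used.
    """
    release = platform.release() if release is None else release
    return kernel_tuple(release) >= minimum
```

Calling `has_fix((major, minor, patch))` with no `release` argument checks the kernel you are currently booted into.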
