Stable Diffusion running on the Framework Laptop using OpenCL

Kieran_Levin · November 4, 2022, 8:03pm

Something fun: Stable Diffusion runs on the Framework Laptop. It can generate an image in about 90 seconds. Since Xe graphics shares system memory, the model loads without any issues.

Quick guide using Fedora.
Install intel-opencl
sudo dnf install intel-opencl
Download and set up an environment for tinygrad
git clone https://github.com/geohot/tinygrad.git
cd tinygrad
python3 -m venv venv-tinygrad
source venv-tinygrad/bin/activate
python3 setup.py develop
pip install tqdm pyopencl
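
Before grabbing the model, it may be worth checking that pyopencl can actually see the Intel GPU from inside the virtual environment. The snippet below is a minimal sketch and not part of the original guide: the function name list_opencl_devices is made up for illustration, and it only relies on pyopencl's platform and device queries. It returns an empty list when pyopencl is missing or no OpenCL platform is registered.

```python
def list_opencl_devices():
    """Return (platform name, device name, device type) for each OpenCL device."""
    try:
        import pyopencl as cl
    except ImportError:
        # pyopencl is not installed in this environment
        return []
    try:
        platforms = cl.get_platforms()
    except cl.Error:
        # Typically means no OpenCL runtime (ICD) is registered
        return []
    found = []
    for platform in platforms:
        try:
            devices = platform.get_devices()
        except cl.Error:
            # A platform with no usable devices; skip it
            continue
        for device in devices:
            kind = cl.device_type.to_string(device.type)
            found.append((platform.name, device.name, kind))
    return found


if __name__ == "__main__":
    devices = list_opencl_devices()
    if not devices:
        print("No OpenCL devices found; check that the OpenCL runtime is installed.")
    for platform_name, device_name, kind in devices:
        print(f"{platform_name}: {device_name} ({kind})")
```

If the Intel GPU does not show up in the output, the OpenCL runtime package is the first thing to look at.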

Grab the model for stable-diffusion:
Go to the CompVis/stable-diffusion-v-1-4-original page on Hugging Face, make an account, and agree to the terms so you can download the model. Then download it from https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/blob/main/sd-v1-4.ckpt and copy it to the weights/ directory.

Generate something interesting
OPT=2 OPENCL=1 time python3 examples/stable_diffusion.py --phrase "trees in the sunset with mountains and elephants" --out art.png

If you install intel_gpu_top, you can see that the processing is done on the GPU! It is also much faster than the CPU.

agizmo · November 6, 2022, 11:27pm

That’s cool to see. I need to give it a shot.

I’ve been playing with SD, but running it on the CPU. My i7-1260P averages 6.70 s/it, or 5.5ish minutes for a 50-iteration image. Using an ONNX version of SD through Diffusers is a little faster at 6.00 s/it, or 5 minutes for 50 iterations.
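
To put those numbers next to the roughly 90 seconds reported for the GPU in the first post, here is the arithmetic as a small sketch. The function name is only illustrative. The per-iteration timings and the 50-iteration count come from this post; the 90-second figure comes from the first post, which does not state its iteration count or CPU model, so the ratios are a rough comparison at best.

```python
def total_seconds(seconds_per_iteration, iterations=50):
    """Total wall time for one image at a fixed per-iteration cost."""
    return seconds_per_iteration * iterations


# Timings quoted in this thread
cpu_default = total_seconds(6.70)  # CPU run
cpu_onnx = total_seconds(6.00)     # ONNX through Diffusers, still on the CPU
gpu_opencl = 90.0                  # "about 90 seconds" from the first post

for label, seconds in [("CPU", cpu_default), ("CPU + ONNX", cpu_onnx), ("GPU (OpenCL)", gpu_opencl)]:
    ratio = seconds / gpu_opencl
    print(f"{label}: {seconds:.0f} s ({seconds / 60:.1f} min), {ratio:.1f}x the GPU time")
```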

ConfuSomu · November 7, 2022, 2:56am

That’s way better than running Stable Diffusion on the CPU. Thanks for sharing!

Michael_Wu · February 4, 2023, 4:45pm

Thanks for the writeup!

I was trying to get Elixir’s Bumblebee/Nx libraries working on my i7-1165G7. CPU works, but I haven’t gotten Intel Xe GPU acceleration working yet. I think it’s possible with the Torchx backend, but I gave up for now due to PyTorch compilation issues for OpenCL or Vulkan on Fedora. If I do get it working later, I’ll try to update this post.

I tried out Stable Diffusion through tinygrad yesterday, and intel_gpu_top showed that the GPU wasn’t being used with the original command. The flag is now simply GPU=1, which I can confirm works currently.
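
For reference, the command from the first post with the renamed flag could be wrapped roughly like this. It is a sketch under assumptions: the helper names are invented, it keeps OPT=2 from the original command (whether that variable is still needed is not confirmed in this thread), and it only launches anything when run from the root of the tinygrad checkout.

```python
import os
import subprocess
import sys


def build_env(backend_flag="GPU", base_env=None):
    """Copy the environment and switch on the chosen tinygrad backend flag."""
    env = dict(os.environ if base_env is None else base_env)
    env[backend_flag] = "1"      # GPU=1 now; OPENCL=1 in the original guide
    env.setdefault("OPT", "2")   # carried over from the original command (assumption)
    return env


def build_command(phrase, out="art.png"):
    """Argument list for the example script, using the flags from the first post."""
    return [sys.executable, "examples/stable_diffusion.py", "--phrase", phrase, "--out", out]


if __name__ == "__main__" and os.path.exists("examples/stable_diffusion.py"):
    # Only runs when started from the tinygrad checkout
    subprocess.run(
        build_command("trees in the sunset with mountains and elephants"),
        env=build_env(),
        check=True,
    )
```

Passing "OPENCL" as backend_flag reproduces the older invocation, which makes it easy to compare the two while watching intel_gpu_top.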

Mapleleaf · February 4, 2023, 5:45pm

What would be the best package to use for OpenCL on Arch?
If I look at the GPGPU Arch Wiki page, there seem to be two competing possibilities.

Which one should I choose for the Framework Laptop?

Michael_Wu · February 4, 2023, 7:16pm

@Mapleleaf taking a quick look at that Arch Wiki page, it seems like intel-opencl is “deprecated by Intel in favour of NEO OpenCL driver”.

So in theory, intel-compute-runtime (“a.k.a. the Neo OpenCL runtime”) should be the one to use. The Framework currently has 11th and 12th gen Intel.

However, the only way to know for sure with regard to performance is to test both of them, e.g. compare the average processing time with one vs. the other.
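
One way to run that comparison is to time a few repeated runs under each runtime and compare the averages. The sketch below is only a generic timing helper with invented names. The workload passed in would be whatever launches the image generation (for example a subprocess call), and switching between the OpenCL runtimes has to happen outside the script.

```python
import statistics
import time


def time_runs(workload, repeats=3):
    """Call workload() several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Mean, fastest, and slowest run, so outliers are visible next to the average."""
    return {
        "mean": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder workload; replace with the real generation call
    print(summarize(time_runs(lambda: sum(range(100_000)))))
```

The first run after switching runtimes may include one-time startup cost, so it can be worth looking at the min and max as well as the mean.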

Edit: confirming that intel-compute-runtime is working on Fedora. Thanks for the catch! If I benchmark both of them, I’ll add to this post. If anyone’s interested beforehand, ping me and I’ll run through it.

Mapleleaf · February 4, 2023, 7:45pm

Thanks a lot!

But it also says that intel-opencl-runtime is “the implementation for Intel Core and Xeon processors”.

So, as i7 Intel CPUs are Intel Core, I thought it might apply!

Michael_Wu · February 4, 2023, 8:19pm

Ah I see, actually I mistakenly just read the section for intel-opencl and not the section for intel-opencl-runtime.

Though I just double checked and am still certain that intel-compute-runtime is the latest since:

First, it’s what’s on Intel’s GitHub (intel/compute-runtime: Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver) and is being actively maintained (with a commit 11 hours ago). The Supported Platforms section lists “Intel Core Processors with Gen12 graphics devices (formerly Tiger Lake, Rocket Lake, Alder Lake)”. (11th Gen Intel Core is Tiger Lake and 12th Gen Intel Core is Alder Lake.)

Second, the last update for intel-opencl-runtime on AUR is 2021-05-08, whereas for intel-compute-runtime it is 2023-01-18.

Happy to help!

Mapleleaf · February 4, 2023, 8:22pm

@Michael_Wu Many thanks!!
Can’t wait to try this out!

Mapleleaf · February 4, 2023, 9:01pm

Somehow it really wants me to install “pycuda”, and then when I launch stable_diffusion.py it complains:

pycuda._driver.RuntimeError: cuInit failed: no CUDA-capable device is detected

I wonder if the venv-tinygrad environment interacts badly with my previous installs of PyTorch and related packages, all optimized for CUDA…

Currently the eGPU is disconnected, so I had expected that it would only rely on the intel GPU instead.

Michael_Wu · February 5, 2023, 1:35am

@Mapleleaf hm, it seems like it’s trying to use pycuda, and perhaps also your disconnected eGPU. It shouldn’t, though, provided you’re in a clean Python virtual environment (I very well could be wrong as I’m pretty new to PyTorch/CUDA/SD/etc.).
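
A quick way to check both points (whether the interpreter is really running inside a virtual environment, and whether pycuda or torch are visible from it) is something like the sketch below. The function names are made up for illustration; it only uses the standard library.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def is_installed(module_name):
    """True when a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
    for name in ("pyopencl", "pycuda", "torch"):
        status = "installed" if is_installed(name) else "not installed"
        print(f"{name}: {status}")
```

If pycuda or torch show up as installed here, the environment is probably not as clean as expected, for example because it was created with access to the system site-packages.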

With Python 3.10, here’s what my pip list shows after creating a fresh virtual environment, sourcing it, and then running

python3 setup.py develop
pip install tqdm pyopencl

(venv-tinygrad) mwu@framedora ~/D/g/tinygrad > pip list
Package            Version   Editable project location
------------------ --------- --------------------------------------
certifi            2022.12.7
charset-normalizer 3.0.1
idna               3.4
networkx           3.0
numpy              1.24.1
Pillow             9.4.0
pip                22.3.1
platformdirs       2.6.2
pyopencl           2022.3.1
pytools            2022.1.14
requests           2.28.2
setuptools         65.5.0
tinygrad           0.4.0     /home/mwu/Documents/git-repos/tinygrad
tqdm               4.64.1
typing_extensions  4.4.0
urllib3            1.26.14

Note that I don’t have pycuda installed (and if I try to, it errors).

When I run stable_diffusion.py, these initial warnings appear:

ops_llvm not available No module named 'llvmlite'
ops_torch not available No module named 'torch'
ops_triton not available No module named 'torch'

but it still ends up working. If it seems like it’s stuck on that warning, maybe give it a minute; it seems to take a little while to start.

Sidenote:
llvmlite installs on Python 3.10, but isn’t supported on 3.11 at the moment.

If I install torch, the ops_torch and ops_triton warnings turn into:

ops_triton not available No module named 'pycuda'

And pycuda fails to install on my machine.

I don’t know and haven’t investigated the impact each has on performance, etc.

Mapleleaf · February 5, 2023, 11:13am

Nice! That was it! It works now.
I should really try to learn to say “NO” to my machine lol ^^

~~However, now another issue: the progress bar seems to be stuck at 0, the ETA is “?”, and intel_gpu_top reports only a very small activity (<1%).~~
Ah! That was where the GPU=1 flag comes in!! That solves it!

Michael_Wu · February 5, 2023, 5:15pm

Hahah learning to say no, that’s something I’m working on

Glad you got it working!