Anybody tried running image generation (e.g. Stable Diffusion XL, 3.5 or similar) on Linux?

Really just at the beginning of my journey to run AI stuff on my Desktop. I’m on Ubuntu 25.10 with amdgpu-proprietary installed.

Thanks to the great guides at Strix Halo HomeLab I was able to read about some real-world experience running LLMs.

Did try Jan, Lemonade Server, and one of the toolboxes for LLMs, and setup was very easy. Did not test many models yet, but some are quite fast. Especially compared to online services that seem to get slower during rush times (especially free tiers).

Was looking into Stable Diffusion WebUI, ComfyUI, and Fooocus - but no luck so far. Also still have MstyAI on my to-try list.

Thought I'd ask around to see if anybody can share what they are using on Linux, or maybe even some notes / ready-made packages?

If anybody else is wondering where to start: While not exactly Stable Diffusion, this thread has very useful AI links

Several links to other threads and external wikis / projects. Have not tried it yet, but this repo has a pre-built image and video generation toolbox for Strix Halo

Set it up a few days ago at the end of a longer AI research session, but no tangible results worth reporting yet. Will look into it a bit more later.

Also - ROCm seems to be catching up with Linux

https://www.amd.com/en/blogs/2025/the-road-to-rocm-on-radeon-for-windows-and-linux.html

Not sure if this relates to image generation at all, but from what I understand it can offer better performance than Vulkan for LLMs.

Special shout out to @lhl - you are a constant source of useful information in the forum. Big thank you for pointing out projects and news for noobies like myself!


I've only tried to see if things run and to observe system behavior (mainly the dreaded PSU noise) under load in this scenario, no testing on performance and optimizations. I use a pretty standard Arch setup, BTW.

I only use ComfyUI, and it ran my SDXL and Flux test workloads without crashing. I'll have to dig deeper in my spare time and run complex workloads to make use of the memory. I have little experience with the other frontends and don't think I'll be installing those. On Linux with ROCm, installation is a little more technical than installing a Windows-with-Nvidia package, but you'll end up with a neat Python virtual environment. You'll need Python, pip, and git.

Make a venv and activate it.

python -m venv .comfy-venv
source .comfy-venv/bin/activate

Install the latest ROCm and torch packages from TheRock repo releases for the gfx1151 target.

ROCm

pip install \
  --index-url https://rocm.nightlies.amd.com/v2/gfx1151/ \
  "rocm[libraries,devel]"

torch

pip install \
  --index-url https://rocm.nightlies.amd.com/v2/gfx1151/ \
  --pre torch torchaudio torchvision

Check torch installation

python -c 'import torch; print(torch.cuda.is_available())'

Should return: True
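
If you want a bit more detail than True/False, something like this should also print the torch build, the HIP version, and the device name:

python -c 'import torch; print(torch.__version__, torch.version.hip, torch.cuda.get_device_name(0))'

On a TheRock wheel the version should carry a rocm suffix and the device should be your Strix Halo GPU.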

Install ComfyUI using the manual installation method (see the ComfyUI docs).

Clone the repo

git clone https://github.com/comfyanonymous/ComfyUI.git

GPU dependencies should already be installed from TheRock repo.
Move into the ComfyUI directory and install the other dependencies

cd ComfyUI
pip install -r requirements.txt

Run ComfyUI

python main.py

The server should start now. Check the terminal output: it should report the available memory, the PyTorch version (a ROCm build from TheRock repo, for example 2.10.0a0+rocm7.10.0a20251015), the AMD arch (gfx1151), and the installed ROCm version. The device should be your Strix Halo GPU, and the WebUI should now be available.

You may want to start with an example workflow; these prompt you to download the correct model and usually include helpful notes on working with that model architecture. Run one.

You can check whether a process is using the GPU with the amd-smi command. To use it, your user has to be part of the video and render groups.
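
If you need them, something along these lines should work (exact amd-smi subcommand names can vary between ROCm versions):

sudo usermod -aG video,render $USER   # log out and back in afterwards
amd-smi process                       # lists processes currently using the GPU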

Stop the server with Ctrl+C
Exit the venv with

deactivate

If you'd like to run ComfyUI again, activate the venv and run main.py.
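
Assuming the venv lives next to the ComfyUI checkout as above, that's just:

source .comfy-venv/bin/activate
cd ComfyUI
python main.py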

Have fun!

Thank you so much for the in-depth instructions.

Successfully tried some models. SD XL Turbo is exceptionally fast. Also had a big model crash on me. Might be because I've set RAM sharing to auto. Will try again with dedicated 64GB, but need some more time for that. Internet speed is horrible for downloading larger models :slight_smile:

If anybody is interested, I've put together my findings here and will keep the page updated whenever I find time to tinker with AI projects: AI on Strix Halo & Ubuntu

Edit: setting shared RAM to 64GB made no difference. When using the “Qwen-Image Text to Image” template, it crashes my system. Not too worried - so many templates / models to test and try still.

Tested templates so far:

  • Exceptionally fast
    • SDXL Turbo
  • Running, but not very fast
    • OmniGen2 Text to Image
    • SD3.5 Simple
  • Crashed
    • Qwen-Image Text to Image

I had some success with ROCm nightly and ComfyUI in a custom podman container, but due to a kernel bug (fixed in 6.18), it can lead to random crashes under heavy GPU load. Looking at your homepage, my installation is similar, just isolated from the host system.

Containerfile

FROM python
RUN pip install --pre torch torchvision torchaudio --index-url https://rocm.nightlies.amd.com/v2/gfx1151/
RUN git clone https://github.com/comfyanonymous/ComfyUI
WORKDIR /ComfyUI
RUN pip install -r requirements.txt
EXPOSE 80
ENTRYPOINT python main.py --listen 0.0.0.0 --port 80

justfile

image := "comfyui"
port := "8188"
date := `date --iso-8601`
build:
    podman build -t {{image}}:{{date}} .
rebuild:
    podman build -t {{image}}:{{date}} --no-cache .
run:
    podman run -it --rm -p {{port}}:80 -v "/path/to/project-files:/ComfyUI/user/default" -v "/path/to/models:/ComfyUI/models:ro" -v "/path/to/output:/ComfyUI/output" --device=/dev/kfd --device=/dev/dri {{image}}

I’m using “just” here, but batch files would also do. My workflow right now is to have tagged container images that I can manually update and, in case of a problem, just re-tag and run a previous image. Next step might be to integrate ComfyUI Manager, because many workflows require custom nodes/plugins - that’s also why I prefer a container without a shared home folder.
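
For the rollback case, it's roughly this (the date tag is just a placeholder; reuse the mounts from the run recipe):

podman images comfyui
podman run -it --rm -p 8188:80 --device=/dev/kfd --device=/dev/dri comfyui:2025-10-01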

I run with a fixed BIOS setting of 512MB and these kernel settings (128GB model):

amd_iommu=off ttm.pages_limit=33554432

The pages_limit makes essentially all RAM available as VRAM on demand. So far, all ROCm and Vulkan programs (including games) work as expected.
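
For reference, 33554432 pages × 4 KiB per page works out to 128 GiB, i.e. essentially the full RAM of the 128GB model. One way to set the parameters, assuming a GRUB-based install (adapt to your bootloader):

# in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="... amd_iommu=off ttm.pages_limit=33554432"

# regenerate the config and reboot
sudo grub-mkconfig -o /boot/grub/grub.cfg   # or: sudo update-grub on Debian/Ubuntu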


I had no problems getting Jan and LM Studio up and running on the desktop, as well as in setting up an rocm-torch environment for ComfyUI. All attempts to generate images with Comfy caused “HIP error: Invalid device function”.

My notes indicate that I fixed this on the 16" laptop by setting HSA_OVERRIDE_GFX_VERSION to "10.0.3", so I plan to give that a try when I can next spend time with the desktop. These AMD ROCM releases don't seem to work well with the Framework hardware out of the box; there's always some tweaking to do. Regarding the above kernel flags, for the gfx1101 in the laptop I had to use "rtc_cmos.use_acpi_alarm=1 iommu=pt amdgpu.dc=1 amd_pstate=passive amdgpu.gpu_recovery=1", though a lot of those are to address the amdgpu kernel module crashing in Linux.


It is actually
HSA_OVERRIDE_GFX_VERSION="11.0.0"

on the ComfyUI command line (or in the ENV). Added that and the HIP errors went away, without any kernel flags. I read somewhere, perhaps in a ROCM issue, that the gfx1151 in the Framework Desktop isn't supposed to require this override, but eh, it does. ROCM really feels like a giant hack at times.
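
For anyone else trying it, it's the usual environment variable dance, e.g.:

HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py
# or export it for the whole session / in your shell profile
export HSA_OVERRIDE_GFX_VERSION=11.0.0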

A quick note or addendum to the above. I upgraded to Fedora 43 and then Rawhide, in order to take advantage of the "system packages" of amdgpu and rocm as advised by AMD. It was crash city, and no combination of ROCM and Torch wheels and amdgpu-install scripts could get me out of it. I reverted to Fedora 42 (dnf distro-sync --releasever=42 --refresh --disablerepo rawhide --enablerepo fedora --allowerasing --no-best --skip-unavailable) and got everything working again, with Fedora's ROCM 6.3 and amdgpu, AMD's Torch, and HSA_OVERRIDE_GFX_VERSION=11.0.0.

So, time for some experiments, running on ‘42 via Python virtual environments, and using the Fedora 42 amdgpu module and ROCM (6.3.1) packages unless otherwise specified.

  • Torch from ROCM 6.4 dir: HIP Error (invalid device function)

  • Torch from ROCM 6.4 dir, HSA_OVERRIDE_GFX_VERSION=11.0.0 : SUCCESS

  • Torch from ROCM 7.0 dir : SUCCESS

  • Torch from ROCM 7.0 dir, HSA_OVERRIDE_GFX_VERSION=11.0.0 : SEGV

  • Torch from TheRock V2 (ROCM 7.1) : SUCCESS

  • Torch from TheRock V2 (ROCM 7.1), HSA_OVERRIDE_GFX_VERSION=11.0.0 : SEGV

  • Torch and ROCM from TheRock V2 : SUCCESS

  • Torch and ROCM from TheRock V2, HSA_OVERRIDE_GFX_VERSION=11.0.0 : SEGV

  • Torch from TheRock V2-staging (ROCM 7.1) : SUCCESS

  • Torch from TheRock V2-staging (ROCM 7.1), HSA_OVERRIDE_GFX_VERSION=11.0.0 : SEGV

  • Torch and ROCM from TheRock V2-staging (7.1) : SUCCESS

  • Torch and ROCM from TheRock V2-staging (7.1), HSA_OVERRIDE_GFX_VERSION=11.0.0 : SEGV

For future reference, when I say “Torch from ROCM 6.4 dir”, I mean this:

https://download.pytorch.org/whl/nightly/rocm6.4

and when I say “Torch from TheRock V2-staging”, I mean this:

https://rocm.nightlies.amd.com/v2-staging/gfx1151/

…and you can fill in the rest yourselves. These are passed to pip with the --index-url option.

The main point in all of the above is that in the 7.x series, not only is the HSA_OVERRIDE_GFX_VERSION directive not required, it causes segfaults. Since I had this set in an environment variable, my initial forays into using ROCM 7.x led to a long stay in segtown.
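
If you have it exported somewhere (shell profile, venv activate script, etc.), it's worth checking and clearing it before moving to the 7.x wheels:

env | grep HSA_OVERRIDE_GFX_VERSION
unset HSA_OVERRIDE_GFX_VERSION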

I have no explanation for why Fedora 43 also had problems, as it uses ROCM 6.4.something and HSA_OVERRIDE_GFX_VERSION was set. Maybe the system packages just don’t play well with the Torch available from AMD.

Anyways, there you have it - a record, with fixes, for those who also find themselves struggling to get AMD’s torch packages to be stable on the gfx1151, at this particular moment in time. I mean, we all know it’s going to be completely different 3 months down the line, right? :grin:

EDIT: OK, so it wasn’t all that quick.
