New video on ROCm+Linux support for AMD Strix Halo, documenting working/stable configurations in January 2026 and what caused the original issues. We finally have a fully stable platform!
I created a GitHub Actions workflow to build llama.cpp for Strix Halo: GitHub - Lychee-Technology/llama-cpp-for-strix-halo. The repository builds llama.cpp for Strix Halo devices. It is quite stable and fast at running models like GPT-OSS 120B.
Great news. As a relative newcomer to Linux, sorry if this is a misplaced question, but for these updates do I need a more bleeding edge distro like Fedora, or is this update applicable to Ubuntu systems as well?
I’m not sure actually, I’ll wait for people who actually use Ubuntu to chip in. I think they MIGHT backport patches sometimes, so the versions of their packages might not match mainstream versions; honestly, I have no idea about Ubuntu.
I stick to Fedora as it’s a simple Linux distribution that just works and tracks mainstream packages without too many changes.
For Ubuntu there is some relevant discussion here: [Issue]: [gfx1151] Incorrect VGPR count causing crashes in ROCm 6.x/7.0.x/7.1.x/7.9/7.10 on Linux · Issue #2991 · ROCm/TheRock · GitHub (I use Fedora too … so I don’t really know about Ubuntu)
Update for Fedora 43 / rocm-6.4.4 is on the way.
What should I be doing here to get the AI Max+ 395 on Fedora to play well with ROCm/PyTorch? This is mucking up my workflow.
I saw ROCm 7.2 was released today, but only for Ubuntu/Debian. I hear it fixes a lot of problems. Right now, for my data training, I’m forced to use CPU only, which is way too slow. I don’t want to switch to Ubuntu.
Is ROCm 6.4.4 the way to go? How soon will it be live?
Appreciate the frustration, but without knowing your workflow it’s really hard to help you.
Is there anything not clear from my video that I could help you with? If you want ROCm/PyTorch on a latest kernel (6.18.4+), just use the TheRock builds:
```
python -m pip install \
  --index-url https://rocm.nightlies.amd.com/v2-staging/gfx1151/ \
  --pre torch torchaudio torchvision
```
These work well and have the latest stuff. You do not need to install Ubuntu.
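If you want to sanity-check the install afterwards, here is a minimal sketch of my own (not an official test) that confirms the wheel sees the GPU and can actually compute on it:

```python
# Minimal sanity check for the TheRock nightly wheels (my own sketch,
# not an official test). Run it inside the venv where torch was installed.
import torch

print("torch:", torch.__version__)
print("HIP runtime:", torch.version.hip)          # set on ROCm builds
print("GPU visible:", torch.cuda.is_available())  # HIP devices appear via the cuda API

if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x                     # small matmul to actually exercise the GPU
    torch.cuda.synchronize()      # surfaces async faults here, not later
    print("matmul OK, checksum:", y.sum().item())
```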
Patch on the way: rocm-runtime-6.4.2-3.fc43. It is in the testing repo (updates-testing) for now.
If you do not want to wait, you can pick it up on Fedora 43 with:

```
sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2026-e8113d69d1
```
Hi! Your video was actually crystal clear and easy to understand. Well done. I think the problem might be my own intelligence?
I installed the nightly in a Python 3.12.12 venv (had to downgrade from Python 3.14), and installed the nightly rocm-sdk-libraries-gfx1151.
I am running inference on 3D model datasets using PyTorch.
Your very clear advice:
- linux-firmware 2026010 or higher
- Linux kernel 6.18.4 or newer (see the kernel check sketch below)
- toolbox with rocm-7-nightly
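A quick way to verify the kernel floor, as a sketch of my own (not from the video):

```python
# Check that the running kernel meets the 6.18.4 floor (my own sketch).
import platform

MIN_KERNEL = (6, 18, 4)

release = platform.release()  # e.g. "6.18.5-200.fc43.x86_64"
version = tuple(int(p) for p in release.split("-")[0].split(".")[:3])

status = "OK" if version >= MIN_KERNEL else "too old"
print(f"kernel {release}: {status} (need >= {'.'.join(map(str, MIN_KERNEL))})")
```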
The GPU issues persist even with the venv PyTorch and nightly ROCm driver:

```
Kernel Version: 6.18.5-200.fc43.x86_64
OS: Fedora Linux 43
Firmware Build: 20260111
PyTorch version: 2.11.0a0+rocm7.11.0a20260121
ROCm/HIP: 7.2.53150
CUDA available: True
Device: Radeon 8060S Graphics (gfx1100)
Status: GPU detected but NOT functional
Error: Memory access fault (ROCm driver issue)
```
I think I followed your instructions pretty well, but perhaps I didn’t. I took care to run my GPU tests in the venv, since I cannot install the nightly ROCm system-wide. What am I doing wrong?
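For anyone triaging the same thing, here is a staged sketch (assuming a ROCm PyTorch build) that runs each GPU step separately, so the fault can be pinned to allocation, copy, or kernel launch. Caveat: a hard memory access fault may kill the process before the except clause ever runs.

```python
# Staged GPU triage (a sketch, not a definitive diagnostic): run allocation,
# copies, and a kernel launch one at a time to see which stage faults.
# Caveat: a hard memory access fault can abort the process before Python
# gets to handle any exception.
import torch

stage = "allocate"
try:
    x = torch.empty(256, 256, device="cuda")

    stage = "host->device copy"
    x.copy_(torch.randn(256, 256))

    stage = "kernel launch (matmul)"
    y = x @ x
    torch.cuda.synchronize()

    stage = "device->host copy"
    _ = y.cpu()

    print("all stages OK")
except RuntimeError as err:
    print(f"failed at stage: {stage}")
    print(err)
```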
Just to clarify, you have Strix Halo gfx1151 in a Framework Desktop, right?
Yes that is what I have
Do you have any idea then why the system is reporting a totally different GPU in your output?
Device: Radeon 8060S Graphics (gfx1100)
I apologize - I had an output program that I was using to give me diagnostics and for some reason it output a different gfx number.
Straight from `rocminfo` and `lspci`:

```
Name: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
PCI ID: 1002:1586 (Strix Halo)
Card SKU: STRXLGEN
Architecture: gfx1151
```
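To cross-check what PyTorch itself resolved against rocminfo, a small sketch (gcnArchName should be present on ROCm wheels; the getattr fallback is just defensive):

```python
# Print the GPU name and gfx arch as PyTorch resolved them, to compare
# against rocminfo's gfx1151 (sketch; gcnArchName is a ROCm-build field,
# hence the getattr fallback).
import torch

props = torch.cuda.get_device_properties(0)
print("name:", props.name)
print("arch:", getattr(props, "gcnArchName", "n/a (not a ROCm build?)"))
```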
Did you ever at any point set HIP variables to override the GPU detection?
I did, early on in troubleshooting, because I wanted to run some inference tests on CPU only, and I did it in a sloppy, tired way. I realized that as I typed my last response and immediately fixed it. However, the memory access fault persists; it seems to be a kernel-level GPU mapping issue.
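For anyone else hitting this, a quick sketch to list the override variables that commonly redirect ROCm’s device detection, since one left set in a shell profile can outlive the troubleshooting session:

```python
# List environment overrides that can redirect ROCm/HIP device detection
# (sketch; unset any that are left over from earlier troubleshooting).
import os

OVERRIDES = [
    "HSA_OVERRIDE_GFX_VERSION",  # forces a different gfx target
    "HIP_VISIBLE_DEVICES",       # hides or reorders HIP devices
    "ROCR_VISIBLE_DEVICES",      # same idea at the ROCr runtime level
    "CUDA_VISIBLE_DEVICES",      # honored by PyTorch's cuda shim as well
]

for var in OVERRIDES:
    value = os.environ.get(var)
    print(f"{var} = {value!r}" if value is not None else f"{var} unset")
```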
I’m a self-taught amateur here btw, so please be patient. I’m a physicist/materials scientist out of his depth, trying to build interesting things for science… I really appreciate you trying to help! And I subscribed to your videos. You explain things so well even a physicist can understand it.
I tried to use Cursor’s AI debug mode to help troubleshoot but was unable to resolve it. I had it write a markdown file describing the situation, and I’ll make it available here:
Have you checked if the other stuff works? Like, do the llama.cpp toolboxes work?
The patch for Fedora 43 / rocm-6.4.4 is available for everyone, so “native” ROCm is working again on Fedora 43…
For the Python + rocm-7.11…
I think I had the same problem: did you have the native Fedora ROCm installed?
If yes, I think I have a case where the “native” ROCm install interferes with python-therock.
So you can try to use a toolbox, or remove the native ROCm…
What did you use before (ROCm version, install method, Python packages…)? Did you use the Fedora python3-torch.x86_64 RPM? I never tried it and don’t know if it can use AMD ROCm.
Did not see that…
What do you have for:

```
sudo dmesg | grep kfd
```
In my case:

```
[ 5.350519] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 5.350535] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[ 5.351660] kfd kfd: amdgpu: added device 1002:1586
```
Note: there is no separate amdkfd driver; it is part of amdgpu.
### Current System State:
- ✅ ROCm user-space: Installed (ROCm 6.4.2 + some 7.1.1 packages)
- ✅ PyTorch ROCm: Installed (ROCm 7.11 nightly from gfx1151)
So I think that is the problem: two ROCm releases installed.
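One way to confirm the two installs are actually mixing inside one process (a sketch of my own, Linux-only): import torch, then check /proc/self/maps to see which ROCm/HIP libraries were loaded and from where.

```python
# After importing torch, list the ROCm/HIP shared libraries this process
# actually mapped (sketch, Linux-only). Paths from both the system ROCm
# (e.g. /usr/lib64) and the wheel's bundled runtime appearing together
# would confirm the two installs are mixing.
import torch  # noqa: F401  -- import first so its libraries get mapped

seen = set()
with open("/proc/self/maps") as maps:
    for line in maps:
        fields = line.split()
        path = fields[-1] if len(fields) >= 6 else ""
        if any(key in path for key in ("hip", "hsa", "rocm", "rocblas")):
            seen.add(path)

for path in sorted(seen):
    print(path)
```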