Background: I own a 16 inch, but I essentially ripped out the GPU (which was easy enough to do in a Framework laptop, though it was definitely not my original intent). I did this because, frankly, I could not figure out how to use the GPU for general-purpose computation (not necessarily LLMs, just any kind of GPGPU work, à la CUDA).
At the launch yesterday, somebody senior at AMD (responsible for their CUDA-equivalent stack, I think it was @AnushElangovan) mentioned the https://github.com/nod-ai/TheRock repo, which is supposedly the answer to ROCm questions going forward. I have no idea who nod-ai are or how/if they actually relate to AMD, but the utter lack of any tutorial or detailed plan for leveraging the GPUs / AI Max makes me seriously doubt that I would be able to take advantage of the Desktop for hobbyist computational work (again, not necessarily LLMs). This is a shame; my original plan was to buy “one of each” Framework product (I already have a 13 inch as well), and I’m a big fan of Framework’s big-picture mission statement.
Punch line: for the time being, I am really hesitant to order a Desktop. I want to “early adopt” and all that, but the ROCm story is just too flimsy.
Secondly, it was stunning how out-of-touch that AMD lead was with how bad the ROCm experience is for anyone using a consumer GPU that is not a 7900 XTX. He also talked up some awesome CI and contribution model. Maybe that exists internally or for large enterprise customers. Not for us normal folks, though.
Their software development processes really seem lacking, judging by the kinds of bugs and hacks that you see escape.
Seeding 100 desktops to open-source folks is great. Full stop. But realize that this costs less than one year of a single FTE hired to help fix things, and they clearly need to hire more than that.
The nod.ai team was acquired by AMD, and they have been talking a good talk. But so far (at least publicly) there are no results.
In conclusion:
Since the success of the Desktop now hinges on ROCm, I hope the Framework team can impress upon AMD the need to fix the ROCm SW ecosystem and to hire some actually good software architects (and give them the funds and authority) to improve their overall SW development practices.
@James3 Your thread illustrates the issue exactly. This is so janky.
FYI, the build you used from AMD did not include the chip-specific files you need for your FW16 GPU. Yes, using “export HSA_OVERRIDE_GFX_VERSION=11.0.0” can fool the code into running, but it will be neither stable (there are per-chip hardware bugs that the GPU-specific files work around) nor performant (the tuning for your chip’s caches and execution latencies is unknown).
Newer ROCm builds (and third-party builds) may include the .dat files for performance tuning, but it’s questionable how accurate those are, and they don’t address the stability issues; the workarounds for those need to be coded into the bundled internal compiler.
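If someone still wants to try the override as a stopgap, here is a minimal sketch of how I’d wire it up under PyTorch (assuming a ROCm build of PyTorch is installed; 11.0.0 corresponds to the gfx1100 ISA, which is why it can coax an unsupported RDNA3 chip into running):

```python
import os

# Stopgap only: pretend to be a gfx1100 (RX 7900-class) part. The variable
# must be set before the HIP runtime initializes, i.e. before importing torch.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "11.0.0"

import torch

# On ROCm builds of PyTorch, the torch.cuda API is backed by HIP.
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Tiny smoke test; expect possible instability on unsupported chips,
    # for exactly the reasons described above.
    x = torch.randn(1024, 1024, device="cuda")
    print((x @ x).sum().item())
```

Even when this runs, it is still subject to all of the stability and tuning caveats above.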
Wow, HSA. That’s quite the blast from the past still lurking in their code: Heterogeneous System Architecture, which they’ve been trying to get people to use since the Kaveri days of APUs.
Note the absence of Strix Halo (AI Max+ 395), as used by the Desktop. Yes, it’s true the Max+ 395 is new, but NVIDIA somehow manages day-one support for their hardware despite being closed-source and thus having no community help/testing.
Worse, note the absence of any of the FW16’s GPU options: neither the integrated nor the dedicated option has official ROCm support or testing, and this hardware has been out for a long time now.
I agree. If AMD really are supporting Framework and the FW Desktop for AI / ML / LLM models, applications, etc., then they (AMD) do need to add their APUs to the officially supported ROCm column.
I fully agree!
I won’t buy any more AMD hardware for LLM work until they support all of their AI-oriented product lines with ROCm. My FW16 with the 7700S is waiting.
If that were the case, I would certainly have been among the people preordering the FW Desktop.
Similar topic here: Status of AMD NPU Support - #9 by S.H
I have used ROCm with an RX 6700 XT and a 7800 XT so far and it works, but it needs the HSA… hack. From the posts in my thread, it looks like ONNX is the only way to leverage the NPU.
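For reference, a minimal sketch of the ONNX route (assumptions: onnxruntime is installed with an NPU-capable build, “VitisAIExecutionProvider” is my guess at the provider name for the Ryzen AI NPU, and “model.onnx” is a placeholder):

```python
import onnxruntime as ort

# See which execution providers this onnxruntime build actually offers.
available = ort.get_available_providers()
print("Available providers:", available)

# Ask for the (assumed) NPU provider first, falling back to CPU if absent.
preferred = [p for p in ("VitisAIExecutionProvider", "CPUExecutionProvider")
             if p in available]
session = ort.InferenceSession("model.onnx", providers=preferred)
print("Providers in use:", session.get_providers())
```

If the NPU provider never shows up in the available list, that build simply cannot reach the NPU, which matches what the posts in my thread suggest.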
At the FW 2nd Event, AMD were also on stage:
Anush Elangovan, VP of AI Software at AMD, talked about ROCm:
2023 - focused on day-zero support of models (Llama, DeepSeek)
2024 - performant day-zero support
2025 - focus on accessibility of ROCm
“We get PyTorch to work on all of AMD’s AI hardware, starting from the laptops, to the desktops, to the Instinct GPUs.”
So, if ROCm and PyTorch do not work for you today, it sounds like better support might be coming in 2025.
I consider this a hack, since it fools ROCm into thinking it is driving an RX 7900 XTX.
I think for the RX 6700 XT it was HSA_OVERRIDE_GFX_VERSION=10.3.0, but I am not sure; it has been a while since I used that card.
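If you’re unsure which override value applies, here is a small sketch for finding your card’s actual ISA target. The override value encodes a gfx target (10.3.0 corresponds to gfx1030, 11.0.0 to gfx1100), so you pick the supported target closest to your chip:

```python
import re
import subprocess

# rocminfo ships with ROCm and reports each agent's ISA, e.g. gfx1031.
out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
print("ISA targets found:", sorted(set(re.findall(r"gfx[0-9a-f]+", out))))
```

A gfx1031 card would then borrow 10.3.0, the closest officially built target.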
AMD ROCm™ is an open software stack including drivers, development tools, and APIs that enable GPU programming from low-level kernel to end-user applications. ROCm is optimized for Generative AI and HPC applications, and is easy to migrate existing code into.
I am not sure that AMD NPU will be included in ROCm.
Hmm, I recently installed Ollama and InvokeAI on my Arch-based machine with an RX 6800 and I’ve not had to mess with this variable. Things seem to be running just fine, and using radeontop and btop I’ve confirmed that during generation my CPU is largely idle.
There is a touch of instability in the Invoke app: sometimes, after I’ve thrown a lot into the render pipeline, the UI will crash, but it’s not frequent enough to really bother me and it seems to recover well. Mostly I think it’s just the Invoke UI rather than ROCm, as the majority of the time when I relaunch the UI, the GPU is still working on the output.
Your mileage may vary if you’re using Windows with WSL; is that the case?
I’m using Fedora Linux. But the RX 6800 is not the same as the RX 6700 XT in this regard: as far as I remember, the RX 6800 is gfx1030, whereas the RX 6700 XT is gfx1031 (which is/was not included in ROCm). But I may be wrong.
ROCm 6.3.3 seems to like the following GPU targets; others need the HSA override:
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1010.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1012.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1151.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1200.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx1201.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat
/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat
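You can check what your own install was built for without eyeballing the directory; a minimal sketch (adjust the ROCm version in the path to match your install):

```python
import glob
import re

# Each TensileLibrary_lazy_gfx*.dat file marks a GPU target rocBLAS was tuned for.
pattern = "/opt/rocm-6.3.3/lib/rocblas/library/TensileLibrary_lazy_gfx*.dat"
targets = sorted(re.search(r"gfx\w+", path).group() for path in glob.glob(pattern))
print("rocBLAS-tuned targets:", targets)
```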
Thanks for the info… I just assumed that they’d include all 6x00- and 7x00-series cards, but it seems not. This is definitely something they should expand on.
Agreed 100%. And it’s not just the missing .dat files, which other posters have pointed out are included in newer versions. FYI, on the newer builds you can get away without the HSA override on more RDNA3 chips.
The bottom line is that there is a reason AMD does not put these GPUs on the supported list: they simply have not yet worked out all the bugs/issues, nor are they regularly testing for them.
It’s good that they’ve released partial support. It’s bad that they haven’t actually finished the job.