AMD ROCm does not support the AMD Ryzen AI 300 Series GPUs

WSL is a VM; you don’t get full GPU access without dedicating the GPU to that VM (only server GPUs support that kind of virtualization), and on a FW13 you have only one GPU, so you can’t dedicate it to WSL.
ROCm support also requires special driver support for the KFD and amdgpu drivers. I don’t know whether AMD can do that on its own or whether some Microsoft support is needed (if it’s even possible).

So try an OS that supports it, like Fedora 42: SIGs/HC - Fedora Project Wiki

This is the “magic” of open source.
For NVIDIA, with their fully closed-source stack, “not supported by NVIDIA” means “not working”. With open source that is not the case: AMD officially supports some hardware and Linux distributions, and other distributions add support on their own. That is now the case for Fedora, and maybe others (Arch?).

1 Like

You’re right, but AMD still doesn’t support ROCm on the AI 300. I’m running Fedora and I can only use Vulkan.

I built the latest version from AMD’s TheRock repo. It looks like it does now support gfx1150 (the version in the FW13). I can run ComfyUI with WAN2.1, but both WAN2.2 and QWEN fail to work. WAN2.2 leaves the terminal unresponsive, and QWEN drops me back to the login prompt on Fedora 43 (Rawhide). Has anyone gotten further than me, or am I trailblazing here?

What does dmesg show when it crashes?

[ 1552.620116] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
[ 1552.620989] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
[ 1552.621050] amdgpu 0000:c1:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
[ 1552.621052] amdgpu 0000:c1:00.0: amdgpu: [drm] Check your /sys/class/drm/card1/device/devcoredump/data
[ 1552.621053] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=25933, emitted seq=25934
[ 1552.621056] amdgpu 0000:c1:00.0: amdgpu: Process ptyxis pid 4049 thread ptyxis pid 4049
[ 1552.621058] amdgpu 0000:c1:00.0: amdgpu: Starting gfx_0.0.0 ring reset
[ 1554.624900] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=RESET
[ 1554.624907] amdgpu 0000:c1:00.0: amdgpu: failed to reset legacy queue
[ 1554.624909] amdgpu 0000:c1:00.0: amdgpu: reset via MES failed and try pipe reset -110
[ 1554.624911] amdgpu 0000:c1:00.0: amdgpu: The CPFW hasn't support pipe reset yet.
[ 1554.624912] amdgpu 0000:c1:00.0: amdgpu: Ring gfx_0.0.0 reset failed
[ 1554.624915] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 1554.624943] amdgpu 0000:c1:00.0: amdgpu: Failed to evict queue 2
[ 1554.624944] amdgpu 0000:c1:00.0: amdgpu: Failed to evict queue 1
[ 1554.624946] amdgpu 0000:c1:00.0: amdgpu: Failed to evict queue 0
[ 1554.624947] amdgpu: Failed to suspend process pid 5742
[ 1554.628612] [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Failed to initialize parser -125!
[ 1556.714994] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 1556.715002] amdgpu 0000:c1:00.0: amdgpu: failed to unmap legacy queue
[ 1557.022593] [drm:gfx_v11_0_hw_fini [amdgpu]] ERROR failed to halt cp gfx
[ 1557.024089] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 1557.046205] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 1557.046541] [drm] PCIE GART of 512M enabled (table at 0x0000008000F00000).
[ 1557.046584] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 1557.050724] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 1557.060042] amdgpu 0000:c1:00.0: amdgpu: [drm] DMUB hardware initialized: version=0x09002600
[ 1557.233262] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 1557.233273] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 1557.233275] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 1557.233277] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 1557.233278] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 1557.233279] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 1557.233280] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 1557.233282] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 1557.233283] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 1557.233284] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 1557.233286] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 1557.233287] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec_0 uses VM inv eng 1 on hub 8
[ 1557.233288] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 1557.233290] amdgpu 0000:c1:00.0: amdgpu: ring vpe uses VM inv eng 4 on hub 8
[ 1557.249872] amdgpu 0000:c1:00.0: amdgpu: GPU reset(1) succeeded!
[ 1557.249885] amdgpu 0000:c1:00.0: [drm] device wedged, but recovered through reset
[ 1923.031978] rfkill: input handler enabled
[ 1923.067747] fbcon: Taking over console
[ 1923.078261] Console: switching to colour frame buffer device 282x94
[ 1926.957324] usb 1-1: reset full-speed USB device number 2 using xhci_hcd

I have been experimenting with ROCm on a FW16 7840HS.

The most recent experiments have been with TheRock, which uses ROCm version 7.x.

I was able to compile ROCm (the TheRock version) for my particular GPU.

I had to make sure all other ROCm installations were removed from my system, otherwise the build fails.

My tests have been with complex-number matrix multiplies using rocBLAS.

I get failures similar to yours.

I am still trying to debug the GPU code. I have not found any good tools yet.

I did find something out that might help you.

If I hipMalloc a 50000 x 50000 matrix and then use cgemm to multiply that 50000 x 50000 matrix, it crashes, similar to what you see.

If I instead hipMalloc a 50512 x 50512 matrix and then use cgemm to multiply the same 50000 x 50000 matrix, it works.

So, I think there is a buffer overflow bug in ROCm.

I mention this because, if you also over-allocate with hipMalloc like I do, your application might at least finish and work. It seems an OK workaround for now.
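For anyone wanting to try the same workaround, here is a rough sketch of the allocation arithmetic. The pad of 512 elements per dimension simply matches the 50000 → 50512 example above; it is an assumption on my part, not a documented requirement, and the real margin you need may differ:

```python
# Sketch of the over-allocation workaround described above: instead of
# asking hipMalloc for exactly n*n complex-float elements, allocate a
# padded (n + pad) x (n + pad) buffer, then run cgemm on the logical
# n x n matrix. The pad of 512 is taken from the 50000 -> 50512 example
# and is a guess, not a documented figure.

COMPLEX_FLOAT_BYTES = 8  # hipFloatComplex: two 32-bit floats

def padded_cgemm_bytes(n: int, pad: int = 512) -> int:
    """Bytes to hand to hipMalloc for an (n + pad)^2 complex-float matrix."""
    edge = n + pad
    return edge * edge * COMPLEX_FLOAT_BYTES

exact = padded_cgemm_bytes(50000, pad=0)  # the allocation that crashed
padded = padded_cgemm_bytes(50000)        # the allocation that worked
print(f"exact:  {exact / 1e9:.1f} GB")
print(f"padded: {padded / 1e9:.1f} GB (overhead {(padded - exact) / 1e9:.2f} GB)")
```

Note that the padded 50512² buffer costs roughly 0.4 GB more than the exact 20 GB allocation, so this only helps if you have that much headroom in GPU-visible memory.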

1 Like

Unsure if this helps, but running Ollama and DeepSeek R1 via Alpaca on Fedora, I’ve never succeeded in getting it to use the GPU (neither a 3950X + 6900XT nor a 7840U). I really think sitting out a hardware generation or two will save folks a lot of headaches. We’re in such a rapid state of transformation, particularly with consumer inference hardware, that ponying up for the new and shiny stuff will be really disappointing even if it works.

ROCm 7.0 is officially out, has anyone done any tests on Strix Point or the new AI 300 lineup?

Near as I can tell, ROCm 7.0 doesn’t officially support either Strix Point or Strix Halo.

1 Like

Apparently the AI 300 chips are also not supported. It’s such a farce that they put “AI” in the name.

4 Likes

fwiw, looks like they just merged in basic support to the upstream release branch: [hipblas] Add gfx1150, gfx1151, gfx1200 and gfx1201 support (#1334) · ROCm/rocm-libraries@1f98c0d · GitHub

:crossed_fingers: hope this means we may see this in an upcoming release.

Update:

Looks like it’s also been added to their roadmap, so it’s now being tracked.

6 Likes

Just when I was complaining. Maybe it works and I should do that more often :rofl:

1 Like

I’m looking forward to an Intel Core 200/Core 300 mainboard for the Framework 13. I didn’t upgrade to AI 300 because software support takes so long to arrive, or never arrives at all. My 7900XTX was so hard to get accelerating PyTorch… And I’m still on ROCm 6.4 under WSL, because the 7.0 update is Linux-only.

Intel makes OpenVINO; I’m now testing it on the N100-series chips for my robot.

Unlike AMD, Intel lists support even for their UHD integrated graphics, and Vulkan acceleration worked out of the box. So far I’ve gotten 3 T/s on Qwen 3 4B Q4M with no effort on the N100 with 8 GB.

Vulkan works great on all the AMD mainboards too. Is AMD being held to a different standard?

I think there is excitement about feature parity, since they just released all these “AI”-labeled boards. But, to your point, the Vulkan integration works really well, and it works now. It also looks like it performs better than ROCm for the moment, but YMMV.

Runtimes:

  • llama.cpp Vulkan: YES. It powers all LLMs fine and works great on AMD, Intel, and Nvidia
  • ONNX CUDA: YES (Nvidia has no iGPU)
  • Torch CUDA: YES (Nvidia has no iGPU)
  • ONNX ROCm: NO for the 760M
  • Torch ROCm: NO for the 760M
  • ONNX OpenVINO: YES for UHD
  • Torch OpenVINO: YES for UHD

As you can see, it’s not that AMD is being held to a different standard. It’s that AMD doesn’t bother to make binaries for their GPUs, and that isn’t just iGPUs like the 760M: all their boards take years to get ML acceleration, and even then primarily under Linux.

AMD is the outlier here.

1 Like

I would STRONGLY caution against buying specialized AI hardware until you see people using the software you want to use on that specific hardware.

1 Like

That is exactly what I am doing; I am still on a 7840U. I am salty about products being marketed for AI and then not working for that purpose for years. If ROCm worked, I’d buy an HX 370.

1 Like

Starting with ROCm 6.4.4, AMD has expanded (preview) support for PyTorch on Ryzen APUs.

For folks not yet on the latest ROCm (e.g., if you’re running Bluefin GTS, my distro of choice), AMD also publishes rocm/pytorch Docker images that fully utilize the iGPU:

$ docker container exec -it beautiful_johnson /bin/bash

ubuntu@3c37a46b0878:/machine-learning-with-python$ source /opt/venv/bin/activate
(venv) ubuntu@3c37a46b0878:/machine-learning-with-python$ python
Python 3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>>

Make sure you grab an image for the exact PyTorch version you want here. You could use the image directly, of course. Preferably, though, you’d create a devcontainer based on your chosen rocm/pytorch image for improved tooling integration.
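Beyond `torch.cuda.is_available()`, a slightly fuller sanity check can confirm the ROCm build is actually driving the iGPU. The sketch below uses only standard PyTorch APIs (`torch.version.hip` is set on ROCm builds, and a tiny matmul proves kernels launch); the `gpu_summary` helper name is mine, and the script degrades gracefully if torch or a GPU is absent:

```python
# Quick sanity check for a ROCm PyTorch setup, meant to be run inside a
# rocm/pytorch container. Reports what the stack can actually see and
# falls back gracefully when torch or a visible GPU is missing.

def gpu_summary() -> str:
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "torch installed, but no GPU visible"
    # ROCm builds report torch.version.hip instead of torch.version.cuda.
    hip = getattr(torch.version, "hip", None)
    name = torch.cuda.get_device_name(0)
    # A tiny matmul on the device proves kernels actually launch.
    x = torch.randn(64, 64, device="cuda")
    total = (x @ x).sum().item()
    return f"device={name}, hip={hip}, matmul ok (sum={total:.2f})"

print(gpu_summary())
```

On the 890M this should print the device name and a HIP version; if it reports “no GPU visible”, check that the container was started with the `/dev/kfd` and `/dev/dri` devices passed through.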

Finally, LM Studio works perfectly as well (via the Vulkan llama.cpp v1.52.0 runtime). I have it installed via their official AppImage (0.3.27 Build 4). Using LM Studio, I’m getting 20-23 tok/sec (impressive!) with the openai/gpt-oss-20b model.

Here’s my system config:

Model: Laptop 13 (AMD Ryzen AI 300 Series) (A9)
CPU: AMD Ryzen AI 9 HX 370 (24) @ 5.16 GHz
GPU: AMD Radeon 890M [Integrated]
Memory: 96 GB; 32 GB dedicated to iGPU in BIOS
Distro: Bluefin-dx:gts (Version: gts-41.20250928)
Kernel: Linux 6.15.10-100.fc41.x86_64

So, if you need AI chops in a Framework Laptop 13 chassis, I wouldn’t wait on the basis of “non-existent” local AI support: it’s perfectly usable for that use case right now, and it’s only going to improve from here on out. That said, this support was not available out of the gate, which is a concerning path many vendors seem to be taking (AMD and Apple come to mind).

I hope this helps.

4 Likes

Thank you for the information. Do you know why the Docker container with ROCm 7.0 seems to work with hardware acceleration, while installing ROCm 7.0 directly on the OS is apparently not ready? I was under the assumption that containers would come out after the release for direct installation on the OS, or maybe I’m misunderstanding the situation?

This made the 10 year old in me chuckle.

1 Like