Heh, turns out vLLM compiles and runs just fine without the NVIDIA-provided container. I just needed to set an environment variable specifying the arch.
In other news, there is some new activity in the amd-dev branch of the vLLM project, so hopefully some improvements are coming in the 0.11.1 release. But the amdsmi Python package is still crashing, so there is that.
What is strange is that the one from Fedora 42 works.
$ amd-smi version
AMDSMI Tool: 24.7.1+unknown | AMDSMI Library version: 24.7.1.0 | ROCm version: N/A
No, not this one. I mean the amdsmi Python module, and it only crashes on cleanup. The amd-smi command-line tool works.
Look, there is something wrong with amd-smi for ROCm 7+:
- Fedora 42 / ROCm 6.3:
$ amd-smi list
GPU: 0
BDF: 0000:c1:00.0
UUID: 00ff1586-0000-1000-8000-000000000000
KFD_ID: 29672
NODE_ID: 1
PARTITION_ID: 0
- Fedora 43 / ROCm 6.4:
$ amd-smi list
WARNING: User is missing the following required groups: render, video. Please add user to these groups.
GPU: 0
BDF: 0000:c1:00.0
UUID: 00ff1586-0000-1000-8000-000000000000
KFD_ID: 29672
NODE_ID: 1
PARTITION_ID: 0
- Fedora 44 / ROCm 7.0:
$ amd-smi list
WARNING: User is missing the following required groups: render, video. Please add user to these groups.
GPU: 0
BDF: N/A
UUID: N/A
KFD_ID: 29672
NODE_ID: 1
PARTITION_ID: 0
Note: all of this was done in a toolbox running on Silverblue 42 …
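To compare the listings above without eyeballing them, here is a minimal Python sketch that parses the plain-text output of `amd-smi list` and reports which fields come back as `N/A`. The function names are made up for illustration, the sample strings are copied from the outputs above, and the parser only assumes the `KEY: value` layout shown there; it is not based on any documented output format.

```python
def parse_amd_smi_list(text):
    """Parse plain-text `amd-smi list` output into one dict per GPU."""
    gpus = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, warnings, and anything that is not "KEY: value".
        if not line or line.startswith("WARNING:") or ":" not in line:
            continue
        # Split on the first colon only, so "0000:c1:00.0" stays intact.
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "GPU":
            gpus.append({"GPU": value})
        elif gpus:
            gpus[-1][key] = value
    return gpus


def missing_fields(gpu):
    """Return the names of the fields reported as N/A for one GPU."""
    return sorted(key for key, value in gpu.items() if value == "N/A")


# Sample outputs copied from the Fedora 43 and Fedora 44 toolboxes above.
ROCM_64_OUTPUT = """\
WARNING: User is missing the following required groups: render, video. Please add user to these groups.
GPU: 0
BDF: 0000:c1:00.0
UUID: 00ff1586-0000-1000-8000-000000000000
KFD_ID: 29672
NODE_ID: 1
PARTITION_ID: 0
"""

ROCM_70_OUTPUT = """\
WARNING: User is missing the following required groups: render, video. Please add user to these groups.
GPU: 0
BDF: N/A
UUID: N/A
KFD_ID: 29672
NODE_ID: 1
PARTITION_ID: 0
"""

if __name__ == "__main__":
    for label, output in (("ROCm 6.4", ROCM_64_OUTPUT), ("ROCm 7.0", ROCM_70_OUTPUT)):
        for gpu in parse_amd_smi_list(output):
            print(label, "GPU", gpu["GPU"], "missing:", missing_fields(gpu))
```

Feeding it the real output of `amd-smi list` from each toolbox (for example via `subprocess.run`) should make the difference between releases easy to diff.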
Strix Halo on Framework MB:
- FA: on
- mmap: off
- GGML_CUDA_ENABLE_UNIFIED_MEMORY=ON
- ngl: 999
- n_ubatch=4096
- backend: rocm
| model | size | params | test | t/s |
|---|---|---|---|---|
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp1 | 4.69 ± 0.00 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp2 | 9.19 ± 0.00 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp3 | 11.23 ± 0.00 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp4 | 12.86 ± 0.01 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp8 | 25.37 ± 0.04 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp12 | 37.53 ± 0.06 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp16 | 49.17 ± 0.08 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp24 | 70.87 ± 0.15 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp32 | 89.94 ± 0.45 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp48 | 122.01 ± 0.61 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp64 | 145.84 ± 0.60 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp96 | 207.52 ± 0.55 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp128 | 269.40 ± 0.95 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp192 | 229.28 ± 0.15 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp256 | 291.95 ± 0.70 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp384 | 358.48 ± 0.89 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp512 | 418.56 ± 0.65 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp768 | 401.40 ± 1.40 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp1024 | 438.28 ± 1.35 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp1536 | 439.35 ± 0.80 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp2048 | 438.40 ± 1.04 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp3072 | 432.32 ± 0.48 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp4096 | 423.00 ± 0.47 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | tg16 | 4.69 ± 0.00 |
| Mistral-Small-2506 | 43.91 GiB | 23.57 B | pp512+tg64 | 38.69 ± 0.01 |
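One way to read the table above is to look for the prompt size where throughput peaks and for the sizes where it falls compared to the previous, smaller size. The sketch below does just that in Python. The numbers are the pp rows copied from the table (mean t/s only, the ± part is dropped), and the helper names are hypothetical.

```python
# (prompt size, tokens/s) pairs taken from the pp rows of the table above.
PP_RESULTS = [
    (1, 4.69), (2, 9.19), (3, 11.23), (4, 12.86), (8, 25.37), (12, 37.53),
    (16, 49.17), (24, 70.87), (32, 89.94), (48, 122.01), (64, 145.84),
    (96, 207.52), (128, 269.40), (192, 229.28), (256, 291.95), (384, 358.48),
    (512, 418.56), (768, 401.40), (1024, 438.28), (1536, 439.35),
    (2048, 438.40), (3072, 432.32), (4096, 423.00),
]


def peak(results):
    """Return the (size, t/s) pair with the highest throughput."""
    return max(results, key=lambda row: row[1])


def drops(results):
    """Return the sizes whose throughput is lower than the previous size's."""
    return [
        size
        for (_, prev_tps), (size, tps) in zip(results, results[1:])
        if tps < prev_tps
    ]


if __name__ == "__main__":
    print("peak:", peak(PP_RESULTS))
    print("slower than the previous size:", drops(PP_RESULTS))
```

On these numbers the peak is at pp1536, and the two noticeable dips are at pp192 and pp768; the small declines from pp2048 upward look more like a plateau than a regression.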
The user in the toolbox is not a member of the required groups.
:~$ ll /dev/kfd
crw-rw-rw-. 1 root render 235, 0 25 oct. 11:21 /dev/kfd
:~$ ll /dev/dri/renderD128
crw-rw-rw-. 1 root render 226, 128 25 oct. 11:21 /dev/dri/renderD128
It is not needed on Fedora: the device nodes are rw for all users, so the check is “wrong”. I have never needed it on this OS (it may be needed on the server / CoreOS releases?).
And ROCm works fine without it.
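The argument here is that what matters is whether the process can actually open the device nodes, not which groups the user is in. A minimal Python sketch of testing that directly with `os.access`, which asks the kernel for its own permission decision. The function name is made up, and the two paths are simply the ones from the listing above; other machines may use a different render node.

```python
import os

# Device nodes taken from the listing above; adjust for other machines.
DEVICE_NODES = ["/dev/kfd", "/dev/dri/renderD128"]


def can_read_write(path):
    """True if the current process may open `path` for reading and writing."""
    return os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    for node in DEVICE_NODES:
        status = "ok" if can_read_write(node) else "NOT accessible"
        print(f"{node}: {status}")
```

If both nodes report ok inside the toolbox, the missing-groups warning is about group membership only, not about actual access.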
⬢ [zzzzzz@toolbx ~]$ getfacl /dev/dri/card1
getfacl: Removing leading '/' from absolute path names
# file: dev/dri/card1
# owner: nobody
# group: nobody
user::rw-
user:4294967295:rw-
group::rw-
mask::rw-
other::---
The user also has ACL rights on cardN.
I tried to add the groups:
sudo usermod -a -G video,render $LOGNAME
But it did not work: the user was not added to the groups.
Edit: I found how to add the user to these groups on the host, but not how to get them inside the toolbox. I have to look into what the “good” way to do that is (and then whether it is really needed…).
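Since group changes made with `usermod` normally only show up in new login sessions, it can help to check which groups the running process really has, both on the host and inside the toolbox. A small Python sketch using only the standard library; the function names are hypothetical, and `render` / `video` are the groups named in the amd-smi warning.

```python
import grp
import os

# Groups named in the amd-smi warning.
REQUIRED_GROUPS = ["render", "video"]


def current_group_names():
    """Return the names of the groups the current process belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # The gid has no entry in the group database (this can happen
            # inside containers); keep the number instead.
            names.add(str(gid))
    return names


def missing_groups(required, have):
    """Return the required group names that are not in `have`, in order."""
    return [name for name in required if name not in have]


if __name__ == "__main__":
    have = current_group_names()
    print("current groups:", sorted(have))
    print("missing:", missing_groups(REQUIRED_GROUPS, have))
```

Running it once on the host and once in the toolbox shows whether the groups are missing everywhere or only inside the container.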