It seems that Fedora 39 conflicts with the installation of
amdgpu-install-6.0.60002-1718218.22.noarch.rpm
with
Error: Transaction test error:
file /usr/bin from install of amdgpu-install-6.0.60002-1718218.22.noarch conflicts with file from package filesystem-3.18-6.fc39.x86_64
So the question is: how can we make sure we can use ROCm with the mobile AMD GPU in the Framework 16 under Fedora 39?
Examples in Python would be helpful.
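For context, the kind of minimal check I'd like to get working is something like this (it assumes a ROCm build of PyTorch is installed; on ROCm, PyTorch exposes the GPU through the `torch.cuda` API):

```python
import torch

# With the ROCm build of PyTorch, the CUDA API is the portability
# layer, so torch.cuda.* reports the AMD GPU.
print(torch.__version__)          # a ROCm build shows a +rocmX.Y suffix
print(torch.cuda.is_available())  # True if the GPU is visible to ROCm
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    # Small tensor op on the GPU as a smoke test.
    x = torch.rand(3, 3, device="cuda")
    print((x @ x.T).sum().item())
```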
Thank you Mario.
The article is nice, though it misses some elements for CuPy (I'm trying to find resources to get this last piece working).
Thank you again
After following all the steps of the article, I get "amdgpu.ids" not found errors.
I have tried setting all the environment variables I could find on the web to point to GPU ID #0, since the Framework lists the additional GPU first. I still get the same error when running PyTorch with ROCm 6.0.
Any idea?
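For reference, this is roughly what I've been trying: setting the ROCm device-selection variables before torch is first imported. The variable names come from the ROCm documentation; whether they help here, and what the right override value for the RX 7700S would be, is exactly my question:

```python
import os

# ROCm reads these when the HIP runtime initializes, so they must be
# set before the first `import torch` in the process.
os.environ["HIP_VISIBLE_DEVICES"] = "0"   # restrict HIP to device 0
os.environ["ROCR_VISIBLE_DEVICES"] = "0"  # same restriction at the ROCr level
# Some posts also suggest forcing a GFX target for chips ROCm doesn't
# officially support; the right value for the RX 7700S is a guess here:
# os.environ["HSA_OVERRIDE_GFX_VERSION"] = "11.0.0"

import torch  # must come after the env setup above

print(torch.cuda.device_count())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```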
I didn’t set any env variables and it just works for me. Not sure…
So I’m getting somewhere.
ROCm 6.0 does not properly handle the hybrid GPUs in my machine (under Fedora 40), whereas ROCm 5.7 handles them perfectly well.
Here are the details:
Projects/learning-python/pytorch via 🐍 v3.12.2
❯ python
Python 3.12.2 (main, Feb 21 2024, 00:00:00) [GCC 14.0.1 20240217 (Red Hat 14.0.1-0)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.2.1+rocm5.7
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.current_device()
0
>>> torch.cuda.device(0)
<torch.cuda.device object at 0x7f913c3b2360>
>>> torch.cuda.get_device_name(0)
'AMD Radeon RX 7700S'
>>> quit()
Projects/learning-python/pytorch via 🐍 v3.12.2 took 57s
❯ source .venv/bin/activate
Projects/learning-python/pytorch via 🐍 v3.12.1 (.venv)
❯ python
Python 3.12.1 | packaged by Anaconda, Inc. | (main, Jan 19 2024, 15:51:05) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
amdgpu.ids: No such file or directory
amdgpu.ids: No such file or directory
>>> print(torch.__version__)
2.3.0.dev20240312+rocm6.0
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.device(0)
<torch.cuda.device object at 0x7f0ff2000830>
>>> torch.cuda.get_device_name(0)
'AMD Radeon Graphics'
>>> quit()
So I guess I have to stick to ROCm 5.7 for now.
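For anyone hitting the same thing, a small script of standard `torch.cuda` calls makes the difference between the two builds easy to compare in one shot. On my machine the ROCm 5.7 build reports 'AMD Radeon RX 7700S' while the ROCm 6.0 nightly falls back to the generic 'AMD Radeon Graphics':

```python
import torch

# Print what the ROCm runtime exposes to PyTorch: version, device
# count, and per-device name plus memory.
print("torch", torch.__version__)
print("available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"device {i}: {props.name}, {props.total_memory // 2**20} MiB")
```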
Ah, that might explain the difference; I tested on Fedora 39 after all. Please file a bug upstream, since from your findings this sounds like a ROCm bug.