I’ve been able to get PyTorch + TensorFlow (as well as TinyGrad) working on the Framework 16.
Assuming you have the discrete GPU module, I would do the following:
- Use `amdgpu-install`. Personally, I installed a number of the additional packages, so my install command looked like: `sudo amdgpu-install --usecase=graphics,rocm,hip,mllib`. This will require a reboot after installation. I believe including `graphics` as a usecase should preserve external monitor support (mine work fine).
- Verify that you can run `rocm-smi` (which is the AMD equivalent to `nvidia-smi`).
- Optional (but recommended): install `amdgpu_top`. This requires that you have Rust installed, but it gives you a very nice interface and far more detailed information than the vanilla `rocm-smi`.
- I would recommend a good environment manager. Personally, I use mambaforge/miniforge; its dependency solver is way faster than vanilla Conda’s.
- Create an environment (whether through mambaforge or something simpler, like `venv`), and activate it.
- Install PyTorch from this official ROCm repo. I still have ROCm 6.0.2 installed, but I assume that if you download and run `amdgpu-install` today, it’ll likely use 6.1. Be sure to choose the `.whl` (Python wheel) from the correct folder, corresponding to the PyTorch edition you’re looking for.
- I’ve found that Ubuntu doesn’t always use the discrete GPU, and ROCm does not officially support the RX 7700S. So I’ve found it’s helpful to add the following environment variables to your `.bashrc` (or you can export them one time):
```bash
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export HCC_AMDGPU_TARGET=gfx1100
export PYTORCH_ROCM_ARCH=gfx1100
export TRITON_USE_ROCM=ON
export ROCM_PATH=/opt/rocm-6.0.2
export ROCR_VISIBLE_DEVICES=0
export HIP_VISIBLE_DEVICES=0
export USE_CUDA=0
```
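If you’d rather not edit `.bashrc`, the same overrides can also be set from Python before `torch` is first imported — a minimal sketch mirroring the exports above:

```python
import os

# These must be set before `import torch` (the ROCm runtime reads them
# when it initializes). Values mirror the .bashrc exports above.
overrides = {
    "HSA_OVERRIDE_GFX_VERSION": "11.0.0",
    "HCC_AMDGPU_TARGET": "gfx1100",
    "PYTORCH_ROCM_ARCH": "gfx1100",
    "TRITON_USE_ROCM": "ON",
    "ROCM_PATH": "/opt/rocm-6.0.2",
    "ROCR_VISIBLE_DEVICES": "0",
    "HIP_VISIBLE_DEVICES": "0",
    "USE_CUDA": "0",
}
for key, value in overrides.items():
    os.environ.setdefault(key, value)  # keep any values already exported in the shell
```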
What do these mean?
- If you have a newer version of ROCm, change the `ROCM_PATH` line to reflect the correct path.
- The visible-devices lines ensure that PyTorch (and other frameworks) use the 7700S, and not the iGPU.
- `gfx1100` is the architecture of the RX 7900 XTX, which to date is the only consumer card that ROCm officially supports. The 7700S’s “real” architecture designation is `gfx1102`, and if you don’t override that environment variable, ROCm PyTorch will error out.
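With the overrides in place (and the ROCm wheel installed), you can confirm that PyTorch actually sees the dGPU. ROCm builds of PyTorch reuse the `torch.cuda` API surface, so the usual calls work; the import guard below is just so the snippet doesn’t crash on a machine without `torch`:

```python
import importlib.util

if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this environment")
else:
    import torch

    print("GPU available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        # With HIP_VISIBLE_DEVICES=0, device 0 should be the RX 7700S.
        print("device 0:", torch.cuda.get_device_name(0))
```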
- Clone this official ROCm benchmarking repo, and run the benchmarks. Be sure that your Python environment with ROCm PyTorch is active in your terminal. For example, with my power profile set to “performance” and my laptop plugged in, I run (within that benchmarking repo folder):

```bash
python3 micro_benchmarking_pytorch.py --network resnext101 --batch-size 32 --iterations 400 --fp16 1 --compile
```

(The `--compile` flag activates PyTorch’s compilation step, which requires some overhead/setup time but leads to faster runs.)
I get this terminal output:

```
INFO: running forward and backward for warmup.
INFO: running the benchmark..
OK: finished running benchmark..
--------------------SUMMARY--------------------------
Microbenchmark for network : resnext101
Num devices: 1
Dtype: FP16
Mini batch size [img] : 32
Time per mini-batch : 0.365741006731987
Throughput [img/sec] : 87.49360725484475
```
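As a quick sanity check on those numbers, the reported throughput is just the mini-batch size divided by the time per mini-batch:

```python
batch_size = 32                      # images per mini-batch
time_per_batch = 0.365741006731987   # seconds, from the summary above

throughput = batch_size / time_per_batch
print(round(throughput, 2))  # 87.49 img/sec, matching the reported figure
```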
This should get you up and running with PyTorch. If, for whatever reason, you want to get TensorFlow working, I can dig up my notes. But be warned: ROCm support for TensorFlow seems to be in a much worse state than it is for PyTorch. For example, you have to install the nightly release; the mainline TensorFlow release will completely fail to run on the 7700S.
Addendum:
- I should credit this blog post for some ideas that helped me get my setup working. Note that their setup was smoother/simpler because they’re using an officially supported GPU (the desktop 7900 XTX), and they’re running a much older version of ROCm (5.5).