I’ve been able to get PyTorch + TensorFlow (as well as TinyGrad) working on the Framework 16.
Assuming you have the discrete GPU module, I would do the following:
- Use `amdgpu-install`. Personally, I installed a number of the additional packages, so my install command looked like: `sudo amdgpu-install --usecase=graphics,rocm,hip,mllib`. This will require a reboot after installation. I believe including `graphics` as a usecase should preserve external monitor support (mine work fine).
- Verify that you can run `rocm-smi` (which is the AMD equivalent to `nvidia-smi`).
- Optional (but recommended): install `amdgpu_top`. This requires that you have Rust installed, but it gives you a very nice interface and far more detailed information than the vanilla `rocm-smi`.
- I would recommend a good environment manager. Personally, I use mambaforge/miniforge; its dependency solver is way faster than vanilla Conda’s.
- Create an environment (whether through mambaforge or something simpler, like `venv`), and activate it.
- Install PyTorch from this official ROCm repo. I still have ROCm 6.0.2 installed, but I assume that if you download and run `amdgpu-install` today, it’ll likely use 6.1. Be sure to choose the `.whl` (Python wheel) from the correct folder, corresponding to the PyTorch edition you’re looking for.
- I’ve found that Ubuntu doesn’t always use the discrete GPU, and ROCm does not officially support the RX 7700S. So I’ve found it’s helpful to add the following environment variables to your `.bashrc` (or you can export them one time):
```bash
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export HCC_AMDGPU_TARGET=gfx1100
export PYTORCH_ROCM_ARCH=gfx1100
export TRITON_USE_ROCM=ON
export ROCM_PATH=/opt/rocm-6.0.2
export ROCR_VISIBLE_DEVICES=0
export HIP_VISIBLE_DEVICES=0
export USE_CUDA=0
```
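If you’d rather not edit `.bashrc`, the same overrides can also be set from Python before `torch` is first imported — a minimal sketch mirroring the exports above:

```python
import os

# These must be set before `import torch` (the ROCm runtime reads them
# when it initializes). Values mirror the .bashrc exports above.
overrides = {
    "HSA_OVERRIDE_GFX_VERSION": "11.0.0",
    "HCC_AMDGPU_TARGET": "gfx1100",
    "PYTORCH_ROCM_ARCH": "gfx1100",
    "TRITON_USE_ROCM": "ON",
    "ROCM_PATH": "/opt/rocm-6.0.2",
    "ROCR_VISIBLE_DEVICES": "0",
    "HIP_VISIBLE_DEVICES": "0",
    "USE_CUDA": "0",
}
for key, value in overrides.items():
    os.environ.setdefault(key, value)  # keep any values already exported in the shell
```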
What do these mean?
- If you have a newer version of ROCm, change the `ROCM_PATH` line to reflect the correct path.
- The visible-devices lines ensure that PyTorch (and other frameworks) use the 7700S, and not the iGPU.
- `gfx1100` is the architecture of the RX 7900 XTX, which to date is the only consumer card that ROCm officially supports. The 7700S’s “real” architecture designation is `gfx1102`, and if you don’t override that environment variable, ROCm PyTorch will error out.
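With the overrides in place (and the ROCm wheel installed), you can confirm that PyTorch actually sees the dGPU. ROCm builds of PyTorch reuse the `torch.cuda` API surface, so the usual calls work; the import guard below is just so the snippet doesn’t crash on a machine without `torch`:

```python
import importlib.util

if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this environment")
else:
    import torch

    print("GPU available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        # With HIP_VISIBLE_DEVICES=0, device 0 should be the RX 7700S.
        print("device 0:", torch.cuda.get_device_name(0))
```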
- Clone this official ROCm benchmarking repo, and run the benchmarks. Be sure that your Python environment with ROCm PyTorch is active in your terminal. For example, with my power profile set to “performance” and my laptop plugged in, I run (within that benchmarking repo folder):

```bash
python3 micro_benchmarking_pytorch.py --network resnext101 --batch-size 32 --iterations 400 --fp16 1 --compile
```

(The `--compile` flag activates PyTorch’s compilation step, which requires some overhead/setup time but leads to faster runs.)
I get this terminal output:

```
INFO: running forward and backward for warmup.
INFO: running the benchmark..
OK: finished running benchmark..
--------------------SUMMARY--------------------------
Microbenchmark for network : resnext101
Num devices: 1
Dtype: FP16
Mini batch size [img] : 32
Time per mini-batch : 0.365741006731987
Throughput [img/sec] : 87.49360725484475
```
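As a quick sanity check on those numbers, the reported throughput is just the mini-batch size divided by the time per mini-batch:

```python
batch_size = 32                      # images per mini-batch
time_per_batch = 0.365741006731987   # seconds, from the summary above

throughput = batch_size / time_per_batch
print(round(throughput, 2))  # 87.49 img/sec, matching the reported figure
```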
This should get you up and running with PyTorch. If, for whatever reason, you want to get TensorFlow working, I can dig up my notes. But be warned: ROCm support for TensorFlow seems to be in a much worse state than it is for PyTorch. For example, you have to install the nightly release; the mainline TensorFlow release will completely fail to run on the 7700S.
Addendum:
- I should credit this blog post for some ideas that helped me get my setup working. Note that their setup was smoother/simpler because they’re using an officially supported GPU (the desktop 7900 XTX), and they’re running a much older version of ROCm (5.5).