Installing ROCm / HIPLIB on Ubuntu 22.04

Hey folks - I’m loving my Framework so far, but I wanted to ask whether anyone has a good tutorial for installing ROCm and HIP on the Framework 16 under Ubuntu 22.04.

I’m currently running into issues with the amdgpu-install tool that AMD provides - it’s not installing the amdgpu-dkms package. I am sure I can resolve this with enough time ( I have a rough idea of what’s going on there ), but what I want to know is:

  1. Does anyone have an installation guide for CuPy / PyTorch that they have successfully used here? I am a bit out of my depth when it comes to understanding driver installations, and any reference ( even if it is literally a manual to RTFM ) would help.
  2. Does anyone know how to do this without disabling external monitors? When I ran amdgpu-install with the usecases I wanted, it disabled my external monitor support. I imagine this is because I didn’t specify the expected usecases, but I’d love to know if there’s a good way to make sure that this works.

Thank you all so much in advance!

I’ve been able to get PyTorch + TensorFlow (as well as TinyGrad) working on the Framework 16.

Assuming you have the discrete GPU module, I would do the following:

  1. Use amdgpu-install. Personally, I installed a number of the additional packages, so my install command looked like: sudo amdgpu-install --usecase=graphics,rocm,hip,mllib. This will require a reboot after installation. I believe including graphics as a usecase should preserve external monitor support (mine work fine).
  2. Verify that you can run rocm-smi (which is the AMD equivalent to nvidia-smi).
  3. Optional (but recommended), install amdgpu_top. This will require that you have Rust installed, but it gives you a very nice interface, and far more detailed information than the vanilla rocm-smi.
  4. I would recommend a good environment manager. Personally, I use mambaforge/miniforge. The solver (for dependencies) is way faster than vanilla Conda.
  5. Create an environment (whether through mambaforge or something simpler, like venv), and activate it.
  6. Install PyTorch from this official ROCm repo. I still have ROCm 6.0.2 installed, but I assume if you download and run amdgpu-install today, it’ll likely use 6.1. Be sure to choose the .whl (Python wheel) from the correct folder, corresponding to the PyTorch edition you’re looking for.
  7. I’ve found that Ubuntu doesn’t always use the discrete GPU. And, officially ROCm does not support the RX 7700S. So, I’ve found that it’s helpful to add the following environment variables to your .bashrc (or you can export these one time):
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export HCC_AMDGPU_TARGET=gfx1100
export PYTORCH_ROCM_ARCH=gfx1100
export TRITON_USE_ROCM=ON

export ROCM_PATH=/opt/rocm-6.0.2
export ROCR_VISIBLE_DEVICES=0
export HIP_VISIBLE_DEVICES=0
export USE_CUDA=0

What do these mean?

  • If you have a newer version of ROCm, change the ROCM_PATH line to reflect the correct path.
  • The visible devices lines are so that PyTorch (and other frameworks) use the 7700S, and not the iGPU.
  • gfx1100 is the architecture for the RX 7900XTX, which is to date the only consumer card that ROCm officially supports. The 7700S has a “real” architecture designation of gfx1102, and if you don’t modify that environment variable, ROCm PyTorch will error out.
  8. Clone this official ROCm benchmarking repo, and run the benchmarks.

Be sure that your Python environment with ROCm PyTorch is active in your terminal.
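
If you want to double-check that the environment variables from step 7 actually made it into your shell, a rough sketch like the one below might help. The helper name check_rocm_env is made up for illustration, and the expected values are simply the ones from my setup above (an assumption for anyone on a different ROCm version or GPU).

```python
import os

# Values taken from the exports in this post; adjust for your own setup (assumption).
EXPECTED = {
    "HSA_OVERRIDE_GFX_VERSION": "11.0.0",
    "HIP_VISIBLE_DEVICES": "0",
    "ROCR_VISIBLE_DEVICES": "0",
}


def check_rocm_env(env=None):
    """Hypothetical helper: list (name, expected, actual) for variables that look wrong."""
    env = os.environ if env is None else env
    problems = []
    for name, expected in EXPECTED.items():
        actual = env.get(name)
        if actual != expected:
            problems.append((name, expected, actual))
    # ROCM_PATH depends on the installed version, so only check that it exists.
    rocm_path = env.get("ROCM_PATH")
    if rocm_path is None or not os.path.isdir(rocm_path):
        problems.append(("ROCM_PATH", "an existing directory", rocm_path))
    return problems


if __name__ == "__main__":
    for name, expected, actual in check_rocm_env():
        print(f"{name}: expected {expected!r}, got {actual!r}")
```

If it prints nothing, the variables match the values listed above. It only inspects the environment; it does not touch the driver or the GPU.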

For example, with my power profile set to “performance” and my laptop plugged in, I run (within that benchmarking repo folder):

python3 micro_benchmarking_pytorch.py --network resnext101 --batch-size 32 --iterations 400 --fp16 1 --compile

(The compile flag activates PyTorch’s pre-compile function, which requires overhead/setup time, but leads to faster runs).

I get this terminal output:

INFO: running forward and backward for warmup.
INFO: running the benchmark..
OK: finished running benchmark..
--------------------SUMMARY--------------------------
Microbenchmark for network : resnext101
Num devices: 1
Dtype: FP16
Mini batch size [img] : 32
Time per mini-batch : 0.365741006731987
Throughput [img/sec] : 87.49360725484475
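
As far as I can tell from the numbers, the throughput line is just the mini-batch size divided by the time per mini-batch. A quick sketch to reproduce it (the function name is my own, not something from the benchmarking repo):

```python
def throughput(batch_size, seconds_per_batch):
    """Images per second for a given mini-batch size and time per mini-batch."""
    return batch_size / seconds_per_batch


# Numbers copied from the summary above.
print(round(throughput(32, 0.365741006731987), 2))  # 87.49
```

This is handy if you want to compare runs with different batch sizes on equal footing.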

This should get you up and running with PyTorch. If, for whatever reason, you want to get TensorFlow working, I can dig up my notes. But, be warned that it seems ROCm support for TensorFlow is in a much worse state than PyTorch. For example, you have to install the nightly release. The mainline TensorFlow release will completely fail to run on the 7700S.
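
If the benchmark fails, or you’re not sure the ROCm build of PyTorch is the one being picked up, a small check along these lines may narrow it down. This is only a sketch: describe_torch_gpu is a name I made up, and it relies on ROCm builds of PyTorch exposing the GPU through the usual torch.cuda calls.

```python
def describe_torch_gpu():
    """Summarize what PyTorch reports about the GPU; also works if torch is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # A version string on ROCm builds, None on CUDA or CPU-only builds.
        "hip_version": getattr(torch.version, "hip", None),
        # ROCm builds report the GPU through the torch.cuda API.
        "gpu_available": torch.cuda.is_available(),
    }
    if info["gpu_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_gpu().items():
        print(f"{key}: {value}")
```

If hip_version comes back as None, you most likely installed a non-ROCm wheel. If gpu_available is False, the environment variables from step 7 are the first thing I would re-check.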

Addendum:

  • I should credit this blog post for some ideas that helped me get my setup working. Note that their setup was smoother/simpler because they’re using an officially supported GPU (the desktop 7900 XTX), and they’re running a much older version of ROCm (5.5).

Thank you for the detailed information!

In case anyone else finds this - here are some notes I had when going through it. YMMV, and I’m going to edit as I run through it all:

  1. In terms of use-cases, I’m using pretty much all of them, as I also want to get CuPy working for spaCy and figured “why not”. If I come to regret this (likely), I will add more information here.

  2. I had to specify 6.0.2 for installation to work with Ubuntu 22.04 and the associated version of the amdgpu-install tool that AMD provides.
