A Framework 13 Ryzen AI 9 success story

Hi everyone,

Since I saw quite a few very emotionally charged postings about problems with the new model, I wanted to share my story, because it seems people rarely post when stuff just works.

I have been a happy frame.work user since the very first Gen 11 model, which I basically ordered a few minutes after pre-orders for my country had opened. I upgraded it last year to a Ryzen 7040, which chugs along happily. In the meantime I also got another 11th-gen and a 12th-gen at fair prices from eBay for my wife and kid. All of them run Linux, and - apart from a strange sound issue on the Ryzen, which I have been procrastinating on investigating for months now because I rarely ever need the built-in speakers - everything works flawlessly.

Since my work machine was an almost seven-year-old Dell and I was long overdue for a replacement, I finally managed to have my company order a specced-out Ryzen AI 9 frame.work for me. It came with 96GB of RAM and a 2TB SSD.

I updated my EndeavourOS/KDE install on the Dell, pulled out the SSD, cloned it onto the frame.work SSD, and after some minor hassle loading UEFI signing keys, needed for Secure Boot to work, it came up and just worked.
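For anyone hitting the same Secure Boot hurdle on an Arch-based distro: a minimal sketch of what the key enrollment could look like with sbctl (this is my own assumption of the steps, not the exact procedure from my setup; whether you need the `-m` flag depends on your firmware):

```
# Minimal sketch, assuming Arch/EndeavourOS with sbctl installed.
# Check the current Secure Boot and key state
sbctl status

# Create your own signing keys and enroll them
# (firmware must be in setup mode; -m also enrolls Microsoft's keys)
sudo sbctl create-keys
sudo sbctl enroll-keys -m

# Sign the kernel so Secure Boot accepts it, then verify
sudo sbctl sign -s /boot/vmlinuz-linux
sudo sbctl verify
```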

I then removed all NVIDIA drivers, installed everything AMD and ROCm, and updated all firmware using fwupdmgr.
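In case it helps anyone, a minimal sketch of that driver swap and firmware update on an Arch-based system (exact package names are my assumption; check what is actually installed on your machine first):

```
# Minimal sketch, assuming Arch/EndeavourOS; package names may differ.
# Remove the NVIDIA stack left over from the old machine
sudo pacman -Rns nvidia nvidia-utils nvidia-settings

# Install the AMD userspace and the ROCm runtime
sudo pacman -S mesa vulkan-radeon rocm-hip-runtime

# Refresh firmware metadata and apply any available updates
fwupdmgr refresh
fwupdmgr update
```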

Alongside my normal Linux desktop use, I am constantly running several Docker containers as well as VMs with virt-manager, one of them a Windows VM, and everything runs buttery smooth on those 24(!) hardware threads. I do have to boot a non-LTS kernel, though - otherwise suspend will not work due to driver issues with the brand-new hardware.
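If you are on the LTS kernel and hit the same suspend issue, switching to the mainline kernel on an Arch-based system is straightforward (a minimal sketch; bootloader details depend on your setup):

```
# Minimal sketch, assuming Arch/EndeavourOS with linux-lts installed.
# Install the current mainline kernel alongside the LTS one
sudo pacman -S linux linux-headers

# Regenerate the boot menu so the new kernel shows up (GRUB example)
sudo grub-mkconfig -o /boot/grub/grub.cfg

# Reboot and pick the non-LTS entry in the boot menu
```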

I assigned the maximum possible 32GB of VRAM to the GPU in the BIOS (I still need to research how I can increase the amount beyond that) and got ollama-rocm (from the AUR) to run by starting it like this:

```
HSA_OVERRIDE_GFX_VERSION="11.0.0" ollama serve
```
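To check that the GPU is actually being used instead of silently falling back to the CPU, something like this works (my own sketch; the model name is just an example):

```
# Minimal sketch: pull a model and watch GPU utilization while it generates.
ollama pull mistral
ollama run mistral "Explain UEFI Secure Boot in one paragraph."

# In a second terminal, watch VRAM usage and GPU load climb
watch -n 1 rocm-smi
```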

This setup runs many popular AI models with up to 32 billion parameters at acceptable speeds (for good results, I deem 1 token/sec still acceptable). Smaller models like "neural-chat:latest" run fast enough for pretty natural conversation speeds. If you want to try local AI with ollama, I strongly recommend installing the Firefox extension "Page Assist" as an easy-to-use frontend. Maxing out the GPU for AI over extended periods does generate significant fan noise, though, which might be an issue for more sensitive beings. Using a laptop stand with an integrated fan helps a bit. Also, the GPU draws an extra 25 watts when running constantly, so you had better run your LLM stuff on mains.
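If you would rather script against the local server than use a browser frontend, ollama also exposes a small HTTP API on port 11434; a minimal sketch (the model name is just an example):

```
# Minimal sketch: query the local ollama server directly over HTTP.
curl http://localhost:11434/api/generate -d '{
  "model": "neural-chat",
  "prompt": "Give me three uses for a spare laptop SSD.",
  "stream": false
}'
```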

The HD screen that comes with the latest high-end models is really nice, too. I ordered the semi-transparent gray screen bezel, which adds a nice, slightly technoid flair to the machine's appearance. Initially, I was concerned that I might not like the slightly rounded corners, but in day-to-day use, I rarely notice them at all.

My verdict after two weeks of use: the FW 13 with the AI 9 CPU and lots of RAM is an awesome piece of kit, providing LOTS of power in a pretty slim package. Looking forward to being happy with this for a long time!


Thank you for sharing! 🧡

Hi, do you have any guides for running ollama on the XDNA accelerator? I'd like to try it but haven't even started to look, and since you mention you run models locally, I thought I'd ask.

Wait, wut? ollama-rocm tells me explicitly that this GPU revision is not supported yet and defaults to the CPU.

I am on Arch with the latest packages.

Setting the environment variable (see my post above) is required to override the internal GPU support check and use the GPU anyway.

OK, got it working - only a couple of models actually use the GPU:

```
sudo systemctl edit ollama.service
```

Then add the following drop-in:

```
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
```

Restart the service with `sudo systemctl restart ollama.service`, open htop, and run mistral, for example - you will see one thread on the CPU and a bunch of GPU load.