When I start the ollama service on my Arch Linux install and examine its status, I can see the ollama service looking for GPUs. The message I get is to visit AMD.com and install the Linux drivers, as no GPU is found.
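For anyone who wants to see the same messages, this is roughly how I checked (assuming ollama was installed as a systemd service, which is the default on Arch):

```shell
# Show the service status, then grep the journal for GPU discovery messages
systemctl status ollama
journalctl -u ollama --no-pager | grep -iE 'gpu|amdgpu|rocm'
```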
The messages about an unsupported Radeon iGPU and no compatible amdgpu devices detected are a little frustrating to see, as I bought the computer thinking it could do some AI computing, but apparently it cannot even run ollama on the GPU!!
What is the status of this as of August 2025? Am I still left to wonder if and when this machine will ever be able to run an LLM?
In the meantime, if I cannot use ollama to download and run LLM models, are there any alternatives people are using with these little machines? I am curious, because without the GPU, whenever I run even a smallish model of, say, < 10GB in my 64GB of RAM, the machine just fires up the fans, maxes out most of the CPU cores, and spends 5 minutes smoking itself to death trying to answer the simplest of prompts like “hi, how are you?”…
Thanks for the info, I feel like puking now. I shoulda returned this useless brick when I had the chance! It stings that, with all my experience, I still let frivolous false advertising cloud my judgement and wasted money, conceptually anyway. I can still use this for word processing at least, maybe juggle the odd spreadsheet.
The trick with llama.cpp was to learn as I went. First, I naively installed it and tried running it. That worked to the point where it accepted input, and then blew up because ROCm has no support for gfx1150. OK! We got somewhere!
Then it was clear I could instead compile llama.cpp without GPU support. I tried that, fed it input, and voilà: it accepted input and generated output. So even though it was pinning 12 CPU cores and 24 threads with compute business, the AI in the AI framework did something! Glory!
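For the record, the CPU-only build was nothing exotic; it went roughly like this (the model path is a placeholder for whatever GGUF file you have downloaded):

```shell
# Clone and build llama.cpp with no GPU backend enabled (pure CPU inference)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Run a prompt against a local GGUF model (model path is a placeholder)
./build/bin/llama-cli -m ~/models/some-model.gguf -p "hi, how are you?"
```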
Then I went all-out crazy and recompiled llama.cpp against the Vulkan SDK, which lets it offload the processing to the AMD Radeon GPU without needing ROCm at all, and tested that with llama.cpp. Now the input I fed llama.cpp generated output, and the entire processing effort pinned the GPU of the AMD AI chip at 100%.
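The Vulkan rebuild was roughly the following on Arch (the package names are what I'd expect from the repos, so double-check them, and `radeontop` is just one way to watch the GPU get pinned):

```shell
# Install the Vulkan toolchain, loader, and shader compiler (Arch package names)
sudo pacman -S vulkan-headers vulkan-icd-loader vulkan-tools shaderc

# Reconfigure llama.cpp with the Vulkan backend enabled and rebuild
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j

# Confirm the iGPU is visible to Vulkan before running llama-cli again
vulkaninfo --summary
```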
I am not sure this qualifies the Framework 13 HX 370 with its unsupported Radeon 890M as now supported, but it made me feel better that instead of pinning 12 CPU cores to answering a simple prompt, I pinned the GPU.
So in a way, I got what I asked for, to tide me over until one of two things (maybe three) happens:
1. this turns out to be a slow, dumb alteration of the universe
2. it is valid, AMD never supports the 890M, and thus this is as good as it ever gets
3. AMD one day supports the gfx1150 / Radeon 890M, and once that lands in the drivers, I get even better workload balance between CPU and GPU.
I am happy to share and discuss this further. It seems relevant to anyone who bought into the myth of the Framework “AI ready” Ryzen AI 300 series.
Agreed, this machine doesn’t live up to the early promises, but it’s not bad.
I don’t know if this makes things better or worse, but the AI 300 series is mostly constrained by its meager RAM bandwidth (90GB/s)… Even if ROCm worked perfectly today, it probably wouldn’t make tokens/sec more than 20% faster.
No matter where you’re running the layers (Vulkan/ROCm/CPU), the DRAM is running reasonably close to its theoretical limit. (looking on the bright side: this is a quick CPU!)
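Back-of-envelope, since token generation is bandwidth-bound (each generated token streams essentially all the active weights from RAM), the ceiling falls out of simple division; the ~9 GB figure is just an illustrative model size at 4-bit quantization:

```shell
# Upper bound on decode speed: DRAM bandwidth / bytes read per token.
# ~90 GB/s bandwidth, ~9 GB of weights streamed per token:
echo $(( 90 / 9 ))   # ~10 tokens/sec ceiling, before any compute or cache effects
```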
The Desktop is a step in the right direction … hoping to see way more lanes in future laptops too.
It’s tough to lay this at the feet of Framework, as it’s a software issue. My simplistic solution is running Jeffser’s Ollama frontend Alpaca, which hasn’t worked at all since he separated Ollama out of his frontend. I have Ollama, the ROCm plugin, ROCm itself, and Alpaca all running simultaneously, but it’s all ships passing in the night. Which I interpret to mean: the software side of things is still a flaming disaster that only devoted devs/hobbyists can make functional. Which is fine, and I’m glad that they’re pushing things forward for the rest of us, and I support them where I can. But for the merely tech-savvy user, usable AI software is still likely years away. I use Fedora, so I’m curious to see what IBM can push forward as well, though I am strongly in favor of local inference vs remote. I want hardware with an AI coprocessor, but I think it’s smart to wait until the software demonstrates functionality before spending that money.
The things I lay at the feet of Framework are things they have the ability to deal with: communications. Write up a little story about how best to harness this marvellous AI 300 series chipset. Tell us what it can do today, and what it might do tomorrow. Obviously they don’t have to tell stories to sell hardware, but it sure would help, as we devoted tech nerds who have been computing since the 1970s could always use a good chuckle before parting with $4000 for something something that might do something but probably does not do something something, AI…
You’re right though, this is not a vicious gripe, and the same problem exists at even the most golden of the golden companies today: they are great at sales without sharing much in the way of substantive stories about how their products improve our lives. They leave that discovery to us, fair enough.
The issue is, I don’t think that Framework knows, because even AMD, IBM, and Nvidia don’t know. What does the enthusiast CPU inference hardware look like in 2030? I bet you could make a lot of money at any of those companies if you could convincingly (maybe not even correctly) predict that. TOPS? How many? Shared memory? GPU/CPU/TPU into an APU? Does that use cache, standard DIMMs, on-chip RDRAM, or all of the above? How much of each? What balance of lower power (0?) vs full fat cores? How much GPU processing power? Is resolution high enough that anti-aliasing is no longer needed? Does AI frame generation decrease the need for GPU power and free up watts/die space for the TPU?
I guess my point is that the very tip of the spear is flying pretty blind right now, so that puts OEMs like Framework in the bind of “we can’t in good faith release a product that we aren’t sure will work 100% as consumers expect” vs “here’s the best that the best can make, we packaged it as best we can and hope it works for you”. Neither stance is perfect, and neither one is gonna make everyone happy.
Agreed, but Framework needs to be more careful about copying AMD’s empty promises into their promotional material. The AI 300 intro blog post said (or at least strongly hinted) this would be a good platform for ROCm. Wrong!
I think the Mac Studio shows us what workstations will look like in 4 years: unified memory with massive bandwidth through extreme parallelization. The RAM will be soldered down, because, even if there was room to make thousands of individual high speed connections, that’ll be too fiddly to be reliable. The Desktop is a first baby step in this direction.
I claim Apple is committing a crime by soldering down its storage, but I just don’t see a way around soldered memory in high performance computers.
As far as gaming, dunno. Rendering architectures seem to be stagnating. It’s like GPU designers are distracted by something else these days?