Note, however, that these are LLMs created by AMD, not the popular LLMs you would expect an “AI” CPU to be able to run.
From gfx1150 support (890M) not working · Issue #40 · likelovewant/ollama-for-amd:
Interestingly, inference performance stays the same [when using a build of ROCm patched to support the APU], about 7.1-7.8 tok/s for `deepseek-r1:14b` q4 on this hardware (AMD HX 370, LPDDR5X 7500), when I run all of these variants:
- Stock Ollama CPU, num_threads=20 (it’s a 12-core + SMT CPU)
- llama.cpp with Vulkan
- llama.cpp with AVX512
- ollama-for-amd with ROCm
So, most likely it’s a memory bandwidth limitation. The CPU is free to do other things if the GPU is used, of course.
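Some back-of-the-envelope arithmetic supports the bandwidth explanation. Below is a minimal sketch assuming a 128-bit LPDDR5X-7500 bus (~120 GB/s theoretical peak) and roughly 9 GB of weights for the 14B model at q4; both figures are my assumptions, not numbers from the issue:

```python
# Rough check: is ~7.5 tok/s consistent with a memory-bandwidth bottleneck?
# All figures below are assumptions, not measurements from the issue.

# LPDDR5X-7500 on a 128-bit bus: 7500 MT/s * 16 bytes per transfer.
peak_bw_gbs = 7500e6 * 16 / 1e9          # ~120 GB/s theoretical peak

model_size_gb = 9.0                      # deepseek-r1:14b q4 weights, approximate
tok_s = 7.5                              # midpoint of the reported 7.1-7.8 tok/s

# Token-by-token decoding reads roughly every weight once per generated token,
# so the bandwidth needed is about model size * tokens per second.
required_bw_gbs = model_size_gb * tok_s  # ~67.5 GB/s

print(f"peak: {peak_bw_gbs:.0f} GB/s, "
      f"required: {required_bw_gbs:.1f} GB/s, "
      f"utilization: {required_bw_gbs / peak_bw_gbs:.0%}")
```

Sustained memory bandwidth on APUs typically lands well below the theoretical peak, so running at roughly half of it is about as fast as decoding can go here. That would explain why CPU, Vulkan, AVX512, and ROCm all converge on the same 7-8 tok/s.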
So, I would not advise using the AMD Ryzen AI 300 series to learn about AI.