AFAIK it’s 96GB in Windows, 110GB in Linux.
With the correct API there is no hard limit on Linux (well, only the size of the RAM: 128GB minus the RAM needed by other active programs…)
This is the case for other AMD APUs too (when the driver does not crash…)
8 or 4GB should be enough for the OS, which leaves 120/124GB for the LLM tensors.
For the AI/ML use case, maybe report results from https://www.localscore.ai/ with different models.
- 3 configs are interesting: CPU, sgemm GPU, and a full rebuild with HIP.
- Q6_K is fast and has good quality (Q4_K_M is good for benchmarks but has too many hallucinations for me…)
- bartowski (Bartowski) has many pre-quantized models; maybe pick a selection of different model sizes, from Llama 3B up to Mistral Large 123B…
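For the build configs above, a sketch of the llama.cpp CMake invocations (flag names follow current llama.cpp; the `gfx1151` target for a 40 CU Strix Halo iGPU is my assumption, check yours with `rocminfo`):

```shell
# Plain CPU build (the sgemm/tinyBLAS fast path is part of the CPU backend)
cmake -B build-cpu
cmake --build build-cpu --config Release -j

# Full HIP rebuild for the iGPU
# AMDGPU_TARGETS=gfx1151 is an assumption for this APU; adjust to your ROCm install
cmake -B build-hip -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1151
cmake --build build-hip --config Release -j
```

Keeping two separate build directories makes it easy to bench both backends against the same model files.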
There is bf16 CPU perf with llama.cpp too, which may be interesting to look at.
If needed, I can give some more specific commands to run for the different cases.
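As an example of the kind of commands I mean, using `llama-bench` (the standard llama.cpp benchmark tool; the model path and thread count are placeholders to adapt):

```shell
# CPU-only run (hypothetical model path; -t = number of threads)
./build-cpu/bin/llama-bench -m models/Llama-3-8B.Q6_K.gguf -t 16

# HIP build with all layers offloaded to the iGPU (-ngl 99 = offload everything)
./build-hip/bin/llama-bench -m models/Llama-3-8B.Q6_K.gguf -ngl 99
```

`llama-bench` reports prompt-processing and token-generation speed in the same table, which makes the CPU vs iGPU comparison easy.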
I don't know if I can finish the FP8 backend of llama.cpp, but if you have time: I am working on a special iGPU backend for llama.cpp. For now only FP16/BF16 is supported, and it is optimised for the Ryzen 7940HS iGPU, which has 12 CUs… I don't know what the best/correct config is for this 40 CU part… (GitHub - Djip007/llama.cpp at feature/igpu). It may also need the rocm-6.4 from fedora-43 (beta)…
And for the "cluster", some benchmarks with a big MoE like Llama 4 Maverick (if possible…) or the smaller Mixtral 8x22B (in bf16 quant?)