I’m on the waitlist (batch 10) for the 128GB Framework Desktop, but I’m trying to compare it realistically with other options - particularly the Mac Studio M3 Ultra (96GB).
I’ve looked at benchmarks and charts, but what I really need is a sense of how these machines compare in real-world usage. My main use cases are:
Generative AI (LLMs)
Transcription (Whisper)
General business use
Current setup: I’m juggling a MacBook Pro and a small Lenovo Windows PC. I recently sold my custom AI workstation, expecting to move to the Framework Desktop, but now I’m wondering if the Mac Studio might be the more ‘seamless’ option, since it’d save me switching between Windows and macOS.
The issue: The Mac Studio M3 Ultra (28/60, 96GB) costs around 70% more. Is the Framework Desktop’s performance-to-price ratio such a strong deal that it outweighs the Mac Studio’s ease of use?
Compactness is a key factor for me, and I need the system to run the local LLM due to data protection - cloud is not an option here.
Would love to hear thoughts from anyone who’s used either (or both) for similar workloads. Maybe I’m even overdoing it on the RAM side of things - I’ve always just been under the impression that more RAM is always better.
On paper, the M3 Ultra has over 3 times the memory bandwidth, at 819 GB/s versus 256 GB/s. In real terms, it’s 1.5-2 times faster at LLM inference according to the video below.
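To make the gap between the paper ratio and the real-world figure easier to reason about, here is a rough sketch of the arithmetic. It assumes token generation is purely memory-bandwidth-bound and that every weight is read once per token, which ignores compute, caches, and software overhead, so treat the result as a ceiling rather than a prediction. The 4 GB model size is a hypothetical round number, not a measured file size.

```python
# Rough, bandwidth-only model of token generation speed.
# Assumption: each generated token streams all model weights from memory once,
# so tokens/s can never exceed bandwidth / weight size. Real numbers are lower.

def bandwidth_ratio(fast_gb_s: float, slow_gb_s: float) -> float:
    """How many times more memory bandwidth the faster machine has."""
    return fast_gb_s / slow_gb_s


def decode_ceiling_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s for a dense model under the assumption above."""
    return bandwidth_gb_s / weights_gb


M3_ULTRA_GB_S = 819.0      # figure quoted in the post
STRIX_HALO_GB_S = 256.0    # figure quoted in the post
EXAMPLE_WEIGHTS_GB = 4.0   # hypothetical model size, for illustration only

print(f"bandwidth ratio: {bandwidth_ratio(M3_ULTRA_GB_S, STRIX_HALO_GB_S):.2f}x")
for name, bw in (("M3 Ultra", M3_ULTRA_GB_S), ("Framework Desktop", STRIX_HALO_GB_S)):
    ceiling = decode_ceiling_tokens_per_s(bw, EXAMPLE_WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s for a {EXAMPLE_WEIGHTS_GB:.0f} GB model")
```

The observed 1.5-2x gap is smaller than the roughly 3.2x bandwidth ratio, which suggests neither machine sits exactly at its bandwidth ceiling in practice.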
For general use the Framework Desktop is fantastic; you can take a look at Geekbench numbers. Or for more comprehensive testing: https://www.phoronix.com/review/ryzen-ai-max-395-9950x-9950x3d/11 - basically it’s neck and neck w/ a desktop 9950X, at lower power to boot - really impressive.
Whisper performance on RDNA3 is pretty disappointing. It’ll work, but about on par w/ an RTX 3050 in my testing. I have no idea how a Mac compares for Whisper, however.
For LLMs, comparing the llama2-7b q4_0 llama.cpp benchmarks for Mac against kyuz0’s Strix Halo tests, tg128 (token generation) is about 2X faster on an M3 Ultra, but pp512 (prompt processing) is actually almost exactly the same. You might get some more benefits from running MLX on the Mac.
Personally, if it were me, I’d keep my preorder, but also keep an eye out for whether a high-memory Mac Mini gets released in Oct/Nov (GPU compute on the Mac has traditionally been very underwhelming, but the M5 finally introduces tensor cores), and also for whether the Nvidia DGX Spark actually ships soon as well…
(More memory is definitely better and a good spot to be able to run quants of all the 100B-parameter class MoE models popping up recently.)
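As a back-of-the-envelope check on the memory point, the sketch below estimates how much room the weights of a quantized model take. It assumes weight size is simply parameters times bits per weight, and the `reserved_gb` allowance for the OS, KV cache, and runtime is a hypothetical placeholder, not a measured value. Neither machine exposes all of its memory to the GPU, so real headroom is smaller.

```python
# Rough weight-size estimate for a quantized model.
# Assumption: size = parameters * bits per weight / 8; real quant formats add
# some per-block overhead, and context (KV cache) needs extra memory on top.

def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits(n_params: float, bits_per_weight: float,
         total_memory_gb: float, reserved_gb: float) -> bool:
    """True if the weights fit after setting aside reserved_gb (hypothetical)."""
    return weights_gb(n_params, bits_per_weight) <= total_memory_gb - reserved_gb


N_PARAMS = 100e9     # a 100B-parameter class model
RESERVED_GB = 16.0   # hypothetical allowance for OS, KV cache, and runtime

for bits in (4, 8):
    size = weights_gb(N_PARAMS, bits)
    for machine, mem in (("Mac Studio 96GB", 96.0), ("Framework Desktop 128GB", 128.0)):
        verdict = "fits" if fits(N_PARAMS, bits, mem, RESERVED_GB) else "does not fit"
        print(f"{bits}-bit, ~{size:.0f} GB of weights: {verdict} on {machine}")
```

Under these assumptions a 4-bit quant of a 100B model fits on either machine, while an 8-bit quant only fits in the 128GB configuration.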