Framework Desktop Deep Dive: Ryzen AI Max

We dedicated a lot of our Framework Desktop launch presentation to the Ryzen AI Max processor it uses, and for good reason. These truly unique, ultra-high-performance parts are the culmination of decades of technology and architecture investments that AMD has made, going all the way back to their acquisition of ATI in 2006. For our first technical deep dive on Framework Desktop, we’re going to go even deeper into Ryzen AI Max and what makes it a killer processor for gaming, workstation, and AI workloads.

What makes Ryzen AI Max special is the combination of three elements: full desktop-class Zen 5 CPU cores, a massive 40-CU Radeon RDNA 3.5 GPU, and a giant 256-bit LPDDR5x memory bus to feed the two, supporting up to 128GB of memory. Chips and Cheese did an excellent technical overview of the processor with AMD that goes even deeper on this, and we’ll pull out some of the highlights along with our own insights.

We’ll start with the CPUs. Ryzen AI Max supports up to 16 CPU cores split across two 4nm FinFET dies that AMD calls CCDs. These dies are connected using an extremely wide, low-power, low-latency bus across the package substrate. The CPUs are full Zen 5 cores with 512-bit FPUs and support for AVX-512, a vector processing instruction set otherwise only available on Intel’s top-end server CPUs. We’re excited for you to see the multi-core performance numbers these CPUs can deliver in our upcoming press review cycle!
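
If you’re curious whether a given machine actually exposes AVX-512, here is a minimal illustrative check, not Framework-provided code, using GCC/Clang’s x86 CPUID builtins (the feature names queried are standard CPUID flag names):

```c
/* avx512_check.c — hypothetical sketch for checking AVX-512 support.
 * Build: gcc -O2 avx512_check.c -o avx512_check (x86 GCC/Clang only) */
#include <stdio.h>

int main(void) {
    __builtin_cpu_init();  /* populate the runtime CPU feature cache */
    printf("AVX-512F  (foundation):    %s\n",
           __builtin_cpu_supports("avx512f")  ? "yes" : "no");
    printf("AVX-512BW (byte/word):     %s\n",
           __builtin_cpu_supports("avx512bw") ? "yes" : "no");
    printf("AVX-512VL (vector length): %s\n",
           __builtin_cpu_supports("avx512vl") ? "yes" : "no");
    return 0;
}
```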

The GPU in Ryzen AI Max is discrete-class, with 40 RDNA 3.5 Compute Units in the Radeon 8060S configuration. For reference, the discrete Radeon 7700S GPU in Framework Laptop 16 has 32 RDNA 3 CUs. The GPU sits on its own, even larger 4nm FinFET die, separate from the CPU CCDs. This die also carries the large NPU, the video encode/decode blocks, 32MB of additional MALL cache, and the memory and peripheral interfaces. The GPU handles essentially all current PC titles well at 1080p with high graphics settings, and most at 1440p as well.

To feed a GPU of this class, the processor needs a ton of memory bandwidth. Mobile and desktop processors like the Ryzen AI 300 Series used in Framework Laptop 13 top out at 128-bit memory buses, and Ryzen AI Max doubles that to 256-bit at 8000 MT/s, enabling a massive 256GB/s of bandwidth. That is similar to the throughput that the discrete 7700S GPU achieves. With eight 32-bit memory packages, the processor can support a colossal 128GB of LPDDR5x. On Windows, up to 96GB can be dedicated to the GPU, and we’ve seen even higher numbers on Linux, making Ryzen AI Max excellent for AI workloads. We’ll have a dedicated deep dive on the AI use case soon.
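
As a sanity check on that figure, peak bandwidth is simply bus width in bytes multiplied by the transfer rate. A quick illustrative calculation:

```c
/* bandwidth_calc.c — back-of-the-envelope check of the 256GB/s figure.
 * Peak bandwidth = (bus width in bytes) x (transfer rate in MT/s). */
#include <stdio.h>

int main(void) {
    const long bus_width_bits = 256;   /* Ryzen AI Max memory bus */
    const long rate_mt_s      = 8000;  /* LPDDR5x-8000 */

    long bytes_per_transfer = bus_width_bits / 8;    /* 32 bytes */
    long peak_mb_s = bytes_per_transfer * rate_mt_s; /* 256,000 MB/s */

    printf("%ld-bit bus @ %ld MT/s -> %ld MB/s (%ld GB/s)\n",
           bus_width_bits, rate_mt_s, peak_mb_s, peak_mb_s / 1000);
    return 0;
}
```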

One tradeoff on the memory, though, is that fanning out that giant 256-bit memory bus requires the LPDDR5x to be soldered. When we learned about Ryzen AI Max, our first question for AMD was whether it was possible to use LPCAMM2 to make the memory modular. Instead of immediately saying “No, it’s not possible,” AMD allocated technical architects and engineers to spend days testing out different layouts and running simulations. They ultimately concluded that it was in fact not possible without massively downclocking the memory, which would defeat the purpose of having a wide memory bus and large GPU. We accepted the tradeoff of using soldered memory, and unlike some electronics brands, we aren’t using that as an excuse to charge obscene sums for higher memory capacity.

What makes Ryzen AI Max especially interesting in the Framework Desktop is that we were able to unlock every bit of its power. Because we use a desktop-style 6-heatpipe heatsink from Cooler Master and a 120mm fan, we can run it at its maximum sustained power of 120W along with 140W boost while keeping the system quiet. We were also able to break out 2x USB4, 2x DisplayPort, HDMI, and all three PCIe x4 interfaces: two for M.2 SSDs and one as an x4 PCIe slot. All of this makes it great in the tiny Framework Desktop form factor, but also makes it excellent to drop the Mainboard into any standard Mini-ITX case. This is, after all, a standard PC! It’s just one that uses a one-of-a-kind, monstrous processor from AMD. Pre-orders for Framework Desktop are open now, with new orders shipping in Q3.

I’m excited to see how this performs with models like the new Llama 3.3 and QwQ. I’m growing more and more interested in local AI the more I play with it.

Why don’t they (AMD) call it L4/5 cache [faster], and give us upgradeable memory [relatively slower]?

That’s how ‘memory’ has been dealt with all these years: faster, closer, and smaller in size; or slower, farther, and larger.

(L3 cache was introduced in 2003)

The price would be more than most people could pay…

Consider how expensive Intel Optane was, and consider what cache sizes we’ve seen with AMD’s L3 3D V-Cache (96MB for 1 CCD) and Intel’s Broadwell (128MB L4).

I don’t know if it’s even possible to have 128GB on die/chip and maintain the same bandwidth as current L3/L4.

Memory bandwidth. Apple silicon sees the impressive performance it does because of the bandwidth that is possible on its SoCs. This is not a conspiracy to get rid of socketable RAM. It is simply the nature of the beast for attaining this kind of performance on AI-based workloads.

I probably didn’t describe what I was thinking very well:
Keep L1–L3 cache on-die.
Keep the current on-chip/on-SoC (not on-die) memory, give it faster bandwidth like Apple silicon, and call that L4 cache (or whatever marketing label). This would be the 32GB–512GB non-upgradeable memory.
Have the cache/memory hierarchy also reach out to off-chip/on-board/DIMM memory. These would be the current DDR5 sticks.

I’m butchering the needed complexity here, but the idea is to introduce an additional layer in the memory hierarchy.

Yes, cost is definitely one of the factors to be considered.

Like, over the years, they’ve been slipping in additional layers between L1 cache, on-board memory, and disk storage to get a smooth transition from one layer to another and close the performance gap between layers.
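
To make that layering visible, here’s a minimal pointer-chasing sketch (the buffer sizes and iteration count are arbitrary illustration values, and it assumes a POSIX system): average load latency steps up each time the working set spills out of L1, then L2, then L3, and finally lands in DRAM.

```c
/* latency_probe.c — illustrative sketch of the memory hierarchy.
 * Build: gcc -O2 latency_probe.c -o latency_probe */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ITERS 10000000L

int main(void) {
    size_t sizes[] = { 16u << 10, 256u << 10, 8u << 20, 256u << 20 };
    for (int s = 0; s < 4; s++) {
        size_t n = sizes[s] / sizeof(size_t);
        size_t *buf = malloc(n * sizeof(size_t));
        if (!buf) return 1;

        /* Sattolo's algorithm: a single-cycle random permutation, so the
         * chase visits every slot and the prefetcher can't predict it. */
        for (size_t i = 0; i < n; i++) buf[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t t = buf[i]; buf[i] = buf[j]; buf[j] = t;
        }

        struct timespec t0, t1;
        size_t idx = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < ITERS; i++) idx = buf[idx]; /* dependent loads */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        /* printing idx keeps the loop from being optimized away */
        printf("%9zu KB: %6.1f ns/load (sink=%zu)\n",
               sizes[s] >> 10, ns / ITERS, idx);
        free(buf);
    }
    return 0;
}
```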

I’m in the market for a small form factor desktop and would love to buy the Framework Desktop, being a happy user of a first-generation Framework 13.

However, I’m not interested in soldered memory. I understand the tradeoff, but that’s just not for me: upgradability and repairability are my priorities.

Does anyone know if Framework is planning to release alternative motherboards that are entirely modular and repairable?

Nobody knows this. But it’s pretty clear that at least this generation won’t see anything like that. Maybe the 2nd-gen version will, depending of course on whether something can provide the same kind of bandwidth in the coming years.

I mean, most standard mainboards are already more modular than the Framework one. And what would you expect from calling a mainboard repairable? All mainboards are repairable if you can do micro-soldering, but you can’t really build mainboards where nothing is soldered.

I am hoping that AMD decides to update this APU with modern specs. Just 16 PCIe lanes, and PCIe 4.0 at that, is obviously a limitation they chose so as not to compete with their other products. Given the current market, I want to see a minimum of 24 PCIe 4.0 lanes (preferably 32 PCIe 5.0). Almost all the other specs are what I consider a home run, except for the kneecapping of the PCIe lane count and generation. I would also like to see a 48 or 56 CU monster iGPU. Call it the 397+, with everything else the same.
I want this so I can have an ITX box that also breaks out a full x16 slot, with an extender to the outside of the case, so I can mount an AI card or GPU on the outside, directly connected over PCIe. The beefier GPU would be nice too, along with a slightly more powerful NPU.
This would be my dream box, fitting in a Framework Desktop ITX case.
I would be very, very happy.
BUT, I am already very happy with this APU and the Framework design.
(Still, AMD: at least more PCIe lanes, please!!!)

Thank you for this. This is what we dreamed of when the APU was first introduced, and it has finally come to be.

Please consider adding either OCuLink or something similar to the bottom of future 13" motherboards, to enable docking with a dGPU. I don’t really do much that is CPU-intensive anymore, so the ability to dock my laptop when I’m home, to edit/encode videos faster and to game, would let my laptop be the only computer I need. More modularity and upgradability = less overconsumption and e-waste… which seems to be your mission :grinning_face:
If you made a GPU dock to go with it, that would be brilliant!

I do realize that would require a new bottom shell for existing customers who would like access to such a port on future motherboards, but it’s definitely worth it in my opinion.

While I am not sure the bottom is the best place, I agree: please expose the 8 unused PCIe lanes somewhere (and somehow; it doesn’t necessarily need to be OCuLink) in the future.

I was thinking maybe it is possible to do something like this with one of the expansion card slots:

[image: expansion card mockup]

I am pretty sure there is enough space to break out at least 4 lanes, if not 8, beside the USB-C connector.

That would let the new board still fit the old case; a new case would only need some extra holes. It’s also probably somewhat cheaper to have it all on the expansion card than to put the redriver on the board for people who don’t want to use it.