Anyone know how the VRAM allocation will work? Is it BIOS controlled? Or can it be controlled somehow at runtime (in Linux)?
I ask because, for machine learning work, I’m curious about the performance of the integrated GPU with a lot of VRAM allocated to it (> 70 GB; I think the AMD Framework 13 can support up to 96 GB of DDR5).
The RAM should be dynamically allocated as needed between the CPU and iGPU (I think it might be limited to allocating at most half of system RAM to the iGPU, but I’m not sure about that).
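If you want to check the split on a given machine, the amdgpu driver exposes both pools through sysfs. A quick sketch (assuming the iGPU is `card0` — the card number may differ if other GPUs are present):

```shell
# Print the dedicated (VRAM) and dynamically shared (GTT) pool sizes in MiB.
# Paths assume the iGPU is card0; adjust the card number if needed.
for f in mem_info_vram_total mem_info_gtt_total; do
  path=/sys/class/drm/card0/device/$f
  if [ -r "$path" ]; then
    echo "$f: $(( $(cat "$path") / 1024 / 1024 )) MiB"
  fi
done
```

The GTT pool (`mem_info_gtt_total`) is the dynamically shared part; VRAM is the dedicated carve-out from the BIOS.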
Some portion can be dedicated to the iGPU because some programs can’t handle dynamically shared VRAM. However, most of those programs don’t need much dedicated VRAM, so a lot of laptops don’t allow more than 256–512 MB to be dedicated to the iGPU in the BIOS.
I am aware — I dabble in LocalLLaMA, which is what prompted me to make this post. More VRAM will at least let me load a 70B model (although with some quantizations I can already load one on my 24 GB desktop GPU today), and in theory it lets me test even larger models, even if the tokens/second are a lot slower. My hope was that it’d at least be faster than using the CPU + RAM alone, if this were possible.
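For rough numbers (my own back-of-the-envelope, ignoring KV cache, activations, and runtime overhead): weights take about 2 bytes each at fp16, 1 byte at 8-bit, and about 0.5 bytes at 4-bit quantization, so for a 70B model:

```shell
# Back-of-the-envelope weight memory for a 70B-parameter model.
# Ignores KV cache, activations, and runtime overhead.
awk 'BEGIN {
  params = 70e9
  printf "fp16: ~%.0f GB\n", params * 2   / 1e9
  printf "int8: ~%.0f GB\n", params * 1   / 1e9
  printf "q4:   ~%.0f GB\n", params * 0.5 / 1e9
}'
```

So fp16 is out of reach even at 96 GB, but 8-bit (~70 GB) or 4-bit (~35 GB) would fit in a large shared pool.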
There actually are two settings, but they’re not clearly labeled.
I’m not at my Framework laptop right now, but there was something along the lines of UMA_AUTO set as the default, with the alternative option of UMA_GAME_OPTIMIZED. Auto dedicates 512 MB to VRAM (out of the 64 GB I have installed); Game Optimized dedicates 4 GB out of 64 GB. Not sure if there are other changes associated with this option beyond the VRAM, though.
I’ve been wondering the same thing. Would be nice to have that support in the bios.
It seems like it’s true that it will dynamically allocate RAM for the GPU as needed (or maybe up to half of system RAM?), but a lot of the tools query the available VRAM up front and fail.
On the plus side, enough people are interested in APUs that they’re working on the shared RAM issues. I saw this the other day:
The code is tiny, and I saw one report that it worked on a 6800HS, but it’s definitely a bit of a hack, and specifically for PyTorch.
I wouldn’t want to count on Framework adding these options to the BIOS, so we’ll probably have to cross our fingers that shared-RAM support picks up more code soon.
I’m still unsure whether the max dynamic allocation is half of system RAM everywhere, or whether that also varies between individual laptops, and I already feel like I’m making a lot of tradeoffs to support Framework, haha.
Max dynamic allocation is half, which IIRC is a limitation of either the drivers or the OS (I can’t remember which). I’ve seen people mention there’s a workaround to allow more, although I haven’t found it.
Edit: Here’s some discussion about this (including how to override it on Linux).
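For reference, the workaround people usually mention is raising the GTT cap via kernel boot parameters. A sketch with my own example values (assuming 128 GB of RAM and a target of ~96 GB shared — check the amdgpu module parameter docs for your kernel before trying this):

```shell
# Hypothetical example: raise the shared (GTT) pool to ~96 GiB, overriding
# the default half-of-RAM cap. amdgpu.gttsize is in MiB; ttm.pages_limit is
# in 4 KiB pages. Add these to GRUB_CMDLINE_LINUX_DEFAULT in
# /etc/default/grub, then regenerate the grub config and reboot.
#   96 GiB = 96 * 1024        = 98304 MiB
#   96 GiB / 4 KiB pages      = 25165824 pages
echo 'amdgpu.gttsize=98304 ttm.pages_limit=25165824'
```

Note this only changes what the driver will hand out; you still need enough free system RAM, and programs that only look at the dedicated VRAM pool may still refuse to use it.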