I’m using LM Studio with the llama.cpp Vulkan runtime to run LLMs on my FW13 with 32GB of RAM.
As long as the model uses less than half of the RAM (16GB), everything works fine, but when a model exceeds that, it starts hitting swap (visible as maxed-out disk activity) and then fails.
It feels to me like there is a software limit that prevents the Radeon 760M from requesting more than half of system RAM as VRAM. Is there a way to unlock this limit and increase the VRAM allocation?
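In case it’s relevant, here is a minimal sketch of the knobs I understand to control this, assuming Linux with the amdgpu driver (the half-of-RAM ceiling would match the amdgpu GTT default); the paths, card index, and values below are assumptions I haven’t verified on the 760M:

```
# Sketch, assuming Linux + amdgpu; card0 may be card1 on some systems.
# Check the GTT (shared GPU memory) ceiling the driver currently reports:
cat /sys/class/drm/card0/device/mem_info_gtt_total   # bytes
cat /sys/class/drm/card0/device/mem_info_vram_total  # bytes (BIOS-carved VRAM)

# The GTT ceiling can reportedly be raised with kernel boot parameters,
# e.g. appended to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:
#   amdgpu.gttsize=24576      # GTT size in MiB (here ~24 GiB)
#   ttm.pages_limit=6291456   # TTM limit in 4 KiB pages (6291456 * 4 KiB = 24 GiB)
# then: sudo update-grub && reboot
```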
The reason I’m trying a larger model is that Qwen3 30B A3B is a MoE model with only 3B active parameters, so it should theoretically be much faster than a dense 14B model. E.g. on my desktop the MoE model achieves 80 tokens/s. I suspect that if I can load it into the full 32GB of RAM, I can boost the speed and use a bigger model.
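Back-of-envelope for why I expect the speedup, assuming decode is memory-bandwidth bound; the ~80 GB/s bandwidth and ~0.6 bytes/param (Q4-ish quant) figures are rough assumptions, not measurements:

```
# Decode ceiling ~= bandwidth / (active_params * bytes_per_param).
awk 'BEGIN {
  bw = 80e9;                                   # bytes/s, assumed
  printf "MoE, 3B active: ~%.0f tok/s\n", bw / (3e9  * 0.6);
  printf "dense 14B:      ~%.0f tok/s\n", bw / (14e9 * 0.6);
}'
```

The absolute numbers will be off, but the ratio (3B active vs 14B dense weights read per token) is the point, and it only holds if the whole model stays resident in RAM instead of swapping.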