[TRACKING] Request: verify dGPU support

The key is to let the dGPU store the KV cache, which could greatly accelerate inference.

Without help from a dGPU, the iGPU's performance is similar to an M4 Pro; you can check my friend's test: Strix Halo (395) Local LLM Test | David Huang's Blog

Our experiment aims to build an affordable and usable solution for deploying llama4-400b, qwen3-235b and ds-671b without heavy quantization (to avoid degrading quality), for individuals or small teams.
Although the W7900 is expensive, it is much cheaper than a compute GPU, and its large memory helps support long contexts.
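
To give a rough idea of why the dGPU's memory size matters for context length, here is a back-of-the-envelope sketch of KV cache size for a standard attention layout (one K and one V tensor per layer). The model dimensions in the example are hypothetical placeholders, not the real configs of the models above, and models that compress their KV cache will need less than this estimate.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   bytes_per_elem=2, batch_size=1):
    """Estimate KV cache size in bytes for a plain MHA/GQA layout.

    The leading 2 accounts for storing both K and V in every layer.
    bytes_per_elem=2 assumes an fp16/bf16 cache (an assumption, not
    something measured on real hardware).
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * context_len * bytes_per_elem * batch_size)


# Hypothetical model: 64 layers, 8 KV heads, head_dim 128, 128K context.
size = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128,
                      context_len=131072)
print(f"{size / 1024**3:.1f} GiB")
```

With these placeholder numbers the cache alone comes to 32 GiB, so whether a long context fits on the dGPU depends mostly on the card's memory.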

Once I get my Framework Desktop, I want to build a cluster to test DeepSeek-671b.
