I currently use GitHub Copilot integrated into VS Code for the majority of my coding-related tasks (Claude Sonnet and GPT Codex are my most-used models at the moment).
I want to supplement this setup with locally running, coding oriented LLMs which I can integrate with the Claude Code VS Code plugin.
I want to be able to switch between Copilot and Claude, and in doing so move between cloud and local models as needed.
I am running models with llama.cpp and configuring Claude Code in VS Code to point at the locally running server:
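A minimal sketch of that setup, assuming llama.cpp's `llama-server` binary and Claude Code's `ANTHROPIC_BASE_URL`/`ANTHROPIC_AUTH_TOKEN` environment variable overrides (the model file, port, and token value are placeholders, not from the original post):

```shell
# Start llama.cpp's local server (model path and settings are placeholders)
llama-server -m ./models/qwen2.5-coder-7b-instruct-q4_k_m.gguf \
  --port 8080 --ctx-size 16384

# Claude Code speaks the Anthropic Messages API, so the local endpoint must
# understand that protocol; a translation proxy such as LiteLLM is one way
# to bridge to llama-server's OpenAI-style API if needed.
export ANTHROPIC_BASE_URL="http://localhost:8080"
export ANTHROPIC_AUTH_TOKEN="local-dummy-key"   # any non-empty value

# Unsetting the variables switches back to the cloud models:
unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN
```

With the variables set, launching `claude` from that terminal routes requests to the local server; unsetting them restores the normal cloud setup, which is what makes the cloud/local switching described above practical.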
Hi! I tried testing some models on my Framework 13 using LM Studio. I also tried local autocomplete in VS Code, but it produced almost no suggestions (???). Overall it works, and GPT-OSS really does feel like a chat model, but as a developer it's not clear to me what I'm supposed to do with all of it.
I think the most useful thing would be a tool that, over 12 hours of computation (while I sleep), thoroughly searches for bugs based on a description (I haven't tested anything like that yet).
I’m curious — what kind of tasks are you all using this for?
I have a GitHub Copilot subscription and I use Claude and Codex for most coding tasks. It’s hard to drag myself away from that tooling when the quality of the output is so high.
I have also recently started using OpenSpec, which is brilliant in combination with those tools.
I want to find a use case for the local models, maybe for simpler, targeted refactoring tasks?
It may be possible to use something like Sonnet/Opus with OpenSpec to create the proposal, spec, design, and task list, and then switch to a local model to execute the tasks (use the best models to plan and the cheap models to implement).
I am finding myself using local models integrated into applications to power end-user features more than for development, but I'm confident this can change as the models evolve.
I created a somewhat related thread, and this reply is potentially useful for the coding use case: