Ollama + VS Code: Your Guide to Local LLM Development


July 1, 2025


Running local Large Language Models (LLMs) within VS Code just got a lot easier thanks to Ollama. But why do we want to run a local LLM for development? Great question! Let me explain.

First, there’s privacy: with a local LLM, your queries and data never leave your machine. Speed is another reason; using your own hardware means you aren’t sharing bandwidth (or rate limits) with everyone else in the cloud. On top of privacy and speed, a local LLM works offline, so you don’t have to rely on the internet (no AWS downtime for you!).

Need more? Local LLMs allow for a greater degree of customization; it’s easier to play around with different models to find the perfect fit for your team. And last but certainly not least (especially for leadership and stakeholders), running a local LLM is typically the more cost-efficient option because you avoid the sometimes very steep usage costs of many cloud-based services.

As you can see, local LLMs for development certainly have their perks! In this guide, we’ll show you how to leverage Ollama, a popular tool for local LLM execution, and the Continue VS Code extension to create a powerful coding environment. Follow along, and you’ll be running AI-powered code suggestions in no time!

Installing Ollama and Downloading your Model (Let’s Get the Party Started!)

Installing Ollama is surprisingly straightforward. It’s like building a tiny, helpful robot that lives on your computer (and is also reachable as a local server). Follow the official installation instructions for your operating system.
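Once it’s installed, Ollama runs a small server on your machine (http://localhost:11434 by default). Before wiring anything into VS Code, you can confirm it’s alive with a quick sanity check. Here’s a minimal Python sketch (assuming the default port) that hits the /api/tags endpoint, which lists the models you’ve pulled locally:

# Sanity check: is the local Ollama server running, and which models are installed?
# Assumes Ollama's default port (11434).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

try:
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=5) as resp:
        models = json.load(resp).get("models", [])
    print("Ollama is up. Installed models:")
    for model in models:
        print(" -", model.get("name"))
except OSError as exc:
    print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")

You can also pull a model ahead of time from the terminal with ollama pull llama3.1, but as you’ll see below, the Continue extension can handle that step for you.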

Installing the Continue Extension within VS Code (Our New Best Friend)

Now for the fun part: integrating this power directly into your VS Code environment. That’s where the Continue extension comes in. Think of it as the bridge between Ollama and your VS Code editor, allowing you to summon AI assistance with a few keystrokes.

Installing the extension is just like installing any other VS Code extension. Open the Extensions view, search for “Continue,” and click “Install.”

Installing Ollama and the VS Code Continue Extension for local LLM development

Setting Up Continue Extension (Connecting the Dots)

Alright, with Ollama and the VS Code Continue extension ready, we’re getting close to unleashing the coding superpowers necessary for local LLM development!

Chat

Click the Continue icon that now appears in your sidebar to bring up the Chat interface. This is where you can have a conversation with the LLM, giving it context from files or projects as needed.

From here, click “Select model” -> “Add Chat model.”

VS Code Continue extension to add chat file to give local LLM context

Choose “Ollama” as the Provider. Then, we are going to select “Llama3.1 Chat” as the Model (or whichever model you want). Finally, press “Connect.”

Adding a chat model for the local LLM within Ollama using the VS Code Continue extension

The next step is to enter something into the Chat box. I’m going to enter “using python check if number is prime, do not explain the code.” Continue will then prompt you that the model needs to be installed, so click “Install Model.”

Install Chat Model

Now that the model is installed, the LLM will respond to our query when you press Send in the Chat. In the screenshot below, you can see the spike in my GPU utilization.

Spike in GPU usage when the chat model is run in VS Code for LLM development with Ollama
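That GPU spike makes sense: under the hood, Continue is simply sending your prompt to Ollama’s local HTTP API, and your own hardware does the inference. If you ever want to reproduce a request outside of VS Code, here’s a rough Python sketch (assuming the default port and the llama3.1 model we just installed):

# Rough equivalent of the chat request Continue sends to Ollama.
# Assumes the default port and that llama3.1 has already been installed.
import json
import urllib.request

payload = {
    "model": "llama3.1",
    "messages": [
        {"role": "user", "content": "using python check if number is prime, do not explain the code"}
    ],
    "stream": False,  # ask for one complete response instead of a token stream
}

request = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as resp:
    reply = json.load(resp)

print(reply["message"]["content"])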

Agent Mode

With Continue + Ollama, we also have Agent mode (depending on the model), which lets the local LLM take actions like creating files. Let me demonstrate.

From the Chat window, click the “Chat” dropdown and choose “Agent.”

From here, we’re going to ask it to create a Python main.py file that checks if a number is prime. The agent then offers a “Create file” option, which will create the file for us.

Creating a Python file in VS Code with Ollama for LLM dev

Python main.py file
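Your output will vary from model to model, but the generated main.py usually looks something like this sketch (illustrative only, not a verbatim copy of what the agent produced for me):

# main.py - check whether a number is prime.
# Illustrative sketch of the kind of file the agent generates; your model's output will differ.

def is_prime(n: int) -> bool:
    """Return True if n is prime."""
    if n < 2:
        return False
    # Only need to test divisors up to the square root of n.
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            return False
    return True


if __name__ == "__main__":
    number = int(input("Enter a number: "))
    print(f"{number} is prime" if is_prime(number) else f"{number} is not prime")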

Let’s pretend for a second that we don’t know what this code is doing. Highlight the code and press Ctrl + i, and ask it to explain what the code does. Here’s the result:

Explaining what the code does with the local LLM, thanks to VS Code Continue and Ollama

As you can see, it added some comments for our code, which we can choose to either reject or accept.

Pretty simple. Let’s try something a little bigger now:

Can we modify this in order to run indefinitely? Write to a text file for all found primes, and write to a separate file to show the last number we checked. I want this to be able to be stopped and started so it picks back up where we left off.

Not bad! Now, I’ll never suggest using AI for everything. It’s a powerful tool, but like any tool, it works best when paired with thoughtful human judgment. We should always strive to challenge ourselves, understand what’s happening under the hood, and avoid becoming overly reliant on automation, especially in our field.

That said, experimenting with local LLMs like this is a great way to explore their strengths and limitations firsthand. And like any new setup, we ran into a few quirks and edge cases along the way. Here are some of the issues we had to work through to get things running smoothly:

  • The agent not accounting for the text files not existing yet.
  • The agent creating the text files manually (and in the wrong folder).
  • The agent applying code to a text file instead of main.py.
  • Me copying the same Permission denied: '/last_checked.txt' error into the chat with no result, wondering what was going on until the ol’ hamster wheel finally spun up and I opened the terminal as Administrator.
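After working through those quirks, the script we ended up with behaved roughly like the sketch below. To be clear, this is a hedged reconstruction of the final behavior, not the model’s verbatim output; the primes.txt file name is my own choice, while last_checked.txt matches the file from the Permission denied error above. In the sketch I also keep the state files next to the script, which is another way to sidestep that permission error instead of running the terminal as Administrator.

# Resumable prime finder - a hedged sketch of the behavior we asked the agent for,
# not its verbatim output. primes.txt (assumed name) collects every prime found;
# last_checked.txt records the last number examined so the script can resume.
from pathlib import Path

# Keep the state files next to the script instead of the drive root,
# which sidesteps the Permission denied issue.
BASE_DIR = Path(__file__).resolve().parent
PRIMES_FILE = BASE_DIR / "primes.txt"
LAST_CHECKED_FILE = BASE_DIR / "last_checked.txt"


def is_prime(n: int) -> bool:
    # Same primality test as in main.py above.
    if n < 2:
        return False
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            return False
    return True


def load_last_checked() -> int:
    # Start from 1 on the very first run (the case the agent originally missed).
    if LAST_CHECKED_FILE.exists():
        return int(LAST_CHECKED_FILE.read_text().strip() or "1")
    return 1


def main() -> None:
    current = load_last_checked() + 1
    try:
        while True:  # run until interrupted with Ctrl+C
            if is_prime(current):
                with PRIMES_FILE.open("a") as primes_file:
                    primes_file.write(f"{current}\n")
            LAST_CHECKED_FILE.write_text(str(current))
            current += 1
    except KeyboardInterrupt:
        print(f"Stopped at {current}; run the script again to pick up where we left off.")


if __name__ == "__main__":
    main()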

Wrapping It All Up

With Ollama and the VS Code Continue extension, integrating a local LLM into your development workflow is surprisingly approachable… and powerful. You get all the benefits of AI-assisted development without sending your data to the cloud or racking up usage-based costs.

Sure, it may take a bit of trial and error to get things running just right, but once you’re up and going, it’s a game-changer. You’ll enjoy fast, private, and customizable AI support, right within the comfort of your favorite IDE.

If you’re looking for a more secure, responsive, and cost-effective way to experiment with LLMs, running them locally with VS Code and Ollama is a great place to start.

Curious What’s Next?

At Keyhole Software, we’ve been actively exploring ways to integrate AI tools like these into real-world software delivery. Whether you’re looking to modernize your development workflow, explore use cases for AI-assisted coding, or just want to bounce around ideas, we’re here to help.

Let’s talk about how we can support your team as you explore what’s possible with local LLMs.
