How to Install Ollama on Mac, Windows, and Linux
Step-by-step guide to installing Ollama on macOS, Windows, or Linux and running your first AI model locally in under five minutes — no GPU required.
Ollama is one of the easiest ways to run AI models on your own computer. This guide covers installation on all major platforms.
Prerequisites
Before you start, make sure you have:
- 8 GB RAM minimum (16 GB recommended)
- 10 GB free disk space
- A modern operating system (macOS 12+, Windows 10+, or Linux)
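If you are unsure whether your machine meets these requirements, you can check from a quick Python script. This is a sketch, not an official sizing tool: the 0.6 GB-per-billion-parameters figure is a rough rule of thumb for 4-bit quantized models, and the helper names are my own.

```python
import os
import shutil

def system_ram_gb():
    """Total physical RAM in GB, via POSIX sysconf (Linux/macOS)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9

def free_disk_gb(path="~"):
    """Free disk space in GB at the given path."""
    return shutil.disk_usage(os.path.expanduser(path)).free / 1e9

def approx_model_ram_gb(params_billion):
    """Very rough RAM estimate for a 4-bit quantized model:
    ~0.6 GB per billion parameters plus ~1 GB of overhead.
    (Rule of thumb, not an official Ollama figure.)"""
    return params_billion * 0.6 + 1.0

print(f"RAM: {system_ram_gb():.1f} GB, free disk: {free_disk_gb():.1f} GB")
print(f"A 7B model needs roughly {approx_model_ram_gb(7):.1f} GB of RAM")
```

By this estimate a 7B model wants around 5 GB of RAM, which is why 8 GB is the practical minimum.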
Installation
macOS
The easiest way to install Ollama on Mac:
- Download Ollama from ollama.com/download
- Open the downloaded .zip file
- Drag Ollama to your Applications folder
- Launch Ollama
Or via Homebrew:
brew install ollama

Linux
Install with one command:
curl -fsSL https://ollama.com/install.sh | sh

For specific distributions, check the official docs.
Windows
- Download Ollama from ollama.com/download
- Run the installer
- Follow the setup wizard
Running Your First Model
Once Ollama is installed, open your terminal and run:
ollama run llama3.2

This will download the default Llama 3.2 model (3B parameters, roughly 2 GB) and start a chat session. The first run takes a few minutes while the model downloads.
Popular Models to Try
Here are some great models to start with, ordered by size:
Small Models (4-8 GB RAM)
# Great for basic tasks
ollama run llama3.2:3b
# Excellent for coding
ollama run qwen2.5-coder:7b
# Best reasoning at this size
ollama run deepseek-r1:8b

Medium Models (16 GB RAM)
# Great for multilingual
ollama run qwen2.5:14b

Large Models (48 GB+ RAM)
# Best all-rounder
ollama run llama3.3:70b

Using the API Server
Ollama automatically starts a local API server at http://localhost:11434. Its native API lives under /api, and it also exposes OpenAI-compatible endpoints under /v1, so most OpenAI client libraries work if you just change the base URL. A chat request looks like this:
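From code, you can hit the native chat endpoint with any HTTP client. Here is a minimal Python sketch using only the standard library; it assumes the default server address, that you have already pulled llama3.2, and the helper names (`build_chat_payload`, `chat`) are my own, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default native chat endpoint

def build_chat_payload(model, prompt):
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a stream
    }

def chat(model, prompt):
    """Send one user message and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# chat("llama3.2", "Hello, how are you?")  # requires the server to be running
```

The same request with curl: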
curl http://localhost:11434/api/chat -d '{
"model": "llama3.2",
"messages": [
{ "role": "user", "content": "Hello, how are you?" }
]
}'

Common Issues
"Out of Memory" Error
If you get an OOM error:
- Try a smaller model (e.g., llama3.2:3b instead of llama3.2)
- Close other applications to free up RAM
- Check our models for 8GB RAM guide
Slow Inference
If responses are slow:
- Make sure GPU acceleration is enabled
- On Mac: Activity Monitor → check if "GPU" is being used
- On Linux: ensure NVIDIA drivers are up to date
Managing Models
# List downloaded models
ollama list
# Delete a model
ollama rm llama3.2
# Update a model
ollama pull llama3.2

What's Next?
- Install LM Studio for a GUI alternative
- Compare Ollama vs LM Studio
- Deploy Ollama on Runpod for cloud GPU access