Local AI Hub
How to Install Ollama on Mac, Windows, and Linux
2026/04/01
Beginner · 10 min

Step-by-step guide to installing Ollama on macOS, Windows, or Linux and running your first AI model locally in under five minutes — no GPU required.

Ollama is one of the simplest ways to run AI models on your own computer. This guide covers installation on all major platforms.

Prerequisites

Before you start, make sure you have:

  • 8 GB RAM minimum (16 GB recommended)
  • 10 GB free disk space
  • A modern operating system (macOS 12+, Windows 10+, or Linux)
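
Not sure what your machine has? Here is a quick terminal check, sketched for Linux using /proc/meminfo and df (on macOS, use "sysctl -n hw.memsize" and "df -h" instead):

```shell
# Rough prerequisite check on Linux (illustrative, not an official Ollama script).
mem_gb=$(( $(grep MemTotal /proc/meminfo | awk '{print $2}') / 1024 / 1024 ))
disk_gb=$(df --output=avail -BG . | tail -1 | tr -dc '0-9')
echo "RAM: ${mem_gb} GB, free disk: ${disk_gb} GB"
if [ "$mem_gb" -ge 8 ]; then
    echo "RAM meets the 8 GB minimum"
else
    echo "Below the 8 GB minimum - stick to the smallest models"
fi
```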

Installation

macOS

The easiest way to install Ollama on Mac:

  1. Download Ollama from ollama.com/download
  2. Open the downloaded .zip file
  3. Drag Ollama to your Applications folder
  4. Launch Ollama

Or via Homebrew (the formula installs the CLI only; start the background server with "brew services start ollama"):

brew install ollama

Linux

Install with one command:

curl -fsSL https://ollama.com/install.sh | sh

For specific distributions, check the official docs.
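
Once the script finishes, a quick sanity check (a hypothetical snippet; "ollama" is the systemd service name the official installer registers on Linux):

```shell
# Confirm the binary is on PATH and the background service is running.
if command -v ollama >/dev/null 2>&1; then
    ollama --version || true                          # prints the installed version
    systemctl is-active ollama 2>/dev/null || true    # "active" if the service is up
    status="installed"
else
    status="not installed"
fi
echo "ollama: $status"
```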

Windows

  1. Download Ollama from ollama.com/download
  2. Run the installer
  3. Follow the setup wizard

Running Your First Model

Once Ollama is installed, open your terminal and run:

ollama run llama3.2

This will download the Llama 3.2 3B model (~2 GB) and start a chat session. The first run takes a few minutes while the model downloads.
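
The interactive chat isn't the only mode: you can also pass a prompt as an argument for a one-shot answer (inside a chat, /bye exits and /? lists the built-in commands). A small sketch, guarded so it only does anything if Ollama is installed:

```shell
# One-shot prompt: prints the answer and exits instead of opening a chat.
if command -v ollama >/dev/null 2>&1; then
    ollama run llama3.2 "Explain what a token is in one sentence."
    mode="one-shot"
else
    echo "Ollama not installed yet - see the steps above."
    mode="skipped"
fi
```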

Popular Models to Try

Here are some great models to start with, ordered by size:

Small Models (4-8 GB RAM)

# Great for basic tasks
ollama run llama3.2:3b

# Excellent for coding
ollama run qwen2.5-coder:7b

# Best reasoning at this size
ollama run deepseek-r1:8b

Medium Models (16 GB RAM)

# Great for multilingual
ollama run qwen2.5:14b

Large Models (48 GB+ RAM)

# Best all-rounder, if you have the memory for it
ollama run llama3.3:70b

Using the API Server

Ollama automatically starts a local API server at http://localhost:11434. The example below uses Ollama's native /api/chat endpoint; an OpenAI-compatible endpoint is also available under /v1 (e.g. /v1/chat/completions), so most OpenAI client libraries work if you point their base URL at http://localhost:11434/v1:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}'
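
Note that the native endpoint streams its reply as newline-delimited JSON chunks by default; set "stream": false for a single JSON object. A sketch showing both the non-streaming native call and the OpenAI-style path (assumes a server is running locally):

```shell
# Single (non-streaming) response from the native API, plus the OpenAI-style path.
payload='{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}], "stream": false}'
if curl -s --max-time 2 http://localhost:11434 >/dev/null 2>&1; then
    curl -s http://localhost:11434/api/chat -d "$payload"
    # OpenAI-compatible equivalent:
    curl -s http://localhost:11434/v1/chat/completions \
         -H "Content-Type: application/json" -d "$payload"
else
    echo "No Ollama server on localhost:11434 - start one with: ollama serve"
fi
```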

Common Issues

"Out of Memory" Error

If you get an OOM error:

  1. Try a smaller model (e.g., llama3.2:3b instead of llama3.2)
  2. Close other applications to free up RAM
  3. Check our models for 8GB RAM guide

Slow Inference

If responses are slow:

  1. Make sure GPU acceleration is enabled
  2. On Mac: Activity Monitor → check if "GPU" is being used
  3. On Linux: ensure NVIDIA drivers are up to date
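
A quick way to tell where a loaded model is actually running: "ollama ps" reports a PROCESSOR column such as "100% GPU" or "100% CPU". The nvidia-smi check below assumes an NVIDIA card on Linux:

```shell
# Where is the model running? (guarded so it is safe to paste anywhere)
if command -v ollama >/dev/null 2>&1; then
    ollama ps || true    # PROCESSOR column shows the GPU vs CPU split
fi
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version,memory.used --format=csv
    gpu_driver="present"
else
    gpu_driver="absent"
fi
echo "NVIDIA driver: $gpu_driver"
```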

Managing Models

# List downloaded models
ollama list

# Delete a model
ollama rm llama3.2

# Update a model
ollama pull llama3.2

What's Next?

  • Install LM Studio for a GUI alternative
  • Compare Ollama vs LM Studio
  • Deploy Ollama on Runpod for cloud GPU access

