Local AI Hub
  • Compare Tools
  • Tutorials
  • Cloud Deploy
  • Blog
Run Open WebUI on Runpod — Cloud ChatGPT in 10 Minutes
2026/04/16
Intermediate · 30 min


Deploy Open WebUI with Ollama on Runpod for a private, ChatGPT-like experience on cloud GPU. Access your AI assistant from any device with a web browser.

Want a ChatGPT-like experience running on your own cloud GPU? Open WebUI on Runpod gives you a beautiful browser interface with full model control, document chat, and multi-user support — all private and self-hosted.

What You'll Build

  • Open WebUI accessible from any browser
  • Powered by Ollama on a cloud GPU
  • Persistent storage for models and conversations
  • Multi-user accounts (optional)

Step 1: Create a Network Volume

  1. Go to Storage → Network Volumes in Runpod
  2. Click Add Network Volume
  3. Size: 50 GB
  4. Data Center: note which one you choose (your GPU pod must be deployed in the same data center)

Step 2: Deploy a GPU Instance with Ollama

  1. Go to GPU Cloud → Deploy
  2. Choose a GPU (RTX 4090 recommended for best value)
  3. Select the same data center as your volume
  4. Use the Ollama community template
  5. Attach your network volume at /workspace
  6. Deploy and wait for it to start

Step 3: Connect and Prepare Ollama

Connect via HTTP Proxy terminal:

# Set persistent model storage
export OLLAMA_MODELS=/workspace/ollama/models
mkdir -p /workspace/ollama/models

# Stop default service and restart with correct config
sudo systemctl stop ollama 2>/dev/null || true
OLLAMA_MODELS=/workspace/ollama/models OLLAMA_HOST=0.0.0.0:11434 ollama serve > /workspace/ollama.log 2>&1 &

# Download your preferred models
sleep 5
ollama pull llama3.1:8b
ollama pull qwen2.5:7b
ollama pull deepseek-r1:8b
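Before moving on, it is worth confirming that the server actually came up. A quick probe of Ollama's /api/tags endpoint does the job; this is a minimal sketch, and the helper name check_ollama is ours, not part of Ollama:

```shell
# Quick health check against the Ollama API (the check_ollama helper is our own)
check_ollama() {
  # Succeeds only if the API answers on the given host:port (default localhost:11434)
  curl -sf "http://${1:-localhost:11434}/api/tags" > /dev/null
}

if check_ollama; then
  echo "Ollama is up; registered models:"
  ollama list
else
  echo "Ollama is not responding; check /workspace/ollama.log"
fi
```

If the check fails, the tail of /workspace/ollama.log usually explains why (a port conflict or a bad OLLAMA_MODELS path are the common causes).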

Step 4: Deploy Open WebUI

Run Open WebUI in Docker on the same instance:

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v /workspace/open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

This connects Open WebUI to your local Ollama instance and stores data persistently.
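If you prefer Compose to a raw docker run, an equivalent file might look like the sketch below. The filename and service name are ours; the image, ports, and paths match the command above:

```yaml
# docker-compose.yml (hypothetical filename) — equivalent to the docker run above
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - /workspace/open-webui:/app/backend/data
    restart: always
```

Start it with `docker compose up -d` from the directory containing the file.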

Step 5: Access Open WebUI

  1. Go to your instance settings
  2. Expose port 3000
  3. Open the proxy URL in your browser: https://your-pod-id-3000.proxy.runpod.net

Alternatively, connect via the Runpod HTTP Proxy on port 3000.

Step 6: Set Up Your Account

  1. Open WebUI shows a registration page on first visit
  2. Create your admin account (this is stored locally, not in any cloud)
  3. You're now in a ChatGPT-like interface powered by your own cloud GPU

Usage Tips

Starting a Chat

  1. Select a model from the dropdown (the models you pulled in Step 3 appear here)
  2. Type your message and press Enter
  3. The response comes from your cloud GPU — private and fast

Document Chat (RAG)

  1. Click the + button or drag files into the chat
  2. Upload PDFs, text files, or paste web URLs
  3. Ask questions about the documents
  4. Open WebUI searches the documents and provides cited answers

Multi-User Setup

  1. As admin, go to Settings → Users
  2. Enable registration or create accounts manually
  3. Each user gets their own conversation history
  4. Models and documents can be shared or kept private

Cost Management

Recommended Setup for Cost Efficiency

  • RTX 4090 at $0.44/hr
  • Auto-Stop set to 1 hour of inactivity
  • Spot instance for even lower cost (with interruption risk)

Monthly Cost Estimates

Usage                 | GPU      | Monthly Cost
2 hrs/day, weekdays   | RTX 4090 | ~$18
4 hrs/day, weekdays   | RTX 4090 | ~$35
8 hrs/day, weekdays   | RTX 4090 | ~$70
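The estimates above assume roughly 20 weekdays per month at the $0.44/hr RTX 4090 rate. The arithmetic is easy to reproduce for your own usage pattern:

```shell
# Monthly cost = hours/day x ~20 weekdays x hourly rate ($0.44 for RTX 4090)
rate=0.44
for hrs in 2 4 8; do
  awk -v h="$hrs" -v r="$rate" \
    'BEGIN { printf "%d hrs/day on weekdays: ~$%.0f/month\n", h, h * 20 * r }'
done
# → ~$18, ~$35, ~$70
```

Swap in your own hours and rate; spot pricing typically cuts the hourly rate further at the cost of possible interruptions.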

Auto-Start Script

Create a script to restart both services after instance restart:

cat > /workspace/start-all.sh << 'EOF'
#!/bin/bash
export OLLAMA_MODELS=/workspace/ollama/models
export OLLAMA_HOST=0.0.0.0:11434

# Start Ollama
pkill ollama 2>/dev/null || true
sleep 2
ollama serve > /workspace/ollama.log 2>&1 &
sleep 5

# Start Open WebUI
docker start open-webui 2>/dev/null || docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v /workspace/open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

echo "All services started!"
ollama list
EOF

chmod +x /workspace/start-all.sh

Troubleshooting

Open WebUI can't connect to Ollama: Verify Ollama is running with ollama list. Check that OLLAMA_HOST=0.0.0.0:11434 is set.

Port 3000 not accessible: Make sure the port is exposed in Runpod instance settings.

Models not showing: Verify OLLAMA_MODELS points to /workspace/ollama/models and models were pulled successfully.

Slow responses: Check if the model fits in your GPU's VRAM. An RTX 4090 (24GB) handles models up to 14B comfortably.
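A rough rule of thumb for ~4-bit quantized models (the default quantization for most ollama pull tags) is about 0.6 GB of VRAM per billion parameters plus ~1.5 GB of overhead for the runtime and context. These numbers are an approximation, and the helper below is ours:

```shell
# Rough VRAM estimate for a ~4-bit quantized model (heuristic, not exact):
# ~0.6 GB per billion parameters + ~1.5 GB overhead for runtime/KV cache
vram_gb() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 0.6 + 1.5 }'
}

vram_gb 8    # llama3.1:8b → ~6.3 GB
vram_gb 14   # 14B model   → ~9.9 GB, comfortable on a 24 GB RTX 4090
vram_gb 70   # 70B model   → ~43.5 GB, will not fit on a single RTX 4090
```

On the pod itself, `nvidia-smi` shows the actual VRAM in use while a model is loaded, which is the number that matters.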

Summary

You now have a private, ChatGPT-like experience running on cloud GPU. Open WebUI handles the interface while Ollama runs the models. Data persists between sessions, and you can access it from any browser.

Next Steps

  • Deploy Ollama on Runpod — deeper Ollama configuration
  • Ollama vs Open WebUI — understand how they work together
  • Best GPU Cloud for LLM — compare cloud providers