Self-hosted API gateway for Ollama with multi-user support, API key management, rate limiting, and real-time usage tracking.
# Clone and run with Docker
$ git clone https://github.com/edoardoted99/llamapass.git
$ cd llamapass
$ cp .env.example .env
$ docker compose up --build

# Create your first admin user
$ docker compose exec web python manage.py createsuperuser

# Ready at http://localhost:8000
A complete gateway between your users and Ollama
Create, revoke, and manage keys with expiration, per-key model restrictions, and rate limits.
30-day analytics with charts for requests, tokens, latency, errors, and model breakdown per key.
Configurable per-key limits backed by Redis. Live monitoring shows how close each key is to its limit.
Transparent async proxy to Ollama. Full streaming support for chat and generate endpoints.
Test Chat, Generate, and Embeddings endpoints directly from the browser. No curl needed.
Works with OpenAI SDKs out of the box. Just point the SDK at your base URL and use your LlamaPass API key.
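The per-key limits above boil down to a counter per key per time window. A minimal in-memory sketch of the idea (the real gateway backs this with Redis; the class and names here are illustrative, not LlamaPass's actual code):

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Illustrative fixed-window rate limiter (in-memory stand-in for Redis)."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        # (api_key, window index) -> request count in that window
        self.counts = defaultdict(int)

    def allow(self, api_key: str, now=None) -> bool:
        """Return True and count the request if the key is under its limit."""
        now = time.time() if now is None else now
        bucket = (api_key, int(now // self.window))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


limiter = FixedWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow("oah_demo", now=0) for _ in range(4)])
# → [True, True, True, False]
```

In production a Redis `INCR` with a TTL plays the role of `counts`, so the state survives restarts and is shared across gateway workers.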
Use any HTTP client or the OpenAI SDK
curl https://llamapass.org/ollama/api/chat \
  -H "Authorization: Bearer oah_your_key" \
  -d '{
    "model": "gemma3:1b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'
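With `"stream": true`, responses arrive as newline-delimited JSON objects, as in Ollama's native chat API. A sketch of a streaming consumer, assuming the gateway passes Ollama's NDJSON chunks through unchanged (the function names here are illustrative):

```python
import json
import urllib.request


def chunk_text(line: bytes) -> str:
    """Extract the content delta from one NDJSON chunk of a /api/chat stream."""
    data = json.loads(line)
    return data.get("message", {}).get("content", "")


def stream_chat(url: str, key: str, model: str, prompt: str) -> str:
    """POST a streaming chat request and print tokens as they arrive."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode()
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
    )
    pieces = []
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # one JSON object per line (NDJSON)
            if line.strip():
                piece = chunk_text(line)
                print(piece, end="", flush=True)
                pieces.append(piece)
    return "".join(pieces)


# Example usage (uncomment with a live gateway and a real key):
# stream_chat("https://llamapass.org/ollama/api/chat", "oah_your_key",
#             "gemma3:1b", "Hello!")

# The parser alone, on a sample chunk:
sample = b'{"model":"gemma3:1b","message":{"role":"assistant","content":"Hi"},"done":false}'
print(chunk_text(sample))  # → Hi
```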
from openai import OpenAI

client = OpenAI(
    base_url="https://llamapass.org/ollama/v1",
    api_key="oah_your_key",
)

response = client.chat.completions.create(
    model="gemma3:1b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
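The Embeddings endpoint can be exercised the same way. A hedged sketch, assuming the gateway mirrors Ollama's `/api/embeddings` request shape (`prompt` in, `embedding` out) and that an embedding model such as `nomic-embed-text` is pulled; the cosine-similarity helper shows a typical use of the vectors:

```python
import json
import math
import urllib.request


def get_embedding(url: str, key: str, model: str, text: str) -> list:
    """Fetch one embedding vector; assumes Ollama's /api/embeddings shape."""
    payload = json.dumps({"model": model, "prompt": text}).encode()
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm


# Example usage (requires a live gateway and a real key):
# v1 = get_embedding("https://llamapass.org/ollama/api/embeddings",
#                    "oah_your_key", "nomic-embed-text", "hello")

# The similarity helper alone:
print(cosine([1.0, 0.0], [1.0, 0.0]))  # → 1.0
```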
Deploy in minutes. Self-hosted. Open source.