Connect your app
dollama isn't just for Claude Code. Anything that can point at a custom base URL — Continue, Aider, a home-assistant bot, your own script — can run on the network. The fastest way is the local proxy.
When you run dollama private, the CLI starts a small local proxy that listens on http://localhost:11435 and speaks the Anthropic Messages API. Your app talks to that local address exactly as if it were talking to Anthropic directly — the proxy handles authentication, routing to an available node, context optimization, and SSE streaming for you.
Because it's a standard Anthropic-compatible endpoint, any tool that lets you override the API base URL works with no code changes. Your files, repo, and tool execution never leave your machine — only the inference prompt is sent to the network.
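Before pointing an app at the proxy, it can help to confirm that something is actually accepting connections on the port. The sketch below is a plain TCP reachability check using only the Python standard library. The helper name `proxy_is_listening` is hypothetical and not part of the dollama CLI; the default port is the one documented above. It only shows that a socket accepts connections, not that the process behind it is the dollama proxy.

```python
import socket


def proxy_is_listening(host: str = "localhost", port: int = 11435,
                       timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper: a bare reachability probe, nothing dollama-specific.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if proxy_is_listening():
        print("something is listening on localhost:11435")
    else:
        print("nothing is listening on localhost:11435; is `dollama private` running?")
```

An app or bot could call this once at startup and fail fast with a clear message instead of surfacing a generic connection error on the first request.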
macOS / Linux (including Raspberry Pi on a 64-bit OS):
curl -fsSL https://dollama.net/install.sh | sh
dollama private runs in the foreground and prints the address it's listening on. On first run it auto-provisions an API token. Add --no-tray on a headless box (see Raspberry Pi & headless).
dollama private
Set your app's Anthropic base URL to the local proxy and use the network model. Details below.
export ANTHROPIC_BASE_URL="http://localhost:11435"
export ANTHROPIC_API_KEY="dollama-proxy"
export ANTHROPIC_MODEL="network:qwen3.5:9b"
Most apps expose these as environment variables or settings. The values are always the same:
| Setting | Value | Notes |
|---|---|---|
| Base URL | http://localhost:11435 | Where the proxy listens. Override the port with dollama private -p <port>. |
| API key | dollama-proxy | A placeholder — the proxy supplies the real token. Any non-empty value works. |
| Model | network:qwen3.5:9b | The network worker model. See Docs for the model tiers. |
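If you launch your app from a script rather than a shell, the same three settings can be injected into the child process's environment. The following is a minimal sketch, assuming your app reads the `ANTHROPIC_*` variables listed in the table; the helper name `proxy_env` is hypothetical, and the child process here is only a stand-in for your real app.

```python
import os
import subprocess
import sys


def proxy_env(port: int = 11435, model: str = "network:qwen3.5:9b",
              host: str = "localhost") -> dict:
    """Return a copy of the current environment with the three proxy settings applied."""
    env = dict(os.environ)
    env["ANTHROPIC_BASE_URL"] = f"http://{host}:{port}"
    env["ANTHROPIC_API_KEY"] = "dollama-proxy"  # placeholder; the proxy supplies the real token
    env["ANTHROPIC_MODEL"] = model
    return env


if __name__ == "__main__":
    # Stand-in for "your-app": a child Python process that prints the base URL it sees.
    child = [sys.executable, "-c", "import os; print(os.environ['ANTHROPIC_BASE_URL'])"]
    subprocess.run(child, env=proxy_env(), check=True)
```

The `port` parameter mirrors `dollama private -p <port>`, and the `host` parameter covers the case where the proxy runs on another machine on your LAN.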
Remote device? If your app runs on a different machine than the proxy, bind the proxy to, and point your app at, that machine's IP instead of localhost, and make sure the proxy port is reachable on your LAN. Treat the proxy as trusted-LAN only: don't expose it to the public internet.
One command does everything — starts the proxy, sets the env vars, and launches Claude Code:
dollama launch claude
Start dollama private in one terminal, then run your app with the env vars set. This is the generic pattern for Continue, Aider, home-assistant bots, and your own scripts.
ANTHROPIC_BASE_URL="http://localhost:11435" \
ANTHROPIC_API_KEY="dollama-proxy" \
ANTHROPIC_MODEL="network:qwen3.5:9b" \
your-app
Point the official SDK at the proxy by setting base_url:
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:11435",
    api_key="dollama-proxy",
)

msg = client.messages.create(
    model="network:qwen3.5:9b",
    max_tokens=512,
    messages=[{"role": "user", "content": "Say hello from the dollama network."}],
)
print(msg.content[0].text)
The endpoint is plain Anthropic Messages — anything that can POST JSON works:
curl --no-buffer -X POST http://localhost:11435/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: dollama-proxy" \
  -d '{
    "model": "network:qwen3.5:9b",
    "max_tokens": 512,
    "stream": true,
    "messages": [{"role": "user", "content": "Hello from my app"}]
  }'
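With "stream": true the response arrives as server-sent events rather than a single JSON body. The sketch below shows one way to pull the text out of such a stream using only the standard library. It assumes the event shapes of the public Anthropic Messages streaming format (`content_block_delta` events carrying a `text_delta`); whether the proxy emits every event type identically is an assumption, and the helper name `iter_text_deltas` and the sample lines are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from Anthropic-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
        if not isinstance(event, dict) or event.get("type") != "content_block_delta":
            continue  # only content deltas carry generated text
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta.get("text", "")


# Hand-written sample in the Anthropic streaming shape, not captured from a real run.
SAMPLE = [
    'event: message_start',
    'data: {"type": "message_start", "message": {"role": "assistant"}}',
    '',
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}',
    '',
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": " world"}}',
    '',
    'event: message_stop',
    'data: {"type": "message_stop"}',
]

if __name__ == "__main__":
    print("".join(iter_text_deltas(SAMPLE)))
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body, read incrementally so that text can be shown as it arrives.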
The CLI ships native Linux arm64 builds, so it runs on a Raspberry Pi — perfect for always-on assistants and bots. A couple of things to know first:
Use a 64-bit OS. There are no 32-bit (armv7) binaries; check with uname -m (you want aarch64).
Don't run dollama network on the Pi; that needs a real GPU/Ollama to do inference.
Run with --no-tray so it never tries to start a system-tray UI.
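The `uname -m` check can also be done from Python, which is convenient if a bot or setup script needs to verify the device before installing. This is a minimal sketch: `platform.machine()` reports the same machine string as `uname -m` on Linux, and the helper name `is_64bit_arm` is hypothetical. It accepts `arm64` as well because some systems use that spelling for the same architecture.

```python
import platform


def is_64bit_arm(machine: str) -> bool:
    """Return True for 64-bit ARM machine strings (hypothetical helper).

    32-bit Raspberry Pi OS typically reports "armv7l" or "armv6l", which fail this check.
    """
    return machine.lower() in {"aarch64", "arm64"}


if __name__ == "__main__":
    machine = platform.machine()
    if is_64bit_arm(machine):
        print(f"{machine}: 64-bit ARM")
    else:
        print(f"{machine}: not 64-bit ARM")
```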
dollama private --no-tray
Drop this in ~/.config/systemd/user/dollama-connect.service, then systemctl --user enable --now dollama-connect:
[Unit]
Description=dollama private proxy
After=network-online.target
[Service]
ExecStart=%h/.local/bin/dollama private --no-tray
Restart=on-failure
[Install]
WantedBy=default.target
Adjust ExecStart to wherever the installer placed the binary (which dollama).
If you can't or don't want to run the CLI on a device — say it's too constrained, or you're calling from a serverless function — your app can call the relay's public API directly at https://api.dollama.net/v1/messages. You'll need an olm_ API token in the Authorization header, which you can mint by running dollama login (or dollama private once) on any machine.
Trade-off: the local proxy gives you context optimization, automatic token handling, and keeps traffic on your LAN until the last hop. The direct path is simpler to embed but you manage the token yourself. Full request/response shapes, streaming events, and more language examples are on the API reference.
curl --no-buffer -X POST https://api.dollama.net/v1/messages \
  -H "Authorization: Bearer olm_your_token_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "network:qwen3.5:9b",
    "max_tokens": 512,
    "stream": true,
    "messages": [{"role": "user", "content": "Hello from my app"}]
  }'
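For environments where installing an SDK is not practical, such as a small serverless function, the same direct call can be sketched with the Python standard library alone. The URL, header, and body fields are taken from the curl example above. The helper name `build_relay_request` and the `DOLLAMA_TOKEN` environment variable are hypothetical, and the non-streaming response shape (`content[0].text`) is assumed to match the Anthropic Messages format used in the SDK example.

```python
import json
import os
import urllib.request

RELAY_URL = "https://api.dollama.net/v1/messages"


def build_relay_request(token: str, prompt: str,
                        model: str = "network:qwen3.5:9b",
                        max_tokens: int = 512) -> urllib.request.Request:
    """Build (but do not send) a non-streaming POST to the relay's public API."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        RELAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # olm_ token minted via `dollama login`
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    token = os.environ.get("DOLLAMA_TOKEN")  # hypothetical variable name
    if not token:
        print("set DOLLAMA_TOKEN to an olm_ token to send a request")
    else:
        with urllib.request.urlopen(build_relay_request(token, "Hello from my app"),
                                    timeout=60) as resp:
            reply = json.loads(resp.read())
        # Assumed Anthropic-style response: a list of content blocks with text.
        print(reply["content"][0]["text"])
```

Keeping request construction separate from sending makes the token handling easy to test without touching the network, which matters here because on the direct path you manage the token yourself.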
Check that dollama private is up and the base URL port matches (default 11435).
Run dollama status, or trace a single request with dollama debug trace.
On a Raspberry Pi, confirm that uname -m reports aarch64. A 32-bit OS won't have a matching binary.
If your app reads ANTHROPIC_BASE_URL, set that env var; otherwise look for a base-URL setting in its config.
Every flag and config key lives in the CLI reference. For privacy specifics (what nodes can see), see the Docs FAQ.