
Docs

How the priority system works, what data flows where, and how to configure your tools.

Know exactly what you're sharing

Stays on your machine

  • Your files, repos, and working directory
  • Context assembly happens locally
  • Only the inference prompt leaves your machine

Visible in transit (v1)

  • The coordinator sees prompts in plaintext
  • Contributor nodes see the raw prompt
  • Not yet encrypted end-to-end
  • We don't log prompt content

Use it for

  • Open-source and hobby projects
  • Learning and experimentation
  • Non-sensitive coding tasks

Don't send secrets, credentials, or proprietary code you wouldn't share with a cloud API.

The more you give, the faster you go

No tokens, no blockchain, no marketplace. Just a running tally that rewards generosity.

Balance model

balance = tokens served − tokens consumed
  • Single queue — sorted by balance. Higher balance = served first.
  • Idle network — everyone gets instant service regardless of balance.
  • New users — start at zero. Use immediately, but contributors get priority when busy.
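The ordering above can be sketched as a priority queue keyed on balance. This is an illustrative model only — the field names and data structure are assumptions, not the actual scheduler:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    # heapq pops the smallest item first, so negate the balance
    # to serve the highest-balance user first.
    sort_key: int = field(init=False)
    balance: int   # tokens served minus tokens consumed
    prompt: str

    def __post_init__(self):
        self.sort_key = -self.balance

queue = []
heapq.heappush(queue, Request(balance=500, prompt="contributor"))
heapq.heappush(queue, Request(balance=0, prompt="new user"))
heapq.heappush(queue, Request(balance=-200, prompt="heavy consumer"))

print([heapq.heappop(queue).prompt for _ in range(3)])
# → ['contributor', 'new user', 'heavy consumer']
```

Note that a new user at balance zero still sits ahead of anyone who has consumed more than they served — you are never locked out, just ordered.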

Concurrency & groups

  • Burst usage — 1–3 concurrent requests cost 1x each. 4+ cost 2x to discourage hogging.
  • Team pooling — groups share a balance. Run nodes on office machines, everyone benefits.
  • No speculation — balances can't be traded or sold. Coordination, not finance.
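The burst rule reduces to a simple cost function. The threshold (3) and multiplier (2x) are the documented values; the function itself is a sketch, not the real accounting code:

```python
def token_cost(tokens: int, concurrent_requests: int) -> int:
    """Charge 1x for up to 3 concurrent requests, 2x beyond that,
    to discourage one user from hogging the network."""
    multiplier = 1 if concurrent_requests <= 3 else 2
    return tokens * multiplier

print(token_cost(1000, 2))  # 1000 — within the burst allowance
print(token_cost(1000, 5))  # 2000 — double cost past 3 concurrent
```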

Your hardware first

  • Unmetered — inference on your own node doesn't count against your balance; no tokens are deducted when your node handles your own request.
  • Preemptive — your requests jump the queue on your own node, always.
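Both rules hinge on one check: is the serving node yours? A sketch of that decision (names are illustrative, not the real scheduler):

```python
def route(request_owner: str, node_owner: str, charge: int) -> dict:
    """Illustrative routing rule: requests served by the requester's
    own node are unmetered and jump the queue."""
    own_node = request_owner == node_owner
    return {
        "preempt": own_node,               # your requests run first on your node
        "charge": 0 if own_node else charge,  # no deduction for local inference
    }

print(route("alice", "alice", 1000))  # {'preempt': True, 'charge': 0}
print(route("alice", "bob", 1000))    # {'preempt': False, 'charge': 1000}
```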

Network behavior

                     Idle network            Busy network
Response time        Instant                 Queued by balance
Balance needed?      No — everyone served    Higher balance = faster
New users            Full access             Lower priority

What we see, what we don't

Transparency over marketing. Here's exactly how your data flows through the network today.

What stays on your machine

Your files, repository context, and working directory never leave your machine. The local proxy assembles context locally — only the final inference prompt is sent to the network.

What the coordinator sees

  • All traffic routes through the coordinator relay in v1
  • Prompts are visible in plaintext — a known tradeoff for simplicity
  • We don't log prompt content
  • Treat it like any cloud API when deciding what to send

What contributor nodes see

Only the raw inference prompt and generated tokens. No file access, no user identity, no conversation history beyond the current request. Nodes are stateless — they process a prompt and move on.

Data                   Your machine      Coordinator         Contributor node
Files & repo context   Local only        Never sent          Never sent
Inference prompt       Assembled here    Plaintext (v1)      Plaintext (v1)
User identity          Known             Token only          Anonymous
Conversation history   Full context      Per-request only    Per-request only

The roadmap: end-to-end encryption

Phase 4 introduces direct peer connections with end-to-end encryption. Until then, treat the network like any cloud API: don't send secrets you wouldn't send to a hosted LLM provider.

Use with Claude Code

1. Launch with one command

Run dollama launch claude — it starts the local proxy, configures the environment, and opens Claude Code automatically. That's it.

Tip: Want to contribute too?

Run dollama both first, then dollama launch claude in another terminal — you'll use the network and share your idle compute.

Launch Claude Code:

    dollama launch claude

Or configure manually in ~/.claude/settings.json:

    {
      "env": {
        "ANTHROPIC_BASE_URL": "http://localhost:11435",
        "ANTHROPIC_API_KEY": "dollama-proxy"
      },
      "model": "network:llama3.1:8b"
    }

Pull the model (contributors):

    ollama pull llama3.1:8b

Common questions

Do contributor nodes see my prompts?

Yes — in v1, contributor nodes see the raw inference prompt in order to generate a response. They don't see your files, identity, or conversation history. Nodes are stateless and process one request at a time. Your files and repo context never leave your machine.
Are prompts encrypted?

Not yet. In v1, prompts travel through the coordinator in plaintext. Don't send credentials, secrets, or proprietary code you wouldn't share with a cloud API. Use it for open-source projects, learning, and non-sensitive coding tasks. End-to-end encryption is coming in Phase 4.
What can the coordinator see?

The coordinator relays all traffic in v1, so it can see inference prompts in plaintext. We don't log prompt content, and no data is sold or shared with third parties. This is a known tradeoff for simplicity — see the Privacy section for details.
How will end-to-end encryption work?

Phase 4 introduces direct peer connections (WebRTC/QUIC) with end-to-end encryption between your machine and the contributor node. The coordinator would handle routing only — it would no longer see prompt content. The entire codebase is open source.
Why llama3.1:8b?

It hits the sweet spot: fast enough to run on consumer hardware, capable enough for agentic coding tasks. A single model keeps routing simple. More models may come in future phases.
What do I need to run a contributor node?

Ollama installed with llama3.1:8b pulled. Then run dollama serve. The CLI handles registration, heartbeats, and routing automatically. Any machine that can run Ollama can contribute.
Can I use dollama with tools other than Claude Code?

Yes. The local proxy exposes an Anthropic Messages API endpoint at localhost:11435. Any tool that supports a custom base URL can use it — Continue, Aider, or your own scripts.
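Any client that lets you override the base URL can target the proxy from a script. A minimal sketch, assuming the proxy accepts the standard Anthropic Messages API shape (the `/v1/messages` path, version header, and `max_tokens` value here follow Anthropic's public API; the model name and placeholder key come from the settings.json above):

```python
import json

BASE_URL = "http://localhost:11435"  # local dollama proxy, not api.anthropic.com

def build_request(prompt: str) -> dict:
    """Assemble a Messages API request aimed at the local proxy."""
    return {
        "url": f"{BASE_URL}/v1/messages",
        "headers": {
            "x-api-key": "dollama-proxy",       # placeholder key, not a secret
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        "body": json.dumps({
            "model": "network:llama3.1:8b",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_request("Write a haiku about llamas")
print(req["url"])  # http://localhost:11435/v1/messages
```

Send it with any HTTP client, e.g. `requests.post(req["url"], headers=req["headers"], data=req["body"])` while the proxy is running.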
Why a doe, not a llama?

We like llamas — Ollama is the backbone of this project. But a doe felt right for what we're building: gentle, graceful, and part of a herd. Plus, dollama → doe. It was right there the whole time.