
Docs

How the priority system works, what data flows where, and how to configure your tools.

Know exactly what you're sharing

Stays on your machine

  • Your files, repos, and working directory
  • Context assembly happens locally
  • Only the inference prompt leaves your machine

Visible in transit (v1)

  • The coordinator sees prompts in plaintext
  • Contributor nodes see the raw prompt
  • Not yet encrypted end-to-end
  • We don't log prompt content

Use it for

  • Open-source and hobby projects
  • Learning and experimentation
  • Non-sensitive coding tasks

Don't send secrets, credentials, or proprietary code you wouldn't share with a cloud API.

The more you give, the faster you go

No tokens, no blockchain, no marketplace. Just a running tally that rewards generosity.

Balance model

balance = tokens served − tokens consumed
  • Single queue — sorted by balance. Higher balance = served first.
  • Idle network — everyone gets instant service regardless of balance.
  • New users — start at zero. Use immediately, but contributors get priority when busy.
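The ordering above can be sketched as a priority queue keyed on balance. This is an illustrative model only — the field names and data structure are assumptions, not the actual scheduler:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    # heapq pops the smallest item first, so negate the balance
    # to serve the highest-balance user first.
    sort_key: int = field(init=False)
    balance: int   # tokens served minus tokens consumed
    prompt: str

    def __post_init__(self):
        self.sort_key = -self.balance

queue = []
heapq.heappush(queue, Request(balance=500, prompt="contributor"))
heapq.heappush(queue, Request(balance=0, prompt="new user"))
heapq.heappush(queue, Request(balance=-200, prompt="heavy consumer"))

print([heapq.heappop(queue).prompt for _ in range(3)])
# → ['contributor', 'new user', 'heavy consumer']
```

Note that a new user at balance zero still sits ahead of anyone who has consumed more than they served — you are never locked out, just ordered.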

Concurrency & groups

  • Burst usage — 1–3 concurrent requests cost 1x each. 4+ cost 2x to discourage hogging.
  • Team pooling — groups share a balance. Run nodes on office machines, everyone benefits.
  • No speculation — balances can't be traded or sold. Coordination, not finance.
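The burst rule reduces to a simple cost function. The threshold (3) and multiplier (2x) are the documented values; the function itself is a sketch, not the real accounting code:

```python
def token_cost(tokens: int, concurrent_requests: int) -> int:
    """Charge 1x for up to 3 concurrent requests, 2x beyond that,
    to discourage one user from hogging the network."""
    multiplier = 1 if concurrent_requests <= 3 else 2
    return tokens * multiplier

print(token_cost(1000, 2))  # 1000 — within the burst allowance
print(token_cost(1000, 5))  # 2000 — double cost past 3 concurrent
```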

Your hardware first

  • Unmetered — inference on your own node doesn't count against your balance; no tokens are deducted when your node handles your own request.
  • Preemptive — your requests jump the queue on your own node, always.
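Both rules hinge on one check: is the serving node yours? A sketch of that decision (names are illustrative, not the real scheduler):

```python
def route(request_owner: str, node_owner: str, charge: int) -> dict:
    """Illustrative routing rule: requests served by the requester's
    own node are unmetered and jump the queue."""
    own_node = request_owner == node_owner
    return {
        "preempt": own_node,               # your requests run first on your node
        "charge": 0 if own_node else charge,  # no deduction for local inference
    }

print(route("alice", "alice", 1000))  # {'preempt': True, 'charge': 0}
print(route("alice", "bob", 1000))    # {'preempt': False, 'charge': 1000}
```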

Network behavior

                     Idle network            Busy network
Response time        Instant                 Queued by balance
Balance needed?      No — everyone served    Higher balance = faster
New users            Full access             Lower priority

What we see, what we don't

Transparency over marketing. Here's exactly how your data flows through the network today.

What stays on your machine

Your files, repository context, and working directory never leave your machine. The local proxy assembles context locally — only the final inference prompt is sent to the network.

What the coordinator sees

  • All traffic routes through the coordinator relay in v1
  • Prompts are visible in plaintext — a known tradeoff for simplicity
  • We don't log prompt content
  • Treat it like any cloud API when deciding what to send

What contributor nodes see

Only the raw inference prompt and generated tokens. No file access, no user identity, no conversation history beyond the current request. Nodes are stateless — they process a prompt and move on.

Data                   Your machine      Coordinator         Contributor node
Files & repo context   Local only        Never sent          Never sent
Inference prompt       Assembled here    Plaintext (v1)      Plaintext (v1)
User identity          Known             Token only          Anonymous
Conversation history   Full context      Per-request only    Per-request only

The roadmap: end-to-end encryption

Phase 4 introduces direct peer connections with end-to-end encryption. Until then, treat the network like any cloud API: don't send secrets you wouldn't send to a hosted LLM provider.

Use with Claude Code

1. Launch with one command

Run dollama launch claude — it starts the local proxy, configures the environment, and opens Claude Code automatically. That's it.

Tip: Want to contribute too?

Run dollama both first, then dollama launch claude in another terminal — you'll use the network and share your idle compute.

Launch Claude Code:

    dollama launch claude

Or configure manually in ~/.claude/settings.json:

    {
      "env": {
        "ANTHROPIC_BASE_URL": "http://localhost:11435",
        "ANTHROPIC_API_KEY": "dollama-proxy"
      },
      "model": "network:llama3.1:8b"
    }

Pull the model (contributors):

    ollama pull llama3.1:8b

Common questions

Do contributor nodes see my prompts?

Yes — in v1, contributor nodes see the raw inference prompt in order to generate a response. They don't see your files, identity, or conversation history. Nodes are stateless and process one request at a time. Your files and repo context never leave your machine.
Are prompts encrypted?

Not yet. In v1, prompts travel through the coordinator in plaintext. Don't send credentials, secrets, or proprietary code you wouldn't share with a cloud API. Use it for open-source projects, learning, and non-sensitive coding tasks. End-to-end encryption is coming in Phase 4.
What can the coordinator see?

The coordinator relays all traffic in v1, so it can see inference prompts in plaintext. We don't log prompt content, and no data is sold or shared with third parties. This is a known tradeoff for simplicity — see the Privacy section for details.
How will end-to-end encryption work?

Phase 4 introduces direct peer connections (WebRTC/QUIC) with end-to-end encryption between your machine and the contributor node. The coordinator would handle routing only — it would no longer see prompt content. The entire codebase is open source.
Why llama3.1:8b?

It hits the sweet spot: fast enough to run on consumer hardware, capable enough for agentic coding tasks. A single model keeps routing simple. More models may come in future phases.
What do I need to run a contributor node?

Ollama installed with llama3.1:8b pulled. Then run dollama serve. The CLI handles registration, heartbeats, and routing automatically. Any machine that can run Ollama can contribute.
Can I use dollama with tools other than Claude Code?

Yes. The local proxy exposes an Anthropic Messages API endpoint at localhost:11435. Any tool that supports a custom base URL can use it — Continue, Aider, or your own scripts.
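Any client that lets you override the base URL can target the proxy from a script. A minimal sketch, assuming the proxy accepts the standard Anthropic Messages API shape (the `/v1/messages` path, version header, and `max_tokens` value here follow Anthropic's public API; the model name and placeholder key come from the settings.json above):

```python
import json

BASE_URL = "http://localhost:11435"  # local dollama proxy, not api.anthropic.com

def build_request(prompt: str) -> dict:
    """Assemble a Messages API request aimed at the local proxy."""
    return {
        "url": f"{BASE_URL}/v1/messages",
        "headers": {
            "x-api-key": "dollama-proxy",       # placeholder key, not a secret
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        "body": json.dumps({
            "model": "network:llama3.1:8b",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_request("Write a haiku about llamas")
print(req["url"])  # http://localhost:11435/v1/messages
```

Send it with any HTTP client, e.g. `requests.post(req["url"], headers=req["headers"], data=req["body"])` while the proxy is running.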
Why a doe, not a llama?

We like llamas — Ollama is the backbone of this project. But a doe felt right for what we're building: gentle, graceful, and part of a herd. Plus, dollama → doe. It was right there the whole time.