The Distributed Open Llama Network

Your hardware. Your models.
Everyone's network.

Dollama connects your machines into a personal inference network. Your desktop handles the heavy lifting while you prompt from your laptop or phone. When you need more, burst into community capacity from others doing the same thing. No per-token billing. No data harvested. No one deciding what you're allowed to ask.


Three ways to use it

RUN
Remote inference

Run from anywhere, compute from home

Your laptop doesn't need a GPU. Your desktop has one sitting idle. Dollama routes your inference to the best machine you own — prompt from the couch, the bus, or your phone while your workstation does the work. Your battery stays cool. Your responses stay fast.

BURST
Beyond your hardware

Burst beyond your own hardware

Sometimes one machine isn't enough. Dollama lets you tap into a community of people in the same position — borrow their idle cycles when you need a burst, lend yours when you don't. Everyone's ceiling goes up without anyone buying new hardware.

GIVE
Mutual infrastructure

Contribute idle cycles, strengthen the network

When you're not using your GPU, someone else can. Not for a corporation skimming margin — for a network of people building an alternative to the API tollbooth. The more people participate, the more reliable and available the network becomes for everyone.

Three steps to a personal inference network

01

Install alongside Ollama

One command. Dollama runs quietly in your system tray. It connects to your local Ollama instance and registers your machine — as a compute source, a client, or both.

02

Your machines form a mesh

If you have more than one machine running Dollama, they find each other. Your requests route to whichever of your machines is best suited — based on real hardware benchmarks, not guesswork. You always have full priority over your own hardware.

03

Opt into the community network

Beyond your own machines, Dollama connects you to a volunteer network. When you need more than you have, community nodes pick up the slack. When you're idle, you return the favour. An intelligent routing layer adapts prompts and payloads to fit whatever hardware is available.

Your machine (Claude Code / IDE → dollama proxy)
  Has: your files, context, repo
    ↓ prompt (HTTPS)
Relay (routes requests, manages the queue)
  Sees: prompt plaintext (v1)
    ↓ prompt (HTTPS)
Contributor node (Ollama runtime running Qwen 3.5 9B)
  Sees: raw prompt only

v1: all traffic flows through the relay. Direct peer connections with end-to-end encryption are planned for Phase 4.

100K tokens in. 6K tokens out.

Claude Code sends massive context windows — 58 tool definitions, file contents, full conversation history. Dollama's optimization pipeline crunches it all down before it hits the network, so a 9B model on consumer hardware can actually handle it.

What Claude Code sends: ~100K tokens (tool schemas 82 KB, conversation ~50 KB, system prompt 16 KB)

The optimization pipeline:

1. On-demand tool loading
2. Shadow store recovery
3. TOON format compression
4. Stale context pruning
5. Format-aware compression
6. Assistant reasoning trimming

What hits the network: ~6K tokens, a 94% reduction

On-demand tool loading

Claude Code defines 58 tools — 82 KB of schemas sent with every request. Dollama keeps only the 6 essentials loaded and injects a tool catalog. The model requests others as needed. Tool schemas drop from 82 KB to ~3 KB.
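The idea can be sketched in a few lines. This is an illustration, not Dollama's actual code: the essential-tool list and catalog wording here are made up.

```python
import json

# Illustrative essential set; the real list Dollama keeps is its own.
ESSENTIAL_TOOLS = {"read_file", "write_file", "edit", "bash", "grep", "ls"}

def slim_tools(tools: list[dict]) -> tuple[list[dict], str]:
    """Keep only essential tool schemas; summarize the rest in a catalog.

    When the model names a deferred tool, its full schema can be
    injected into a later request.
    """
    kept = [t for t in tools if t["name"] in ESSENTIAL_TOOLS]
    deferred = [t for t in tools if t["name"] not in ESSENTIAL_TOOLS]
    catalog = "Additional tools (request by name to load their schema):\n" + "\n".join(
        f"- {t['name']}: {t.get('description', '')[:60]}" for t in deferred
    )
    return kept, catalog

# 58 dummy schemas standing in for Claude Code's tool definitions.
tools = [{"name": f"tool_{i}", "description": "x" * 200, "input_schema": {}} for i in range(52)]
tools += [{"name": n, "description": "essential", "input_schema": {}} for n in ESSENTIAL_TOOLS]

kept, catalog = slim_tools(tools)
print(len(kept), "schemas kept;", len(json.dumps(kept)), "bytes instead of", len(json.dumps(tools)))
```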

Shadow store recovery

Nothing is thrown away. When content is compressed or pruned, the original is retained in a local shadow store. If the model needs it back, it requests the full content by reference — zero information loss.
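A rough sketch of the shadow-store round trip. The reference token format shown is invented for illustration:

```python
import hashlib

class ShadowStore:
    """Keeps full originals locally; the network only ever sees references."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def stash(self, content: str) -> str:
        """Replace content with an opaque reference token."""
        ref = hashlib.sha256(content.encode()).hexdigest()[:12]
        self._store[ref] = content
        return f"[shadow:{ref}]"  # what goes over the wire in place of the content

    def recover(self, token: str) -> str:
        """Resolve a reference back to the full original, with zero loss."""
        ref = token.removeprefix("[shadow:").removesuffix("]")
        return self._store[ref]

store = ShadowStore()
original = "def main():\n    ...  # thousands of lines of pruned file content"
token = store.stash(original)
print(token)
```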

TOON format compression

Tabular data like JSON arrays of objects is converted to TOON — a column-oriented format that eliminates repeated keys. Tables with uniform structure compress 40-60%, and the model reads it natively.

Compression happens locally on your machine before anything leaves. Full originals stay in the shadow store for the life of the session.
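The column-oriented idea looks roughly like this. The output syntax below is a stand-in, not the actual TOON grammar:

```python
import json

def to_columns(rows: list[dict]) -> str:
    """Convert a uniform JSON array of objects into a header row plus
    value rows, so each key is written once instead of once per row."""
    keys = list(rows[0])
    assert all(list(r) == keys for r in rows), "rows must share one schema"
    lines = ["|".join(keys)]
    lines += ["|".join(str(r[k]) for k in keys) for r in rows]
    return "\n".join(lines)

rows = [
    {"file": "main.go", "lines": 412, "lang": "go"},
    {"file": "relay.go", "lines": 1289, "lang": "go"},
    {"file": "install.sh", "lines": 96, "lang": "shell"},
]
compact = to_columns(rows)
print(compact)
print(f"{len(compact)} chars, versus {len(json.dumps(rows))} as JSON")
```

The savings grow with row count, since the keys are amortized across the whole table.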

See it in action with Jane Doe 🦌

Meet your deer friend! Ask anything — no login, no API key. Powered by community compute.

Jane Doe 🦌 • Your deer friend on the Dollama network
Responses may be inaccurate · No prompts are logged · Powered by community volunteers

First inference in under two minutes

🦙
Ollama
Required runtime
💾
8GB+ RAM
Recommended
💻
macOS / Linux / Win
Cross-platform
🧠
qwen3.5:9b
Network model
Terminal
curl -fsSL https://dollama.net/install.sh | sh

Requires Ollama. If you plan to contribute compute, pull qwen3.5:9b first.

PowerShell
irm https://dollama.net/install.ps1 | iex

Requires Ollama. If you plan to contribute compute, pull qwen3.5:9b first.

1

Start Dollama

dollama both
2

Launch Claude Code

dollama launch claude
3

Start coding

That's it. You're running on community compute. Dollama exposes an Anthropic Messages API endpoint at localhost:11435. Any tool that supports a custom base URL can use it — Claude Code, Continue, Aider, or your own scripts.
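From your own scripts, that looks like a standard Anthropic-style Messages request pointed at the local proxy. The /v1/messages path here is an assumption based on the Anthropic API shape; check Dollama's docs for the exact route:

```python
import json
import urllib.request

# Anthropic-style Messages payload, aimed at the local Dollama proxy.
payload = {
    "model": "qwen3.5:9b",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Explain goroutines in one paragraph."}],
}

req = urllib.request.Request(
    "http://localhost:11435/v1/messages",
    data=json.dumps(payload).encode(),
    headers={"content-type": "application/json"},
)

try:
    with urllib.request.urlopen(req, timeout=5) as resp:
        print(json.load(resp)["content"][0]["text"])
except OSError as exc:  # Dollama proxy not running locally
    print(f"request failed: {exc}")
```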

What the installer does: downloads the latest dollama binary for your platform, verifies the checksum, and moves it to /usr/local/bin. That's it — no services, no daemons, no config changes. Read the script source.

Direct binary download

Verify checksums

curl -fsSL https://dollama.net/dl/latest/checksums.txt -o checksums.txt
sha256sum -c --ignore-missing checksums.txt

Build from source

git clone https://github.com/notangrywaffle/dollama.net.git
cd dollama.net/cli && make build

The herd at a glance

  • Concurrent capacity: messages handled simultaneously
  • Tokens per second: network throughput (24h avg)
  • Requests processed: in the last 24 hours
  • Tokens processed: in the last 24 hours
  • Contributor nodes: unique node owners connected
  • Chat users: active users in the last 24 hours

Works with

Claude Code Continue Aider Any Anthropic-compatible tool

Open source

Full source code on GitHub. Relay, CLI, installer, and this website — all public.

Built on

Ollama Qwen 3.5 9B Go relay

Honest, not polished

Here's how your data flows today.

What stays on your machine

  • Your files, repos, and working directory
  • Context is assembled locally
  • Only the inference prompt leaves your machine

What's visible in transit (v1)

  • The relay and contributor nodes can see prompts in plaintext
  • We don't log prompt content, but it's not yet encrypted end-to-end

Use Dollama for open-source projects, learning, experimentation, and non-sensitive coding. Don't send credentials or proprietary code.

What's next

  • Direct peer connections with end-to-end encryption are on the roadmap
  • Personal mesh across your own machines is private today
  • Community pooling with privacy-preserving inference is what we're building toward

Making scale matter less

Every model you run through a centralised API is a dependency. On their pricing, their policies, their surveillance, their terms of service deciding what you're allowed to ask.

Dollama is a bet that local-first, community-scaled inference can make that dependency optional. Not by matching their scale — by making scale matter less. Smart routing across heterogeneous hardware. Payload optimisation that fits models into whatever's available. A network where contribution is the membership fee and no single entity holds the keys.

We didn't set out to build an alternative to Big AI. We just wanted our own machines to work together. It turns out that when enough people do that, you get one anyway.

Your hardware. Your models. Your rules.

Common questions

Can contributor nodes see my prompts?

In v1, yes — contributor nodes see the raw inference prompt to generate a response. They don't see your files, identity, or conversation history. Nodes are stateless. End-to-end encryption is planned.

Is it safe for sensitive or proprietary code?

Not yet. Treat the network like any cloud API for now. Use it for open-source, learning, and non-sensitive tasks.

Which model does the network run?

Qwen 3.5 9B. It's fast enough for consumer hardware and capable enough for agentic coding. A single model keeps routing simple. More models may come later.

Do I need a GPU to contribute?

No — CPU inference works too, just slower. Any machine that can run Ollama can contribute.

Do I have to contribute to use the network?

No. But when the network is busy, people who contribute get served first. When it's idle, everyone gets instant access.

Why a deer and not a llama?

We love llamas — Ollama is the backbone of this project. But a doe felt right for what we're building: gentle, graceful, and part of a herd. Plus, dollama. It was right there the whole time.

Report a bug or request a feature