
CLI Reference

Every command, flag, and config option for the dollama CLI.

The three ways to run dollama

dollama connect

Start as a local proxy that forwards LLM requests to the network. Your coding tools talk to localhost:11435 and the proxy handles routing, auth, and streaming.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| -p, --port | int | 11435 | Local proxy listen port |
| --model | string | (relay) | Model to route requests to |
| --no-tray | bool | false | Run without system tray (headless) |
| --verbose | bool | false | Enable verbose debug logging |

dollama serve

Start as a compute contributor, sharing your local Ollama instance with the network. Connects to the relay via WebSocket and processes inference requests.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (relay) | Model to advertise |
| --no-tray | bool | false | Run without system tray (headless) |
| --verbose | bool | false | Enable verbose debug logging |
| --max-context | int | 0 | Max context length (0 = default 65536) |

dollama both

Combined mode: runs both the local proxy and the compute contributor simultaneously. Use the network and share your idle compute.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| -p, --port | int | 11435 | Local proxy listen port |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (relay) | Model to advertise |
| --no-tray | bool | false | Run without system tray (headless) |
| --verbose | bool | false | Enable verbose debug logging |

dollama app

Launch the system tray application with a web dashboard for configuration and control. This is the default when running dollama with no arguments.

One-command tool integration

dollama launch claude

Start the local proxy, set ANTHROPIC_BASE_URL, and launch Claude Code automatically. The fastest way to get started.
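Under the hood, launch claude is roughly equivalent to exporting two variables and starting Claude Code yourself. A sketch (the key value dollama-proxy comes from the environment-variables reference later on this page; the commented lines require dollama and Claude Code to be installed):

```shell
# Manual equivalent of `dollama launch claude` (sketch):
export ANTHROPIC_BASE_URL=http://localhost:11435   # route Claude Code through the local proxy
export ANTHROPIC_API_KEY=dollama-proxy             # placeholder key consumed by the proxy
# dollama connect &   # the proxy must be running first
# claude              # then launch Claude Code as usual
```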

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| -p, --port | int | 11435 | Local proxy listen port |
| --model | string | (relay) | Model to use |
| --opus | bool | false | Map claude-opus requests to network model |
| --sonnet | bool | false | Map claude-sonnet requests to network model |
| --haiku | bool | false | Map claude-haiku requests to network model |

Config commands & file reference

dollama config

Show current configuration with all fields, values, and defaults.

dollama config path

Print the config file path (typically ~/.dollama/config.toml).

dollama config reference

Print a template with all config fields documented.

Config file reference

All settings live in ~/.dollama/config.toml. CLI flags override config values.

Core
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| relay_url | string | https://api.dollama.net | Relay server URL |
| ollama_url | string | http://localhost:11434 | Local Ollama API URL |
| model | string | qwen3.5:9b | Default model for inference and serving |
| listen_port | int | 11435 | Proxy listen port (connect mode) |
| dashboard_port | int | 11436 | Web dashboard port |
| mode | string | "" | Active mode: connect, serve, both, or "" (idle) |

Auth
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| relay_token | string | "" | API token (auto-generated on first run) |
| relay_node_id | string | "" | Node UUID from registration |
| relay_secret | string | "" | Node secret from registration |
| user_id | string | "" | Stable user ID from relay |
| github_login | string | "" | GitHub username (set via dollama login) |

Model Aliases
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| opus_model | string | "" | Model for claude-opus requests |
| sonnet_model | string | "" | Model for claude-sonnet requests |
| haiku_model | string | "" | Model for claude-haiku requests |

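For example, to route all three Claude tiers to one network model (the model name here is just the documented default, purely illustrative):

```toml
# In ~/.dollama/config.toml — map every Claude tier to a single model
opus_model   = "qwen3.5:9b"
sonnet_model = "qwen3.5:9b"
haiku_model  = "qwen3.5:9b"
```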
Behavior
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| auto_update | bool | true | Auto-update when new versions are available |
| battery_mode | string | pause_serve | On battery: pause_serve, keep_running, or stop_all |
| serve_num_ctx | int | 0 | Override Ollama num_ctx per request (0 = model default) |
| language | string | (system) | Response language (e.g. "English") |

Optimization
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| optimize_enabled | bool | true | Master switch for context optimization |
| optimize_tools_enabled | bool | true | Enable tool filtering |
| optimize_tools_strip_descriptions | bool | true | Strip tool descriptions |
| optimize_tools_simplify_schemas | bool | true | Simplify tool schemas |
| optimize_tools_max_tools | int | 0 | Max tools to keep (0 = unlimited) |
| optimize_context_enabled | bool | true | Enable context truncation |
| optimize_context_max_tokens | int | 6000 | Max context tokens |
| optimize_context_truncate_tool_results | int | 4000 | Truncate tool result tokens |
| optimize_context_prune_stale | bool | true | Prune stale messages |
| optimize_format_enabled | bool | true | Enable format optimization (markdown, whitespace, etc.) |

Terms
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| terms_accepted_use | bool | false | Accepted terms for connect (use) mode |
| terms_accepted_give | bool | false | Accepted terms for serve (give) mode |
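Putting a few of these keys together, a minimal config file might look like the following sketch (values are the documented defaults from the tables above, with mode set to both as an example):

```toml
# ~/.dollama/config.toml (excerpt)
relay_url      = "https://api.dollama.net"
ollama_url     = "http://localhost:11434"
model          = "qwen3.5:9b"
listen_port    = 11435
dashboard_port = 11436
mode           = "both"        # connect, serve, both, or "" (idle)
auto_update    = true
battery_mode   = "pause_serve" # pause_serve, keep_running, or stop_all
```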

Login, logout, identity

dollama login

Authenticate with GitHub using the Device Authorization Flow. Opens a browser to github.com/login/device where you enter a one-time code. Links your GitHub account to your dollama identity.

dollama logout

Clear your authentication token. Optionally revokes the token server-side via POST /v1/auth/token/revoke.

dollama whoami

Display the currently authenticated GitHub account and user ID.

Network health

dollama status

Display the current state of the network including online nodes, capacity, and supported models. Calls GET /v1/status under the hood.

Diagnose, test, trace

dollama debug trace

Send a minimal test request and trace its lifecycle through the network step-by-step. Useful for diagnosing why requests fail silently.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |
| -V, --verbose | bool | false | Show raw SSE frames |

dollama debug smoke

Post-deploy smoke test: deep health check, node count, trace request, network health. Exits 0 on success, 1 on failure.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |
| --timeout | duration | 60s | Overall smoke test timeout |

dollama debug simulate-node

Simulate fake compute nodes that connect via WebSocket and complete tasks with synthetic responses. No real Ollama needed.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --count | int | 1 | Number of simulated nodes |
| --delay | duration | 500ms | Simulated inference delay |
| --tokens | int | 20 | Simulated output token count |
| --fail-rate | float | 0 | Fraction of requests that error (0.0-1.0) |
| --timeout-rate | float | 0 | Fraction of requests that timeout (0.0-1.0) |
| --model | string | (relay) | Model to register as |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |

dollama debug load-test

Send concurrent test requests to measure end-to-end performance including routing, queue wait, and streaming latency.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --concurrent | int | 5 | Number of parallel requests |
| --count | int | 20 | Total requests to send |
| --timeout | duration | 120s | Per-request timeout |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |

dollama debug terrarium

Run full-stack integration tests (relay terrarium + CLI E2E). Spins up an in-process test server and exercises the complete data path.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --scenarios | strings | all | Scenarios: happy-path, concurrent, node-failures, node-timeouts, node-disconnect, zero-nodes, burst-drain, sse-passthrough, idle-timeout, or all |
| --suite | string | all | Test suite: relay, e2e, or all |
| --replay-case | strings | | Replay fixture case(s) for capture-replay tests |
| --verbose | bool | false | Pass -v to go test |
| --short | bool | false | Skip slow tests (idle-timeout) |
| --timeout | duration | 5m | Overall test timeout |

dollama debug tools

Test whether a model supports function calling (tool use) by sending a minimal tool-calling request to local Ollama.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (config) | Model to test |
| -V, --verbose | bool | false | Show detailed response content |

dollama debug compress

Apply aggressive context compression to a captured Claude Code request JSON. Useful for analyzing compression effectiveness.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --stats | bool | false | Print size report only, no JSON output |
| -o, --output | string | "" | Write compressed JSON to file |
| --model | string | "" | Override model name |

Measure your hardware

dollama benchmark

Run a benchmark suite against your local Ollama instance to measure TPS, max context length, and concurrency limits. Results saved to ~/.dollama/benchmark.json.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (config) | Model to benchmark |
| --max-context | int | 0 | Max context length (0 = default 65536) |
| --conversation | bool | false | Run conversation degradation benchmark |
| --turns | int | 0 | Number of conversation turns (default 20) |

dollama eval

Run agent workload evaluation against your local Ollama. Tests real-world coding scenarios to measure model capability.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (config) | Model to evaluate |
| --verbose | bool | false | Show detailed output |
| --timeout | duration | | Per-task timeout |
| --matrix | bool | false | Run comparison matrix |
| --strategy | string | | Evaluation strategy |
| --capture | bool | false | Capture request/response for replay |
| --session | string | | Session name for grouping results |

Utilities

dollama reset-active

Reset the stuck active request counter. Useful if the counter gets out of sync after a crash. Calls POST /v1/ledger/reset-active.

dollama install-app

Create a macOS .app bundle for the system tray application.

dollama uninstall

Remove dollama binary and config directory.

Environment variables

| Variable | Description |
| --- | --- |
| ANTHROPIC_BASE_URL | Override the Anthropic API base URL. Set to http://localhost:11435 to route through dollama. |
| ANTHROPIC_API_KEY | Anthropic API key. Set to dollama-proxy when using the local proxy. |
| ANTHROPIC_MODEL | Override the model name sent to the API. |
| DOLLAMA_DEBUG | Enable debug logging (set to any value). |
| DOLLAMA_DEBUG_UNSAFE | Enable unsafe debug features (set to "1"). |
| DOLLAMA_CAPTURE | Enable request capture mode for analysis. |
| DOLLAMA_MITIGATIONS | Enable mitigations (default: enabled; set to "0" to disable). |
| HOME | Home directory; config lives at $HOME/.dollama/config.toml. |
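For example, to get maximum diagnostics out of a failing run, the debug variables can be combined like this sketch (the actual dollama invocation is commented out since it requires an install; pair with dollama debug trace):

```shell
export DOLLAMA_DEBUG=1        # debug logging (any value enables it)
export DOLLAMA_CAPTURE=1      # capture requests for later analysis
export DOLLAMA_MITIGATIONS=0  # disable mitigations while isolating a bug
# dollama connect --verbose   # then reproduce the failure
```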

Where things live

  • ~/.dollama/config.toml: main configuration file (all settings)
  • ~/.dollama/benchmark.json: benchmark results from dollama benchmark
  • ~/.dollama/eval.json: evaluation results from dollama eval