
CLI Reference

Every command, flag, and config option for the dollama CLI.

The three ways to run dollama

dollama connect

Start as a local proxy that forwards LLM requests to the network. Your coding tools talk to localhost:11435 and the proxy handles routing, auth, and streaming.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| -p, --port | int | 11435 | Local proxy listen port |
| --model | string | (relay) | Model to route requests to |
| --no-tray | bool | false | Run without system tray (headless) |
| --verbose | bool | false | Enable verbose debug logging |

dollama serve

Start as a compute contributor, sharing your local Ollama instance with the network. Connects to the relay via WebSocket and processes inference requests.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (relay) | Model to advertise |
| --no-tray | bool | false | Run without system tray (headless) |
| --verbose | bool | false | Enable verbose debug logging |
| --max-context | int | 0 | Max context length (0 = default 65536) |

dollama both

Combined mode: runs both the local proxy and the compute contributor simultaneously. Use the network and share your idle compute.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| -p, --port | int | 11435 | Local proxy listen port |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (relay) | Model to advertise |
| --no-tray | bool | false | Run without system tray (headless) |
| --verbose | bool | false | Enable verbose debug logging |

dollama app

Launch the system tray application with a web dashboard for configuration and control. This is the default when running dollama with no arguments.

One-command tool integration

dollama launch claude

Start the local proxy, set ANTHROPIC_BASE_URL, and launch Claude Code automatically. The fastest way to get started.
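Under the hood, launch claude is roughly equivalent to exporting two variables and starting Claude Code yourself. A sketch (the key value dollama-proxy comes from the environment-variables reference later on this page; the commented lines require dollama and Claude Code to be installed):

```shell
# Manual equivalent of `dollama launch claude` (sketch):
export ANTHROPIC_BASE_URL=http://localhost:11435   # route Claude Code through the local proxy
export ANTHROPIC_API_KEY=dollama-proxy             # placeholder key consumed by the proxy
# dollama connect &   # the proxy must be running first
# claude              # then launch Claude Code as usual
```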

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| -p, --port | int | 11435 | Local proxy listen port |
| --model | string | (relay) | Model to use |
| --opus | bool | false | Map claude-opus requests to network model |
| --sonnet | bool | false | Map claude-sonnet requests to network model |
| --haiku | bool | false | Map claude-haiku requests to network model |

Config commands & file reference

dollama config

Show current configuration with all fields, values, and defaults.

dollama config path

Print the config file path (typically ~/.dollama/config.toml).

dollama config reference

Print a template with all config fields documented.

Config file reference

All settings live in ~/.dollama/config.toml. CLI flags override config values.

Core
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| relay_url | string | https://api.dollama.net | Relay server URL |
| ollama_url | string | http://localhost:11434 | Local Ollama API URL |
| model | string | qwen3.5:9b | Default model for inference and serving |
| listen_port | int | 11435 | Proxy listen port (connect mode) |
| dashboard_port | int | 11436 | Web dashboard port |
| mode | string | "" | Active mode: connect, serve, both, or "" (idle) |

Auth
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| relay_token | string | "" | API token (auto-generated on first run) |
| relay_node_id | string | "" | Node UUID from registration |
| relay_secret | string | "" | Node secret from registration |
| user_id | string | "" | Stable user ID from relay |
| github_login | string | "" | GitHub username (set via dollama login) |

Model Aliases
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| opus_model | string | "" | Model for claude-opus requests |
| sonnet_model | string | "" | Model for claude-sonnet requests |
| haiku_model | string | "" | Model for claude-haiku requests |

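For example, to route all three Claude tiers to one network model (the model name here is just the documented default, purely illustrative):

```toml
# In ~/.dollama/config.toml — map every Claude tier to a single model
opus_model   = "qwen3.5:9b"
sonnet_model = "qwen3.5:9b"
haiku_model  = "qwen3.5:9b"
```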
Behavior
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| auto_update | bool | true | Auto-update when new versions are available |
| battery_mode | string | pause_serve | On battery: pause_serve, keep_running, or stop_all |
| serve_num_ctx | int | 0 | Override Ollama num_ctx per request (0 = model default) |
| language | string | (system) | Response language (e.g. "English") |

Optimization
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| optimize_enabled | bool | true | Master switch for context optimization |
| optimize_tools_enabled | bool | true | Enable tool filtering |
| optimize_tools_strip_descriptions | bool | true | Strip tool descriptions |
| optimize_tools_simplify_schemas | bool | true | Simplify tool schemas |
| optimize_tools_max_tools | int | 0 | Max tools to keep (0 = unlimited) |
| optimize_context_enabled | bool | true | Enable context truncation |
| optimize_context_max_tokens | int | 6000 | Max context tokens |
| optimize_context_truncate_tool_results | int | 4000 | Truncate tool result tokens |
| optimize_context_prune_stale | bool | true | Prune stale messages |
| optimize_format_enabled | bool | true | Enable format optimization (markdown, whitespace, etc.) |

Terms
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| terms_accepted_use | bool | false | Accepted terms for connect (use) mode |
| terms_accepted_give | bool | false | Accepted terms for serve (give) mode |
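Putting a few of these keys together, a minimal config file might look like the following sketch (values are the documented defaults from the tables above, with mode set to both as an example):

```toml
# ~/.dollama/config.toml (excerpt)
relay_url      = "https://api.dollama.net"
ollama_url     = "http://localhost:11434"
model          = "qwen3.5:9b"
listen_port    = 11435
dashboard_port = 11436
mode           = "both"        # connect, serve, both, or "" (idle)
auto_update    = true
battery_mode   = "pause_serve" # pause_serve, keep_running, or stop_all
```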

Login, logout, identity

dollama login

Authenticate with GitHub using the Device Authorization Flow. Opens a browser to github.com/login/device where you enter a one-time code. Links your GitHub account to your dollama identity.

dollama logout

Clear your authentication token. Optionally revokes the token server-side via POST /v1/auth/token/revoke.

dollama whoami

Display the currently authenticated GitHub account and user ID.

Network health

dollama status

Display the current state of the network including online nodes, capacity, and supported models. Calls GET /v1/status under the hood.

Diagnose, test, trace

dollama debug trace

Send a minimal test request and trace its lifecycle through the network step-by-step. Useful for diagnosing why requests fail silently.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |
| -V, --verbose | bool | false | Show raw SSE frames |

dollama debug smoke

Post-deploy smoke test: deep health check, node count, trace request, network health. Exits 0 on success, 1 on failure.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |
| --timeout | duration | 60s | Overall smoke test timeout |

dollama debug simulate-node

Simulate fake compute nodes that connect via WebSocket and complete tasks with synthetic responses. No real Ollama needed.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --count | int | 1 | Number of simulated nodes |
| --delay | duration | 500ms | Simulated inference delay |
| --tokens | int | 20 | Simulated output token count |
| --fail-rate | float | 0 | Fraction of requests that error (0.0-1.0) |
| --timeout-rate | float | 0 | Fraction of requests that timeout (0.0-1.0) |
| --model | string | (relay) | Model to register as |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |

dollama debug load-test

Send concurrent test requests to measure end-to-end performance including routing, queue wait, and streaming latency.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --concurrent | int | 5 | Number of parallel requests |
| --count | int | 20 | Total requests to send |
| --timeout | duration | 120s | Per-request timeout |
| --relay-url | string | (config) | Relay URL |
| --token | string | (config) | Authentication token |

dollama debug terrarium

Run full-stack integration tests (relay terrarium + CLI E2E). Spins up an in-process test server and exercises the complete data path.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --scenarios | strings | all | Scenarios: happy-path, concurrent, node-failures, node-timeouts, node-disconnect, zero-nodes, burst-drain, sse-passthrough, idle-timeout, or all |
| --suite | string | all | Test suite: relay, e2e, or all |
| --replay-case | strings | | Replay fixture case(s) for capture-replay tests |
| --verbose | bool | false | Pass -v to go test |
| --short | bool | false | Skip slow tests (idle-timeout) |
| --timeout | duration | 5m | Overall test timeout |

dollama debug tools

Test whether a model supports function calling (tool use) by sending a minimal tool-calling request to local Ollama.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (config) | Model to test |
| -V, --verbose | bool | false | Show detailed response content |

dollama debug compress

Apply aggressive context compression to a captured Claude Code request JSON. Useful for analyzing compression effectiveness.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --stats | bool | false | Print size report only, no JSON output |
| -o, --output | string | "" | Write compressed JSON to file |
| --model | string | "" | Override model name |

Measure your hardware

dollama benchmark

Run a benchmark suite against your local Ollama instance to measure TPS, max context length, and concurrency limits. Results saved to ~/.dollama/benchmark.json.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (config) | Model to benchmark |
| --max-context | int | 0 | Max context length (0 = default 65536) |
| --conversation | bool | false | Run conversation degradation benchmark |
| --turns | int | 0 | Number of conversation turns (default 20) |

dollama eval

Run agent workload evaluation against your local Ollama. Tests real-world coding scenarios to measure model capability.

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --ollama-url | string | http://localhost:11434 | Local Ollama API URL |
| --model | string | (config) | Model to evaluate |
| --verbose | bool | false | Show detailed output |
| --timeout | duration | | Per-task timeout |
| --matrix | bool | false | Run comparison matrix |
| --strategy | string | | Evaluation strategy |
| --capture | bool | false | Capture request/response for replay |
| --session | string | | Session name for grouping results |

Utilities

dollama reset-active

Reset the stuck active request counter. Useful if the counter gets out of sync after a crash. Calls POST /v1/ledger/reset-active.

dollama install-app

Create a macOS .app bundle for the system tray application.

dollama uninstall

Remove dollama binary and config directory.

Environment variables

| Variable | Description |
| --- | --- |
| ANTHROPIC_BASE_URL | Override the Anthropic API base URL. Set to http://localhost:11435 to route through dollama. |
| ANTHROPIC_API_KEY | Anthropic API key. Set to dollama-proxy when using the local proxy. |
| ANTHROPIC_MODEL | Override the model name sent to the API. |
| DOLLAMA_DEBUG | Enable debug logging (set to any value). |
| DOLLAMA_DEBUG_UNSAFE | Enable unsafe debug features (set to "1"). |
| DOLLAMA_CAPTURE | Enable request capture mode for analysis. |
| DOLLAMA_MITIGATIONS | Enable mitigations (default: enabled; set to "0" to disable). |
| HOME | Home directory; config lives at $HOME/.dollama/config.toml. |
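For example, to get maximum diagnostics out of a failing run, the debug variables can be combined like this sketch (the actual dollama invocation is commented out since it requires an install; pair with dollama debug trace):

```shell
export DOLLAMA_DEBUG=1        # debug logging (any value enables it)
export DOLLAMA_CAPTURE=1      # capture requests for later analysis
export DOLLAMA_MITIGATIONS=0  # disable mitigations while isolating a bug
# dollama connect --verbose   # then reproduce the failure
```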

Where things live

  • ~/.dollama/config.toml: main configuration file (all settings)
  • ~/.dollama/benchmark.json: benchmark results from dollama benchmark
  • ~/.dollama/eval.json: evaluation results from dollama eval