Clone
A persistent user model that sits between humans and AI agents — predicting what the user would type so agents can talk to Clone instead of interrupting the human.
What is Clone?
Today every prompt requires the human in the loop. Clone breaks that loop: agents address a calibrated user model — your Clone — and the human is only paged when Clone isn't confident.
The system is three layers stacked under one account:
- Recording: every event the user generates (computer-use frames, terminal turns, agent prompts and responses, integration webhooks) lands here as an idempotent `CloneEvent`.
- Memory: the Recording stream is distilled into a four-tier store made up of a singleton `UserProfile`, atomic `SemanticMemory` facts, time-bounded `EpisodicMemory` summaries, and `RawMemory` rows. Promotion between tiers is LLM-driven and runs against the user's own data.
- Prediction: given an agent's prompt, Clone assembles a context bundle from Memory, calls Anthropic with a prediction-shaped system prompt, and returns top-K candidate replies with calibrated `confidence`. The server marks the result `auto` if the top candidate clears the caller's `threshold`, otherwise `escalated`.
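The `auto` / `escalated` rule from the Prediction layer can be sketched in a few lines. This is an illustration only, not the server's code: the `Candidate` class and `decide` function are hypothetical names, and treating "clears" as greater-than-or-equal is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape for one predicted reply."""
    text: str
    confidence: float  # calibrated confidence in [0, 1]


def decide(candidates: list[Candidate], threshold: float) -> str:
    """Return "auto" when the best candidate clears the threshold, else "escalated"."""
    if not candidates:
        # Nothing to auto-send, so hand the prompt back to the human.
        return "escalated"
    top = max(candidates, key=lambda c: c.confidence)
    # Assumption: "clears" means >= threshold.
    return "auto" if top.confidence >= threshold else "escalated"


ranked = [Candidate("Run the linter next.", 0.86), Candidate("Open a PR.", 0.41)]
status = decide(ranked, threshold=0.8)
```

Because the caller supplies `threshold`, the same candidates can be auto-sent by one client and escalated by a stricter one.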
The endpoint shape is deliberately small. There is exactly one headline call, `POST /api/predictions/predict/`, and the rest of the surface is supporting CRUD. See Architecture for the full picture.
Quick Links
| Page | What you'll find |
|---|---|
| 🚀 Quickstart | Get your first prediction in 5 minutes |
| 🏗️ Architecture | The Recording → Memory → Prediction layer model |
| 🔌 MCP Server | Drop Clone into Claude Code, Cursor, Claude Desktop, or any MCP-aware client |
| 📚 API Reference | REST endpoints, auth, error codes |
| 🧬 Schema | `CloneEvent` JSON Schema and validation rules |
| 🛠️ Self-hosting | docker-compose, DNS, TLS, Smithery publish |
| 🧑‍💻 Development | Local setup, testing, contributing |
| ❓ FAQ & Troubleshooting | DNS NXDOMAIN, schema validation, prediction 504s, cold-start confidence |
Key features
- Calibrated, not bolted-on. Predictions ship with a numeric `confidence` (post-Platt scaling) plus the original `raw_confidence` and an `auto`/`escalated` decision driven by the caller's `threshold`. The daily `fit_calibration` cron retrains the per-user sigmoid on actual `accepted`/`rejected`/`edited` outcomes; behavioral decay (atomic, flip-aware) updates fact importance on every feedback. Clients build automation (auto-respond when confident) or autocomplete (rank suggestions) on the same primitive.
- One identity boundary, many surfaces. A single Django service (`apps/server`) owns Recording, Memory, Prediction, Voice, Sources, and Access. Clients (`apps/web`, `apps/desktop`, `apps/mcp`, `apps/cli`) are thin views; no business state lives outside the server.
- Idempotent ingest. The Recording layer is keyed by per-event `id`. Re-posting an event with an existing id is a no-op, so producers retry freely.
- One schema, two languages. `packages/schema/events.ts` (TypeScript) and `packages/schema/events.schema.json` (Python validator) describe the same `CloneEvent` shape and move together.
- MCP-native. The MCP server (`apps/mcp`) exposes 7 tools (`predict_next_prompt`, `predict_continuation`, `submit_feedback`, `start_session`, `stop_session`, `record_agent_prompt`, `record_agent_response`) over both stdio (single-tenant local installs) and Streamable HTTP (multi-tenant production at `clone.is/mcp`).
- Self-hostable end-to-end. One `docker-compose.yml` brings up Postgres, the Django API, the MCP server, and the nginx-fronted web container that also serves these docs at `/docs/`.
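The idempotent-ingest property in the list above is easy to picture with an in-memory stand-in. The sketch below is not the Recording layer itself: `RecordingStore` and the `kind` / `payload` fields are hypothetical, and only the per-event `id` key comes from this page.

```python
class RecordingStore:
    """Toy in-memory store keyed by event id (a stand-in for the real database)."""

    def __init__(self) -> None:
        self._events: dict[str, dict] = {}

    def ingest(self, event: dict) -> bool:
        """Store the event; return True if it was new, False if the id was already seen."""
        event_id = event["id"]
        if event_id in self._events:
            # Re-posting an existing id is a no-op: the first copy is kept untouched.
            return False
        self._events[event_id] = event
        return True

    def __len__(self) -> int:
        return len(self._events)


store = RecordingStore()
event = {"id": "evt-001", "kind": "terminal_turn", "payload": {"text": "pytest -q"}}
first = store.ingest(event)    # stored
second = store.ingest(event)   # retry after a timeout, ignored
```

A producer that is unsure whether a request landed can therefore resend the same event without creating a duplicate.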
Trying Clone in 30 seconds
```bash
# Issue an API key from the Clone dashboard, then:
export CLONE_API_TOKEN="clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export CLONE_API_URL="https://api.clone.is"
curl -sS -X POST "$CLONE_API_URL/api/predictions/predict/" \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"agent":"Claude Code","agent_input":"Test finished. What next?","k":3,"threshold":0.8}'
```
The response carries `predicted_response`, ranked candidates, calibrated `confidence`, and `status` (`auto` or `escalated`). Cold-start predictions land in the 0.3–0.5 band; confidence rises as Memory accumulates real history.
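The calibrated `confidence` is described on this page as post-Platt scaling with a per-user sigmoid. As a rough illustration, Platt scaling fits two parameters so that `calibrated = sigmoid(a * raw + b)`. The sketch below fits them by plain gradient descent on log loss. The function names, learning rate, step count, sample data, and the mapping of accepted to 1 and rejected to 0 are all assumptions; how `edited` outcomes are weighted is not stated here, so they are left out. It is not the `fit_calibration` job itself.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(raw: list[float], labels: list[float],
              lr: float = 0.5, steps: int = 2000) -> tuple[float, float]:
    """Fit (a, b) so that sigmoid(a * raw + b) approximates the observed outcomes."""
    a, b = 1.0, 0.0
    n = len(raw)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(raw, labels):
            err = sigmoid(a * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(raw_confidence: float, a: float, b: float) -> float:
    """Map a raw model score to a calibrated probability."""
    return sigmoid(a * raw_confidence + b)


# Hypothetical feedback history: raw scores and whether the user accepted (1) or rejected (0).
raw_scores = [0.2, 0.3, 0.4, 0.6, 0.7, 0.9]
outcomes = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
a, b = fit_platt(raw_scores, outcomes)
calibrated_high = calibrate(0.9, a, b)
calibrated_low = calibrate(0.2, a, b)
```

With little or no feedback there is not much to fit, which is one plausible reading of why cold-start confidence stays in a cautious middle band.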
The full walkthrough — including how to seed Recording data, install the MCP server in Claude Code, and call the API from any language — lives in the Quickstart.
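For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The URL, header name, and body fields are copied from the curl example above; the helper name `build_predict_request` and its defaults are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_predict_request(api_url: str, api_key: str, agent: str, agent_input: str,
                          k: int = 3, threshold: float = 0.8) -> urllib.request.Request:
    """Build (but do not send) a POST to the prediction endpoint."""
    body = json.dumps({
        "agent": agent,
        "agent_input": agent_input,
        "k": k,
        "threshold": threshold,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/api/predictions/predict/",
        data=body,
        headers={"X-Clone-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request(
    "https://api.clone.is",
    "clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",  # placeholder, replace with a real key
    agent="Claude Code",
    agent_input="Test finished. What next?",
)

# To send it (needs a valid key and network access; the timeout is an arbitrary choice):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     result = json.loads(resp.read())
#     print(result["status"], result["predicted_response"])
```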
Where Clone lives
| Surface | Path | Role |
|---|---|---|
| Web: marketing + dashboard | `apps/web` | Edit Memory, view analytics, manage API keys |
| Server: Django + DRF + Anthropic | `apps/server` | Recording, Memory, Prediction, Voice, Access |
| MCP: `@modelcontextprotocol/sdk` facade | `apps/mcp` | Predict-next-prompt for any MCP-aware client |
| Desktop: Python + ocap recorder | `apps/desktop` | Capture events, stream MCAP/MKV append-slabs to the server |
| CLI: Python + prompt_toolkit | `apps/cli` | Terminal client and recording producer |
Cross-stack contracts live in `packages/schema` (the single source of truth), and brand assets live in `packages/design`.
For LLMs and coding agents
Machine-readable entry points to this documentation, generated fresh on every deploy:
- `/docs/llms.txt`: curated index of every page with short descriptions. Safe to load wholesale into an LLM context.
- `/docs/llms-full.txt`: every page concatenated into a single markdown stream for one-shot ingestion.
If your agent is calling Clone over MCP rather than over REST, see the MCP Server reference, which is the canonical guide for `predict_next_prompt`, `submit_feedback`, and the session and recording helpers.