Clone

A persistent user model that sits between humans and AI agents — predicting what the user would type so agents can talk to Clone instead of interrupting the human.

What is Clone?

Today every prompt requires the human in the loop. Clone breaks that loop: agents address a calibrated user model — your Clone — and the human is only paged when Clone isn't confident.

The system is three layers stacked under one account:

  1. Recording — every event the user generates (computer-use frames, terminal turns, agent prompts and responses, integration webhooks) lands here as an idempotent CloneEvent.
  2. Memory — the Recording stream is distilled into a four-tier store: a singleton UserProfile, atomic SemanticMemory facts, time-bounded EpisodicMemory summaries, and RawMemory rows. Promotion between tiers is LLM-driven and runs against the user's own data.
  3. Prediction — given an agent's prompt, Clone assembles a context bundle from Memory, calls Anthropic with a prediction-shaped system prompt, and returns top-K candidate replies with calibrated confidence. The server marks the result auto if the top candidate clears the caller's threshold, otherwise escalated.
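
The auto/escalated decision in step 3 reduces to a threshold check over the candidate list. A minimal sketch, where the Candidate shape and decide function are illustrative assumptions rather than the server's internals:

```python
# Sketch of the auto/escalated decision in the Prediction layer.
# Candidate and decide are illustrative assumptions, not Clone's code.
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    confidence: float  # calibrated, in [0, 1]

def decide(candidates: list[Candidate], threshold: float) -> tuple[str, Candidate]:
    """Return ("auto", top) if the best candidate clears the caller's
    threshold, otherwise ("escalated", top) so the human gets paged."""
    top = max(candidates, key=lambda c: c.confidence)
    status = "auto" if top.confidence >= threshold else "escalated"
    return status, top

status, top = decide(
    [Candidate("Run the deploy", 0.91), Candidate("Re-run the tests", 0.42)],
    threshold=0.8,
)
# status == "auto": 0.91 clears the 0.8 threshold
```

Raising the caller's threshold to 0.95 in the same call would flip the status to escalated; the primitive stays the same, only the caller's risk tolerance changes.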

The endpoint shape is deliberately small. There is exactly one headline call — POST /api/predictions/predict/ — and the rest of the surface is supporting CRUD. See Architecture for the full picture.

  • 🚀 Quickstart — Get your first prediction in 5 minutes
  • 🏗️ Architecture — The Recording → Memory → Prediction layer model
  • 🔌 MCP Server — Drop Clone into Claude Code, Cursor, Claude Desktop, or any MCP-aware client
  • 📚 API Reference — REST endpoints, auth, error codes
  • 🧬 Schema — CloneEvent JSON Schema and validation rules
  • 🛠️ Self-hosting — docker-compose, DNS, TLS, Smithery publish
  • 🧑‍💻 Development — Local setup, testing, contributing
  • FAQ & Troubleshooting — DNS NXDOMAIN, schema validation, prediction 504s, cold-start confidence

Key features

  • Calibrated, not bolted-on. Predictions ship with a numeric confidence (post-Platt scaling) plus the original raw_confidence and an auto/escalated decision driven by the caller's threshold. The daily fit_calibration cron retrains the per-user sigmoid on actual accepted / rejected / edited outcomes; behavioral decay (atomic, flip-aware) updates fact importance on every feedback. Clients build automation (auto-respond when confident) or autocomplete (rank suggestions) on the same primitive.
  • One identity boundary, many surfaces. A single Django service (apps/server) owns Recording, Memory, Prediction, Voice, Sources, and Access. Clients — apps/web, apps/desktop, apps/mcp, apps/cli — are thin views; no business state lives outside the server.
  • Idempotent ingest. The Recording layer is keyed by per-event id. Re-posting an event with an existing id is a no-op, so producers retry freely.
  • One schema, two languages. packages/schema/events.ts (TypeScript) and packages/schema/events.schema.json (Python validator) describe the same CloneEvent shape and move together.
  • MCP-native. The MCP server (apps/mcp) exposes 7 tools — predict_next_prompt, predict_continuation, submit_feedback, start_session, stop_session, record_agent_prompt, record_agent_response — over both stdio (single-tenant local installs) and Streamable HTTP (multi-tenant production at clone.is/mcp).
  • Self-hostable end-to-end. One docker-compose.yml brings up Postgres, the Django API, the MCP server, and the nginx-fronted web container that also serves these docs at /docs/.
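
Platt scaling, mentioned in the first bullet, amounts to fitting a per-user sigmoid that maps raw model confidence onto the empirical probability of acceptance. A toy sketch of what a fit_calibration-style job could do, assuming plain gradient descent on log loss (none of this is Clone's actual code):

```python
# Toy Platt-scaling fit: learn sigmoid(a * raw + b) from feedback outcomes.
# The training loop and parameter names are assumptions for illustration.
import math

def fit_platt(raw_scores, accepted, epochs=500, lr=0.1):
    """Fit a, b by gradient descent on log loss so that calibrated
    confidence tracks the observed accept rate."""
    a, b = 1.0, 0.0
    n = len(raw_scores)
    for _ in range(epochs):
        ga = gb = 0.0
        for s, y in zip(raw_scores, accepted):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            ga += (p - y) * s / n   # d(loss)/da
            gb += (p - y) / n       # d(loss)/db
        a -= lr * ga
        b -= lr * gb
    return a, b

def calibrate(raw, a, b):
    """Map a raw confidence onto the calibrated [0, 1] scale."""
    return 1.0 / (1.0 + math.exp(-(a * raw + b)))

# Toy history: high raw scores were accepted, low ones rejected.
raws   = [0.9, 0.8, 0.85, 0.3, 0.2, 0.4]
labels = [1,   1,   1,    0,   0,   0]
a, b = fit_platt(raws, labels)
```

After the fit, calibrate() squeezes over- or under-confident raw scores toward the user's actual accept rate, which is why the API returns both confidence and raw_confidence.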

Trying Clone in 30 seconds

# Issue an API key from the Clone dashboard, then:
export CLONE_API_TOKEN="clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export CLONE_API_URL="https://api.clone.is"

curl -sS -X POST "$CLONE_API_URL/api/predictions/predict/" \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"agent":"Claude Code","agent_input":"Test finished. What next?","k":3,"threshold":0.8}'

The response carries predicted_response, ranked candidates, calibrated confidence, and status (auto or escalated). Cold-start predictions land in the 0.3–0.5 band; confidence rises as Memory accumulates real history.
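
The same call from Python's standard library, acting on the documented status field. The shape of each candidates entry ({"text": …}) is an assumption for illustration, not the documented schema:

```python
# Stdlib-only client for the headline endpoint. The candidates item
# shape ({"text": ...}) is an assumption; status, predicted_response,
# and candidates are the documented top-level fields.
import json
import os
import urllib.request

def build_predict_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the same POST the curl example sends."""
    return urllib.request.Request(
        base_url + "/api/predictions/predict/",
        data=json.dumps({
            "agent": "Claude Code",
            "agent_input": "Test finished. What next?",
            "k": 3,
            "threshold": 0.8,
        }).encode(),
        headers={
            "X-Clone-API-Key": token,
            "Content-Type": "application/json",
        },
    )

def handle(body: dict) -> str:
    """Use the answer when status is auto; otherwise page the human."""
    if body["status"] == "auto":
        return body["predicted_response"]
    return "escalate: " + body["candidates"][0]["text"]

if __name__ == "__main__":
    req = build_predict_request(os.environ["CLONE_API_URL"],
                                os.environ["CLONE_API_TOKEN"])
    with urllib.request.urlopen(req) as resp:
        print(handle(json.load(resp)))
```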

The full walkthrough — including how to seed Recording data, install the MCP server in Claude Code, and call the API from any language — lives in the Quickstart.

Where Clone lives

Surface | Path | Role
Web — marketing + dashboard | apps/web | Edit Memory, view analytics, manage API keys
Server — Django + DRF + Anthropic | apps/server | Recording, Memory, Prediction, Voice, Access
MCP — @modelcontextprotocol/sdk facade | apps/mcp | Predict-next-prompt for any MCP-aware client
Desktop — Python + ocap recorder | apps/desktop | Capture events, stream MCAP/MKV append-slabs to the server
CLI — Python + prompt_toolkit | apps/cli | Terminal client and recording producer

Cross-stack contracts live in packages/schema (the SSOT) and brand assets in packages/design.
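
Producers can mirror that contract with a cheap pre-flight check before POSTing. A stdlib-only sketch in which the required-field set is a stand-in for the real list in packages/schema/events.schema.json:

```python
# Pre-flight check for a CloneEvent before POSTing to the Recording layer.
# REQUIRED is a stand-in; the authoritative contract lives in
# packages/schema/events.schema.json and is enforced server-side.
REQUIRED = ("id", "type", "timestamp")

def preflight(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event looks POSTable."""
    problems = [f"missing field: {k}" for k in REQUIRED if k not in event]
    if "id" in event and not isinstance(event["id"], str):
        problems.append("id must be a string (it is the idempotency key)")
    return problems

assert preflight({"id": "evt_1", "type": "agent_prompt",
                  "timestamp": "2025-01-01T00:00:00Z"}) == []
```

Because ingest is keyed on id, a producer that passes this check can retry the same POST on any failure without risking duplicates.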

For LLMs and coding agents

Machine-readable entry points to this documentation, generated fresh on every deploy:

  • /docs/llms.txt — curated index of every page with short descriptions. Safe to load wholesale into an LLM context.
  • /docs/llms-full.txt — every page concatenated into a single markdown stream for one-shot ingestion.

If your agent is calling Clone over MCP rather than over REST, see MCP Server reference — it's the canonical guide for predict_next_prompt, submit_feedback, and the session / recording helpers.