# Clone

> A persistent user model that predicts what you will type

This file contains all documentation content in a single document following the llmstxt.org standard.

## Authentication

Every Clone endpoint requires `IsAuthenticated`. Two credential shapes are accepted, both detected automatically by the server.

## Credential shapes

| Shape | Header | Lifetime | Used by |
|---|---|---|---|
| JWT | `Authorization: Bearer <access>` | 60 minutes (refreshable) | Web, Desktop, browser dashboards. |
| API key | `X-Clone-API-Key: clone_…` | Long-lived (rotate manually) | CLI, MCP server (stdio), CI, scripts. |

The MCP server (`apps/mcp/src/index.ts`) chooses the right header automatically based on the token shape — values starting with `clone_` go in `X-Clone-API-Key`; anything else is sent as `Authorization: Bearer …`.

## Endpoints (`/api/auth/`)

| Method | Path | Purpose |
|---|---|---|
| POST | `/api/auth/signup/` | Create a new account. |
| POST | `/api/auth/login/` | Log in with email + password. Returns `{ access, refresh, user }`. |
| POST | `/api/auth/logout/` | Invalidate the current refresh token. |
| POST | `/api/auth/token/refresh/` | Exchange a refresh token for a fresh access token. |
| GET | `/api/auth/me/` | Return the authenticated user. |
| POST | `/api/auth/invite/validate/` | Validate an invite code before signup. |
| POST | `/api/auth/waitlist/` | Add an email to the public waitlist. |
| GET / POST | `/api/auth/keys/` | List the user's API keys / issue a new one. |
| GET / DELETE | `/api/auth/keys/<id>/` | Inspect or revoke a specific API key. |

## JWT flow

```bash
# 1. Login.
curl -sS -X POST https://api.clone.is/api/auth/login/ \
  -H "Content-Type: application/json" \
  -d '{"email":"you@example.com","password":"…"}'
# → { "access": "<access>", "refresh": "<refresh>", "user": {...} }

# 2. Use the access token. (60-min lifetime.)
curl -sS https://api.clone.is/api/auth/me/ \
  -H "Authorization: Bearer <access>"

# 3. When access expires, exchange the refresh.
curl -sS -X POST https://api.clone.is/api/auth/token/refresh/ \
  -H "Content-Type: application/json" \
  -d '{"refresh":"<refresh>"}'
# → { "access": "<access>" }
```

Browser surfaces (Web, Desktop) refresh access tokens on a 50-minute timer so the user never sees a forced re-login. Servers and headless scripts should use API keys instead.

## API-key flow

```bash
# 1. Issue a key.
curl -sS -X POST https://api.clone.is/api/auth/keys/ \
  -H "Authorization: Bearer <access>" \
  -H "Content-Type: application/json" \
  -d '{"label":"my-laptop"}'
# → { "id": "<id>", "key": "clone_xxxxxxxxxxxxxx", "label": "my-laptop", "created_at": "..." }
#
# IMPORTANT: the raw key is only returned once. Copy it now or revoke and re-issue.

# 2. Use it.
curl -sS https://api.clone.is/api/auth/me/ \
  -H "X-Clone-API-Key: clone_xxxxxxxxxxxxxx"

# 3. Revoke it later.
curl -sS -X DELETE https://api.clone.is/api/auth/keys/<id>/ \
  -H "Authorization: Bearer <access>"
```

## Multi-tenant MCP

When the MCP server runs in `MCP_TRANSPORT=http` (production at `clone.is/mcp`), each MCP session is bound to whatever credential the client provided on its `initialize` request. One MCP instance therefore serves many users — the server forwards the per-session token upstream rather than holding a shared one. See [Apps → MCP](../apps/mcp).

## What goes wrong

| Symptom | Cause |
|---|---|
| `401 Unauthorized` | Missing header, expired access token, revoked API key. |
| `401` immediately after token issue | Clock skew on the producing host (JWT `iat`/`exp` rely on accurate time). |
| Endpoint works in browser but `401` from MCP | Browser session uses cookies; MCP needs an explicit `X-Clone-API-Key` or `Authorization: Bearer` header. |

---

## Memory

`/api/memory/*` exposes Clone's hierarchical Memory layer: four tiers (`profile → facts → episodes → raw`) plus the headline `/context/` endpoint that bundles them for the Prediction layer. All endpoints are scoped to `request.user`.
## Endpoints

| Method | Path | Purpose |
|---|---|---|
| GET / POST | `/api/memory/raw/` | List or append `RawMemory` rows. |
| GET / DELETE | `/api/memory/raw/<id>/` | Single row. |
| GET / POST | `/api/memory/episodes/` | List or append `EpisodicMemory` rows. |
| GET / PATCH / DELETE | `/api/memory/episodes/<id>/` | Single episode. |
| GET / POST | `/api/memory/facts/` | List or append `SemanticMemory` facts. |
| GET / PATCH / DELETE | `/api/memory/facts/<id>/` | Single fact. |
| GET / POST / DELETE | `/api/memory/profile/` | Read, upsert, or delete the singleton `UserProfile`. |
| POST | `/api/memory/context/` | Assemble the layered bundle the Prediction layer consumes. |
| POST | `/api/memory/sync/` | Derive `RawMemory` rows from existing `RecordingEvent` rows. Idempotent. |
| GET | `/api/memory/stats/` | Counts and last-update timestamps per tier. |
| POST | `/api/memory/promote/episodes/` | Cluster recent un-summarized raw rows into episode drafts (LLM). |
| POST | `/api/memory/promote/facts/` | Distill fact rows from recent episodes (LLM). |

## The headline call — `POST /context/`

```bash
curl -sS -X POST https://api.clone.is/api/memory/context/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "recency_minutes": 60,
    "max_facts": 20,
    "max_episodes": 10,
    "max_raw": 50
  }'
```

Body fields (all optional):

| Field | Default | Type / range | Notes |
|---|---|---|---|
| `goal` | — | string | Substring filter on fact `text`. |
| `tags` | — | string[] | Facts must contain at least one tag. |
| `recency_minutes` | `60` | 1–10080 | Window for episodes and raw items. |
| `max_facts` | `20` | 1–100 | Cap on facts returned. |
| `max_episodes` | `10` | 1–50 | Cap on episodes returned. |
| `max_raw` | `50` | 1–500 | Cap on raw items returned. |

Returns `{ profile, facts, episodes, raw, meta }`. The Prediction layer calls this in-process, so client-issued requests get the same bundle the LLM would see.
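The ranges above can be enforced client-side before the call. A minimal Python sketch — `clamp` and `build_context_body` are our illustrative helper names, not part of the Clone API:

```python
# Illustrative helpers (not part of the Clone API): build a /context/ body
# and clamp each numeric field to its documented range.

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def build_context_body(goal=None, tags=None, recency_minutes=60,
                       max_facts=20, max_episodes=10, max_raw=50):
    body = {
        "recency_minutes": clamp(recency_minutes, 1, 10080),
        "max_facts": clamp(max_facts, 1, 100),
        "max_episodes": clamp(max_episodes, 1, 50),
        "max_raw": clamp(max_raw, 1, 500),
    }
    if goal is not None:
        body["goal"] = goal          # substring filter on fact text
    if tags:
        body["tags"] = list(tags)    # facts must contain at least one tag
    return body
```

POST the result as JSON with either credential header; the bundle comes back as `{ profile, facts, episodes, raw, meta }`.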
## Promotion pipeline

`POST /api/memory/promote/episodes/` clusters recent un-summarized raw rows into draft episodes via Anthropic. `POST /api/memory/promote/facts/` distills atomic facts from recent episodes. Both accept `since`, `limit`, `model` (overrides the default `claude-sonnet-4-6`), and `force` (bypass the "already promoted" filter).

Errors map the same way the Prediction layer's do — `503` on a missing/invalid Anthropic key, `429` on rate limit, `502` on an upstream non-2xx.

## Stats

```bash
curl -sS https://api.clone.is/api/memory/stats/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN"
```

Returns counts and last-update timestamps per tier; takes an optional `?since=` to scope to a window. Used by the Web dashboard's Memory page.

---

## API Reference

Clone exposes one REST surface, served at `https://api.clone.is` in production and at `http://localhost:8001` in development. Every endpoint lives under one of these prefixes:

| Prefix | What it owns | Page |
|---|---|---|
| `/api/auth/` | Signup, login, JWT refresh, API keys, invite, waitlist, `me/`. | [Authentication](./authentication) |
| `/api/recording/` | Idempotent event ingest; sessions and event detail. | [Recording](./recording) |
| `/api/memory/` | Profile, semantic facts, episodes, raw memory; promotion pipeline; the headline `/context/` bundle. | [Memory](./memory) |
| `/api/predictions/` | Predict, batch-predict, feedback, list, stats. | [Predictions](./predictions) |
| `/api/access/` | Per-account members, invite/approve/decline. | _Reference page TBD_ |
| `/api/voice/` | ElevenLabs voice synthesis. | _Reference page TBD_ |
| `/api/sources/` | Source connectors (per-integration adapter scaffolding). | _Reference page TBD_ |
| `/api/news/` | Curated news feed for the marketing site. | _Reference page TBD_ |
| `/api/monitoring/` | Read-only health and stats endpoints. | _Reference page TBD_ |
| `/api/jobs/` | Background-job records (one row per LLM call etc.). | _Reference page TBD_ |
| `/admin/` | Django admin (staff only). | _N/A_ |

Authentication, Recording, Memory, and Predictions have detailed reference pages linked above. The remaining surfaces are live but not yet documented in detail here.

## Conventions

- **Auth.** Every endpoint requires `IsAuthenticated`. See [Authentication](./authentication) for the two credential shapes (`Authorization: Bearer …` JWT or `X-Clone-API-Key: clone_…` API key).
- **Content type.** Request and response bodies are JSON. Empty responses use `204 No Content`.
- **Datetimes.** All datetimes are ISO-8601 in UTC. Query params like `since=` and `until=` accept the same format.
- **Pagination.** List endpoints use `limit` (default and max documented per endpoint) and `offset`. There is no cursor pagination today.
- **Error shape.** Errors return `{ "error": "<message>" }` with the matching HTTP status; per-item batch errors are returned alongside the success counts (e.g. `recording/events/`'s `invalid` list).

## Status codes you'll see

| Code | When |
|---|---|
| `200 OK` | Read endpoints. |
| `201 Created` | Successful POST that persisted a new row (recording ingest, prediction, fact). |
| `204 No Content` | Successful DELETE. |
| `400 Bad Request` | Validation error in the request body. |
| `401 Unauthorized` | Missing or invalid credential. |
| `404 Not Found` | Row does not exist or does not belong to `request.user`. |
| `429 Too Many Requests` | Anthropic rate-limited the prediction call. |
| `502 Bad Gateway` | Anthropic returned a non-2xx upstream. |
| `503 Service Unavailable` | LLM key missing, Anthropic unreachable, or invalid Anthropic key. |

---

## Predictions

`/api/predictions/*` is Clone's prediction surface. The headline call (`POST /predict/`) takes an agent prompt, assembles a Memory context bundle in-process, calls Anthropic with the prediction system prompt, and returns top-K candidates with calibrated confidence and an `auto`/`escalated` decision.
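How calibrated confidence turns into that decision can be sketched in a few lines of Python. This is a hedged sketch — `calibrate` and `decide_status` are illustrative names; the authoritative logic lives in `apps/server/predictions/`:

```python
import math

def calibrate(raw, params=None):
    """Platt scaling sigma(a*x + b); identity pass-through when no fit exists."""
    if params is None:
        return raw                      # identity calibration: raw == calibrated
    a, b = params
    return 1.0 / (1.0 + math.exp(-(a * raw + b)))

def decide_status(candidates, threshold=0.8, params=None):
    """'auto' when the calibrated top candidate clears the caller's threshold."""
    top = calibrate(candidates[0]["confidence"], params)
    return "auto" if top >= threshold else "escalated"
```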
## Endpoints

| Method | Path | Purpose |
|---|---|---|
| POST | `/api/predictions/predict/` | Predict the user's reply for a single agent prompt. |
| POST | `/api/predictions/predict/batch/` | Predict for up to 100 prompts in one call (shares one Memory bundle). |
| POST | `/api/predictions/<id>/feedback/` | Record the actual reply + final status (`accepted` / `edited` / `rejected`). |
| GET | `/api/predictions/<id>/` | Fetch one prediction with full payload (candidates, usage, context snapshot). |
| GET | `/api/predictions/` | List recent predictions. Filter by `status`, `agent`, `session_id`, `since`, `until`. Paginate with `limit` (default 50, max 200) + `offset`. |
| GET | `/api/predictions/stats/` | Aggregate `automation_rate` + `precision`. |

## Predict — request

```bash
curl -sS -X POST https://api.clone.is/api/predictions/predict/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "Claude Code",
    "agent_input": "Test finished. What next?",
    "k": 3,
    "threshold": 0.8,
    "recency_minutes": 60,
    "session_id": "demo-1"
  }'
```

| Field | Type | Required | Default | Range |
|---|---|---|---|---|
| `agent` | string | yes | — | — |
| `agent_input` | string | yes | — | — |
| `k` | integer | no | `1` | 1–`MAX_K` |
| `threshold` | number | no | `0.8` | 0–1 |
| `model` | string | no | `claude-sonnet-4-6` | One of `SUPPORTED_MODELS` |
| `recency_minutes` | integer | no | `60` | 1–10080 |
| `session_id` | string | no | — | — |

## Predict — response

```json
{
  "id": "<id>",
  "predicted_response": "...",
  "confidence": 0.42,
  "reasoning": "...",
  "candidates": [
    {"response": "...", "confidence": 0.42, "reasoning": "..."},
    ...
  ],
  "k": 3,
  "status": "escalated",
  "threshold": 0.8,
  "model": "claude-sonnet-4-6",
  "latency_ms": 4827,
  "usage": { "input_tokens": 1234, "output_tokens": 56 }
}
```

`status` is decided server-side: `auto` when `candidates[0].confidence ≥ threshold`, otherwise `escalated`. On `auto`, the server also sets `acted_at` so the prediction is treated as having been honored.

## Feedback

```bash
curl -sS -X POST https://api.clone.is/api/predictions/<id>/feedback/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status":"edited","actual_response":"…what the user actually typed…"}'
```

`status` must be one of `accepted`, `edited`, `rejected`:

- `accepted`: the user used Clone's prediction as-is. `actual_response` defaults to the prediction.
- `edited`: the user typed something different. `actual_response` is **required**.
- `rejected`: the user discarded the prediction without a replacement. `actual_response` is optional.

## Stats

```json
{
  "total": 1234,
  "by_status": { "pending": 0, "auto": 700, "escalated": 200, "accepted": 100, "edited": 50, "rejected": 184 },
  "precision": 0.84,        // (auto + accepted) / answered
  "automation_rate": 0.84   // answered / total
}
```

Optional `?since=` and `?until=` ISO-8601 query params restrict the window — useful for week-over-week dashboards. `precision` and `automation_rate` are `null` when their denominator is zero, rather than dividing by zero.

---

## Recording

`/api/recording/*` is the ingest surface for Clone's Recording layer. Producers push `CloneEvent` rows here; the server validates them against `packages/schema/events.schema.json`, persists them under a `RecordingSession` keyed by `session_id`, and is fully idempotent on the per-event `id`.

## Endpoints

| Method | Path | Purpose |
|---|---|---|
| POST | `/api/recording/events/` | Batch ingest. Body is a JSON array (1–1000 items). |
| GET | `/api/recording/events/<id>/` | Fetch a single event with full payload. |
| GET | `/api/recording/sessions/` | List the requester's 50 most recent sessions. |
| GET | `/api/recording/sessions/<session_id>/events/` | List events for a session, oldest first. `?limit=` (default 500, max 2000). |

## Ingest semantics

- **Idempotent on `id`.** Re-posting the same `id` returns a `duplicates` count; the row is unchanged. Producers can retry freely.
- **Per-item validation.** Each event in the batch is validated independently. The endpoint returns `accepted`, `duplicates`, and an `invalid: [{ index, errors }]` array; one bad row does not fail the batch.
- **Session ownership.** The first event for a given `session_id` creates the `RecordingSession` and locks it to `request.user`. Subsequent events from a different user with the same id are rejected as `session belongs to another user`.
- **Session bookends.** A `session.started` event re-anchors the session's `started_at`, `source`, and `source_detail` even if it arrives out of order; a later `session.stopped` event sets `ended_at` if the new timestamp is later than the existing one.

## Example

```bash
curl -sS -X POST https://api.clone.is/api/recording/events/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "id": "sess-1-start",
      "session_id": "demo-1",
      "occurred_at": "2026-05-05T12:00:00Z",
      "source": "cli",
      "type": "session.started"
    },
    {
      "id": "sess-1-prompt",
      "session_id": "demo-1",
      "occurred_at": "2026-05-05T12:00:05Z",
      "source": "agent",
      "source_detail": "claude-code",
      "type": "agent.prompt",
      "agent": "Claude Code",
      "prompt": "Run the test suite."
    }
  ]'
```

## Response shape

```json
{ "accepted": 2, "duplicates": 0, "invalid": [] }
```

`HTTP 201` if any event was newly persisted; `HTTP 200` otherwise. Per-item failures look like:

```json
{ "index": 0, "errors": ["'source' is a required property"] }
```

The first three schema errors per row are surfaced; the rest are truncated. The full validation rule set lives in [Schema](../schema).

---

## CLI

# CLI (`apps/cli`)

`apps/cli` is the terminal-first client for Clone. It targets two use cases: a recording producer (so the Recording layer learns how the user actually works in a shell), and a small toolbox of read-only commands against the API for the moments when opening the web dashboard is overkill.
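Because ingest is idempotent on `id`, a producer like the CLI can re-post a whole batch after a network failure without any bookkeeping. A hedged sketch — `make_event` and `post_with_retry` are our illustrative names, not the CLI's actual code:

```python
import uuid
from datetime import datetime, timezone

def make_event(session_id, event_type, source="cli", **fields):
    """Build a CloneEvent-shaped dict; the id stays stable across retries."""
    event = {
        "id": str(uuid.uuid4()),
        "session_id": session_id,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "type": event_type,
    }
    event.update(fields)
    return event

def post_with_retry(post, events, attempts=3):
    """post(events) -> response dict. Re-posting a delivered batch only bumps `duplicates`."""
    last_error = None
    for _ in range(attempts):
        try:
            return post(events)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError("ingest failed after retries") from last_error
```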
## Stack

- Node (`apps/cli` ships its own `npm/` directory for npm publishing)
- TypeScript

## Run it locally

```bash
cd apps/cli
npm install
npm run build
node dist/index.js --help
```

---

## Desktop

# Desktop (`apps/desktop`)

The Clone Desktop tray app. Its job is to capture computer-use events (screen, keyboard, mouse, app focus, browser URL) into MCAP/MKV files on disk and stream them to the Recording layer over a slab-append upload protocol. It is **recorder + uploader only** — the AI pipelines that previously lived on the desktop (digest summarizer, suggestion popup, on-device LLM) moved off the device.

## Stack

- Python 3.11+ (3.13 in the production Nuitka build env)
- Vendored `ocap` CLI (GStreamer + hardware H.265) for capture
- `pystray` (Windows) / `rumps` (macOS) tray
- `customtkinter` Settings window
- `httpx` + `keyring` (DPAPI / Keychain) for auth + uploads
- `truststore` for corporate MITM-proxy compatibility
- Nuitka standalone build → Inno Setup (Windows) / signed + notarized DMG (macOS)

## Surface split

| Surface | Concern |
|---|---|
| **Desktop** | Record + upload. Tray-only — no main window. Tail recording into the Clone server with crash-recovery semantics. |
| **Web** | Browse, edit, analyze, manage. Memory + Predictions UIs, account settings, billing. |

The desktop client deliberately holds **no business state** beyond the upload queue (`stream_uploads` in SQLite) and the OS-keychain refresh token; everything else is the server's view.
## Run it locally

```bash
cd apps/desktop
git submodule update --init --recursive
uv venv --python 3.13 && uv pip install -e ".[windows]"   # or ".[macos]"
uv run python -m desktop          # tray loads
uv run python -m pytest tests/    # unit + loopback
```

## Build a frozen binary

```bash
# Windows
scripts\build_nuitka.bat
# → dist\Clone Desktop\CloneDesktop.exe
# → dist\CloneDesktop-Setup-0.1.0.exe

# macOS
bash scripts/build_nuitka.sh
# → dist/Clone Desktop.app + DMG
#   (signed + notarized when a Developer ID is in the keychain)
```

## Frozen-build self-tests

```bash
"CloneDesktop.exe" --selftest=keyring_backend    # OS keychain round-trip
"CloneDesktop.exe" --selftest=truststore_inject  # MITM-proxy CA path
"CloneDesktop.exe" --selftest=upload_loopback    # 5 MB byte-equal smoke
"CloneDesktop.exe" --selftest=gst_init           # Nuitka×D3D11 bisect tool
```

---

## MCP server

# MCP server (`apps/mcp`)

Clone — the user model that sits between you and your AI agents — exposed as an [MCP](https://modelcontextprotocol.io) server. Drop it into Claude Code, Claude Desktop, Cursor, or any MCP-aware client, and your agents start talking to your Clone instead of you whenever Clone is confident enough.

The package is intentionally a 1:1 facade over `apps/server`. **No business logic lives here**; every threshold, prompt template, and memory shape is owned by the Django side.

## Tools

| Tool | Purpose |
|---|---|
| `predict_next_prompt` | Top-K ranked candidate replies with calibrated probabilities. Clients implement automation (auto-respond when `confidence ≥ threshold`) or autocomplete (rank suggestions) on top of this. |
| `predict_continuation` | Personalized loop-termination decision for ralph-style self-correcting agents — returns `should_continue: bool` with calibrated confidence so the plugin can stop iterating when the user would already be satisfied. |
| `submit_feedback` | Close the prediction loop — `accepted` / `edited` / `rejected` — so Platt calibration and fact decay learn. |
| `start_session` / `stop_session` | Open / close a recording session; `start_session` returns the `session_id` to thread through later calls. |
| `record_agent_prompt` / `record_agent_response` | Push `agent.prompt` / `agent.response` events. UUID + timestamp are auto-generated; the prompt's `event_id` pairs with the response's `in_response_to`. |

Tool-by-tool detail with payload examples lives in the [MCP Server reference](../mcp-reference/overview).

## Transports

| Mode | When | Auth |
|---|---|---|
| `stdio` (default) | Local install (Claude Code, Cursor, Claude Desktop). | Single-tenant — `CLONE_API_TOKEN` is required at process start. |
| `http` | Public deployment behind `https://clone.is/mcp`. | Multi-tenant — each MCP session is bound to the JWT carried in `Authorization: Bearer …` on its `initialize` request. |

`apps/mcp/src/index.ts` picks the mode from `MCP_TRANSPORT` (default `stdio`). HTTP mode falls back to `CLONE_API_TOKEN` if a per-request bearer is missing, so the same image can also serve a single-tenant deployment.

## Token shapes

`apps/mcp/src/index.ts` chooses the auth header from the token shape:

- Token starts with `clone_` → `X-Clone-API-Key: clone_…` (long-lived API key).
- Anything else → `Authorization: Bearer <token>` (60-min JWT).

## Install — Claude Code (stdio)

```bash
claude mcp add clone \
  -e CLONE_API_URL=https://api.clone.is \
  -e CLONE_API_TOKEN=clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  -- npx -y @clone/mcp
```

## Install — Smithery (HTTP, hosted at `clone.is/mcp`)

```bash
npx -y @smithery/cli mcp publish "https://clone.is/mcp" -n cloneisyou/clone
```

End users supply their own `CLONE_API_TOKEN` in Smithery's config UI; that token is forwarded as `Authorization: Bearer …` on every request, so one MCP instance serves many users.
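The header-selection rule above is small enough to sketch in full. In Python for clarity (the authoritative implementation is the TypeScript in `apps/mcp/src/index.ts`):

```python
def auth_header(token: str) -> dict:
    """Pick the auth header from the token shape, as the MCP server does."""
    if token.startswith("clone_"):
        return {"X-Clone-API-Key": token}          # long-lived API key
    return {"Authorization": f"Bearer {token}"}    # 60-minute JWT
```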
## Run it locally

```bash
cd apps/mcp
npm install

# stdio mode (matches the Claude Code launch)
export CLONE_API_URL=http://localhost:8001
export CLONE_API_TOKEN=<token>
npm run dev

# http mode (matches production behind clone.is/mcp)
MCP_TRANSPORT=http PORT=3000 npm run dev
# → http://127.0.0.1:3000/mcp + /health
```

Tests: `npm test` runs 16 unit tests across the api / server / http transports. End-to-end smoke tests (real Anthropic call) are at `npm run test:e2e` (stdio) and `npm run test:e2e:http` (against a deployed URL).

---

## Apps

Every Clone surface lives under `apps/`. They share the same JWT/API-key identity boundary, the same event schema (`packages/schema`), and the same backend (`apps/server`). Surface-specific behavior — read-only-plus-live-actions on Desktop vs. edit-plus-analytics on Web, terminal-shaped UX on the CLI, MCP-native tooling in `apps/mcp` — is the only differentiator.

## At a glance

| App | Stack | Role |
|---|---|---|
| [Web](./web) | Vite + React 19 + Tailwind v4 | Marketing site, dashboard, memory editing, analytics. |
| [Server](./server) | Django 6 + DRF + simplejwt + Anthropic SDK | The single backend; owns Recording, Memory, Prediction, Voice, Access. |
| [MCP](./mcp) | Node + `@modelcontextprotocol/sdk` | MCP-server facade over the Server's Recording/Memory/Prediction layers. |
| [CLI](./cli) | Node | Terminal client and recording producer. |
| [Desktop](./desktop) | Electron + electron-vite + React 19 | Capture + live actions; read-only views into Memory. |

## Boundaries

- **Identity** — every authenticated request carries either a 60-minute simplejwt access token or a `clone_…` API key. Refresh runs on a 50-minute timer in client surfaces; servers never refresh on the user's behalf.
- **Schema** — `packages/schema/events.ts` (TypeScript) and `packages/schema/events.schema.json` (Python validator) describe the same `CloneEvent` wire shape. Add a new event in both files at once.
- **State** — no client surface persists business state of its own. Anything that should outlive a session goes to the Server via the REST endpoints documented under [API Reference](../api-reference/overview).

---

## Server

# Server (`apps/server`)

The single Django service that hosts every business decision. Client surfaces are thin; this is where the real logic, the database, and the LLM calls live.

## Django apps

| App | Purpose | URL prefix |
|---|---|---|
| `recording` | Idempotent ingest of `CloneEvent` rows; sessions and event detail. | `/api/recording/` |
| `memories` | Profile, semantic facts, episodes, raw memory; promotion pipeline. | `/api/memory/` |
| `predictions` | Predict, batch-predict, feedback, list, stats. | `/api/predictions/` |
| `accounts` | Signup, login, JWT refresh, invite/waitlist, API keys. | `/api/auth/` |
| `access` | Per-account member list, invite/approve/decline. | `/api/access/` |
| `voice` | ElevenLabs voice synthesis endpoints. | `/api/voice/` |
| `sources` | Source connectors (per-integration adapter scaffolding). | `/api/sources/` |
| `news` | Curated news feed for the marketing site. | `/api/news/` |
| `monitoring` | Read-only health and stats endpoints. | `/api/monitoring/` |
| `jobs` | Background-job records (one row per LLM call etc.). | `/api/jobs/` |
| `config` | Project URL conf. | _N/A_ |

## Stack

- Django 6 + Django REST Framework
- `djangorestframework-simplejwt` for JWT (60-min access, longer-lived refresh)
- Postgres 16 in production (`DATABASE_URL`), SQLite in development
- `anthropic` Python SDK for prediction and memory promotion
- `jsonschema` for validating `CloneEvent` payloads against `packages/schema/events.schema.json`
- `uv` for dependency management (`pyproject.toml` + `uv.lock`)

## Run it

```bash
cd apps/server
uv sync
uv run python manage.py migrate
uv run python manage.py runserver 0.0.0.0:8001
uv run python manage.py test   # required before merging
```

## Authentication

Every endpoint above requires `IsAuthenticated`. Two credentials are accepted:

- **JWT** — `Authorization: Bearer <access>` (60-min lifetime; refresh via `POST /api/auth/token/refresh/`).
- **API key** — `X-Clone-API-Key: clone_…` (long-lived; rotate via `POST /api/auth/keys/`).

Per-user data is filtered server-side on `request.user` — clients cannot see other users' rows even with a valid token. See [Authentication](../api-reference/authentication).

## Anthropic integration

Prediction (`predictions/llm.py`) and memory promotion (`memories/promotion.py`) are the only paths that call Anthropic. Both share the same `SUPPORTED_MODELS` allow-list and the same DRF error mapping (`APIKeyMissing → 503`, `RateLimitError → 429`, `AuthenticationError → 503`, `APIConnectionError → 503`, `APIStatusError → 502`). The default model is `claude-sonnet-4-6`.

---

## Web

# Web (`apps/web`)

`apps/web` is a Vite + React 19 SPA that does double duty as the marketing site (`Home`, `Product`, `Tech`, `Careers`, `News`, `About`, `Waitlist`) and the authenticated dashboard (`Console` and its sub-views: `Chat`, `Episodic Memories`, `Semantic Memories`, `Sources`, `Monitoring`, `Access Control`). The same nginx container that ships the SPA also reverse-proxies `/api`, `/admin`, `/static`, `/mcp`, and `/docs` to the right upstream — so users only ever touch one origin.

## Stack

- React 19 + TypeScript strict
- react-router 7 (`BrowserRouter`, `Routes`, `Route`, `Link`, `useNavigate`)
- Tailwind v4 with the new optimizer (`@tailwindcss/vite`)
- Vite 8 (alias `@clone/design` → `../../packages/design`; dev `/api` proxy → `localhost:8001`)
- `geist` font

## What lives here

- **Marketing routes** (`Home`, `Product`, `Tech`, `Careers`, `News`, `About`, `Waitlist`) — public, no auth, optimized for first paint and SEO.
- **Console routes** (`/console/...`) — gated by `useAuth()`; every fetch sends `Authorization: Bearer <access>`.
- **Auth flow** — the JWT access token + refresh token persist to `localStorage` under key `clone_auth`; the AuthContext refreshes on a 50-minute timer.
- **No global state** — `useAuth()` + per-page `useState` is the established pattern; do not introduce Redux / Zustand here.

## Build and run

```bash
cd apps/web
npm install
npm run dev     # localhost:5173, hot reload
npm run build   # tsc -b && vite build → dist/
npm run lint    # eslint
```

The production Docker image is multi-stage: a Node builder runs `npm run build`, and the final stage is `nginx:alpine` serving `/app/dist` plus the routing rules in `nginx.conf`. See [Self-hosting → docker-compose](../self-hosting/docker-compose).

## Routing inside the SPA

All route definitions live in `src/main.tsx`; there is no `App.tsx` indirection. Marketing pages compose from `components/` (`Navbar`, `Footer`, `CloneStageSequence`, `ProductSwap`, `WaitlistForm`); dashboard pages live in `pages/` and own their own data fetching against `/api/...`.

> Surface contract: Web is the **edit-and-analyze** surface; Desktop is the **view-and-act** surface. Anything that requires text input, search, or longer reads goes here. Live actions (record, transcribe, hand off to an agent) belong on Desktop.

---

## Architecture

Clone is a single Django service (`apps/server`) split into three layers. Every layer is owned by one Django app, persists its own tables under the user boundary, and exposes a small REST surface. Client surfaces (`apps/web`, `apps/desktop`, `apps/cli`, `apps/mcp`) talk to these REST endpoints over JWT or API-key auth — they hold no business logic of their own.
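The three layers below compose as three hand-offs: record → distill → predict. A toy in-memory model of the first two hand-offs (illustrative only — the real layers are the Django apps named below):

```python
class Recording:
    """Append-only event log, idempotent on event id."""
    def __init__(self):
        self.events = {}

    def ingest(self, event):
        duplicate = event["id"] in self.events
        self.events.setdefault(event["id"], event)
        return {"accepted": 0 if duplicate else 1,
                "duplicates": 1 if duplicate else 0}

class Memory:
    """Derives a layered context bundle from the recording stream."""
    def __init__(self, recording):
        self.recording = recording

    def context(self, max_raw=50):
        raw = list(self.recording.events.values())[-max_raw:]
        return {"profile": None, "facts": [], "episodes": [], "raw": raw}
```

The real Prediction layer then hands the `context()` bundle to the LLM and thresholds the calibrated result.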
## The three layers

```
┌──────────────────────────────────────────┐
│ Prediction   apps/server/predictions/    │
│  POST /predict/   POST /predict/batch/   │
│  GET /<id>/   POST /<id>/feedback/       │
│  GET /stats/                             │
└────────────▲─────────────────────────────┘
             │ context bundle (in-process)
┌────────────┴─────────────────────────────┐
│ Memory       apps/server/memories/       │
│  profile · facts · episodes · raw        │
│  POST /context/   POST /sync/            │
│  POST /promote/episodes/  /facts/        │
└────────────▲─────────────────────────────┘
             │ derived from
┌────────────┴─────────────────────────────┐
│ Recording    apps/server/recording/      │
│  POST /events/   GET /sessions/          │
│  idempotent on event.id                  │
└──────────────────────────────────────────┘
```

### Recording

Every input that should ever influence Clone's behavior — a desktop frame, a terminal turn, an integration webhook, an agent prompt — lands here as a `CloneEvent` keyed by `id`. `apps/server/recording/views.py` validates the payload against `packages/schema/events.schema.json` (the canonical wire-shape definition) and persists rows under a `RecordingSession`. Re-posting an event with an existing `id` is a no-op, so producers can retry freely.

### Memory

`apps/server/memories/` distills the Recording stream into a four-tier store:

| Tier | Model | What it holds |
|---|---|---|
| Profile | `UserProfile` | A singleton "who is this person" body that reads like a system prompt. |
| Semantic facts | `SemanticMemory` | Atomic, reusable facts (`kind`, `text`, `importance`, `tags`). |
| Episodic | `EpisodicMemory` | Time-bounded summaries of related raw events. |
| Raw | `RawMemory` | One row per Recording event, normalized to a single sentence. |

Promotion (`raw → episodes → facts`) is LLM-driven and runs in `memories/promotion.py`. Reads cluster around `POST /api/memory/context/`, which assembles a single bundle the Prediction layer can hand to the LLM.

### Prediction

`apps/server/predictions/views.py` is the public face. `POST /api/predictions/predict/` takes the agent's prompt, calls Memory in-process for a context bundle, sends a prediction-shaped prompt to Anthropic, applies per-user **Platt scaling** to the LLM's raw confidence, and returns the result with calibrated probabilities. The `auto`/`escalated` decision is made server-side by comparing the calibrated top candidate against the caller's `threshold`.

Every call is persisted as a `Prediction` row (raw confidence kept for the fitter), so feedback (`POST /predictions/<id>/feedback/`) can later mark it `accepted`, `edited`, or `rejected` and feed both the calibration training set and the **behavioral decay** loop on cited facts. `/predict/batch/` shares one Memory bundle across up to 100 prompts; both endpoints return calibrated `confidence` plus `raw_confidence`.

### Calibration (the daily loop)

`apps/server/predictions/calibration.py` fits a sigmoid `σ(a·x + b)` per user via Newton–Raphson on the `(raw_confidence, label)` pairs from labeled `Prediction` rows — `accepted` = positive, `rejected` = negative, `edited` = half-positive. Pure Python, no numpy/scipy. The daily `fit_calibration` management command writes the result to `UserProfile.calibration` (a JSON column); the predict view reads it on every call. Recency cap: the 5000 most-recent labels per fit. Below `MIN_SAMPLES_FOR_FIT = 5`, the user gets identity calibration (raw == calibrated).

### Decay (the silent garbage collector)

`apps/server/memories/decay.py` adjusts `SemanticMemory.importance` whenever a prediction's outcome confirms or contradicts the facts it cited. The adjustment is asymmetric: `+0.04` per confirmation, `-0.08` per contradiction. The transition is atomic (`select_for_update` inside `transaction.atomic`) and **flip-aware** — flipping `accepted → rejected` reverses the prior `+0.04` and applies `-0.08`. Facts that decay to zero importance are not deleted (provenance is preserved); they sort to the bottom of every future context bundle.

## Lifecycle of one prediction

1. The client POSTs `{ agent, agent_input, threshold, k }` to `/api/predictions/predict/`.
2. `predict_view` calls `_build_context` to fetch the profile + the 20 most-important facts + the last 10 episodes + the last 50 raw rows for that user.
3. `predictions/llm.py` calls `client.messages.create(...)` with the assembled context and a system prompt that returns top-K candidates with reasoning and confidence.
4. The view writes a `Prediction` row, marks `status='auto'` if `top.confidence >= threshold` else `'escalated'`, and returns the JSON to the client.
5. (Optional) The client calls `/predictions/<id>/feedback/` later with the user's actual reply for evaluation.

## Cross-stack contracts

- **Schema**: `packages/schema/events.schema.json` is the SSOT. Both the TypeScript clients (`packages/schema/events.ts`) and the Django server (loaded once at import time in `recording/views.py`) validate against the same JSON Schema. See [Schema](./schema).
- **Auth**: every authenticated request carries either `Authorization: Bearer <access>` (60-min access token, refreshed every 50 min) or `X-Clone-API-Key: clone_…` (long-lived API key). Both are accepted by every endpoint listed above. See [Authentication](./api-reference/authentication).
- **Anthropic SDK**: the Prediction and Memory promotion paths are the only places the server talks to Anthropic. Both share the same `SUPPORTED_MODELS` allow-list and the same key-missing/`429`/`401` error mapping.

---

## Changelog

This page tracks documentation-noteworthy changes — new endpoints, breaking schema changes, deployment surface shifts, security posture. Detailed commit-by-commit history lives in the [GitHub log](https://github.com/cloneisyou/clone/commits/main).

## 2026-05-08

Security and reliability hardening sweep: 23 PRs across server / web / desktop / CLI / MCP, surface by surface.

**Security defaults**

- `DEBUG` defaults to `False`; `ALLOWED_HOSTS` defaults to `127.0.0.1,localhost`. `apps/server/entrypoint.sh` refuses to boot when `CLONE_ENV=production` AND `DEBUG=True`. ([#87](https://github.com/cloneisyou/clone/pull/87))
- `/api/voice/{tts,stt,clone}/` now require authentication (`@permission_classes([IsAuthenticated])`). Transcript text and voice ids are no longer printed to stdout. ([#86](https://github.com/cloneisyou/clone/pull/86))
- DRF throttling: login (10/min anon), `/predict/` + `/predict/batch/` (60/min user), `/voice/*` (30/min user). ([#106](https://github.com/cloneisyou/clone/pull/106))
- nginx serves HSTS, CSP (`script-src 'self'`, `frame-ancestors 'none'`), X-Frame-Options DENY, X-Content-Type-Options nosniff, and Referrer-Policy strict-origin-when-cross-origin; per-location `client_max_body_size` caps for `/api/recording/` (8m) and `/api/voice/` (16m). ([#105](https://github.com/cloneisyou/clone/pull/105))

**Calibration & decay**

- `batch_predict_view` now applies per-user Platt scaling, mirroring `predict_view`. Both endpoints return calibrated `confidence` plus `raw_confidence` (raw stays in `Prediction.confidence` for the daily fitter). ([#95](https://github.com/cloneisyou/clone/pull/95))
- `fit_for_user` caps to the 5000 most recent labeled rows and accepts a `since` cutoff; `fit_platt` honors per-sample weights. ([#99](https://github.com/cloneisyou/clone/pull/99))
- Behavioral decay is now atomic (`select_for_update` inside `transaction.atomic`) and **flip-aware** — flipping `accepted → rejected` reverses the prior `+0.04` and applies `-0.08`, instead of the old "second flip is a no-op" semantic. ([#98](https://github.com/cloneisyou/clone/pull/98))

**Auth**

- `ApiKeyAuthentication` falls through to JWT when a stale `X-Clone-API-Key` is presented alongside a valid `Authorization: Bearer …`. Bad-key-only requests still 401 with the explicit message.
([#97](https://github.com/cloneisyou/clone/pull/97)) - Web: `AuthContext.fetchWithAuth` injects the access token, refreshes once on 401 via `/api/auth/token/refresh/`, retries the original request. Single in-flight refresh shared across concurrent 401s. ([#107](https://github.com/cloneisyou/clone/pull/107)) - CLI: provider API keys now persisted via OS keychain (`keyring` package) on macOS / Windows / Linux Secret Service. `~/.clone/secrets.json` (mode 0600) used when keychain is unreachable; existing entries auto-migrated to keychain on first re-login. ([#108](https://github.com/cloneisyou/clone/pull/108)) **Schema & recording** - `EVENT_TYPE_CHOICES` extended with `input.keystroke`, `input.click`, `input.scroll` to mirror `packages/schema/events.schema.json` and the desktop pipeline. ([#96](https://github.com/cloneisyou/clone/pull/96)) - `RecordingSession` lookup is now `(id, user)`-scoped with explicit `IntegrityError` handling on cross-tenant collisions. ([#96](https://github.com/cloneisyou/clone/pull/96)) - `promote_memory` cron cohort = `RecordingEvent ∪ RawMemory` — users whose recent activity already lives in `RawMemory` (CLI push, backfill jobs) are no longer silently skipped. ([#90](https://github.com/cloneisyou/clone/pull/90)) **MCP surface** - 10 tools, up from 3: `predict_next_prompt`, `submit_feedback`, `list_predictions`, `get_prediction`, `get_user_context`, `start_session`, `stop_session`, `record_agent_prompt`, `record_agent_response`, `record_event`. ([#83](https://github.com/cloneisyou/clone/pull/83), [#84](https://github.com/cloneisyou/clone/pull/84), [#85](https://github.com/cloneisyou/clone/pull/85)) - `eventSchema.source` constrained to the canonical enum; helpers accept an optional `source` override (default `'agent'`). ([#104](https://github.com/cloneisyou/clone/pull/104)) - HTTP transport init race closed: `createServer` runs first, `connect` is wrapped in try/catch with explicit transport cleanup on failure. 
([#94](https://github.com/cloneisyou/clone/pull/94))

**Desktop**

- `captureFlags` (`screen` / `keyboard` / `mouse`) now plumbed through preload + IPC and gate the recording channels at the source. The Recording panel toggles are no longer UI theater. ([#93](https://github.com/cloneisyou/clone/pull/93))
- Session IDs include a CSPRNG suffix (`session__<6-hex>`); two starts in the same millisecond produce distinct files. ([#101](https://github.com/cloneisyou/clone/pull/101))
- `EventLog` flushes to disk every 50 appends and on graceful close, with `fsync` on the underlying fd. This bounds worst-case loss under SIGKILL or power loss to ~1–2 seconds. ([#102](https://github.com/cloneisyou/clone/pull/102))

**CLI**

- Local daemon hard-clamped to loopback; `CLONE_DAEMON_HOST=0.0.0.0` is coerced to `127.0.0.1` with a stderr warning. ([#91](https://github.com/cloneisyou/clone/pull/91))
- Single-flight spawn lock (`fcntl.flock` on POSIX) closes the race where two concurrent invocations both Popen a daemon. ([#103](https://github.com/cloneisyou/clone/pull/103))
- Recording log shows the **real** `session_id` instead of a freshly minted UUID. ([#92](https://github.com/cloneisyou/clone/pull/92))

**Web**

- `Console.tsx` default tab is `Semantic Memories` in production builds; the legacy mock `Chat` tab is dev-only. ([#88](https://github.com/cloneisyou/clone/pull/88))
- `SourcesView` no longer fakes "live" counters via `Math.random` every 5s. ([#89](https://github.com/cloneisyou/clone/pull/89))
- `Signup` URL-encodes the invite token; `AccessControlView` token uses `crypto.randomUUID`; Console search aborts stale fetches via `AbortController` and shows transport errors. ([#100](https://github.com/cloneisyou/clone/pull/100))

## 2026-05-05

- **Documentation site bootstrap.** This site (`apps/docs`) is live at `https://clone.is/docs`.
([commit `bc210ac`](https://github.com/cloneisyou/clone/commit/bc210ac))
- **`api.clone.is` accepted on nginx server_name.** ([commit `bc210ac`](https://github.com/cloneisyou/clone/commit/bc210ac))
- **Static MCP server card** at `/.well-known/mcp/server-card.json` for Smithery. ([commit `68f0679`](https://github.com/cloneisyou/clone/commit/68f0679))

## 2026-04-01 → 2026-05-04

Detailed history in `git log --oneline main`. Highlights:

- Recording / Memory / Prediction layers split into their own Django apps.
- MCP server (`apps/mcp`) shipped with stdio + Streamable HTTP transports.
- Web marketing site rebuilt on Vite + React 19 + Tailwind v4.

---

## Contributing

We follow [trunk-based development](https://trunkbaseddevelopment.com/). `main` is the single source of truth and is always deployable; everything else is a short-lived branch with a PR back into `main` within 2–3 days.

## Branch naming

```
<type>/<scope>-<description>
```

| Type | When |
|---|---|
| `feat` | New feature |
| `fix` | Bug fix |
| `chore` | Maintenance, config, tooling |
| `docs` | Documentation only |
| `refactor` | Code restructure without behavior change |

`<scope>` refers to the target app or package: `web`, `server`, `android`, `ios`, `desktop`, `schema`, `docs`, `cli`, `mcp`. For monorepo-wide changes use the closest applicable scope (e.g. `chore/repo-...`). Examples:

```
feat/web-landing-page
feat/server-prediction
fix/desktop-recording-crash
chore/schema-update
docs/site-bootstrap
```

## Commit messages

```
<type>: <description>
```

Imperative mood. No trailing punctuation. Examples:

```
feat: add landing page
fix: resolve router redirect issue
chore: update event schema
docs: bootstrap Docusaurus site at /docs
```

Do **not** add `Co-Authored-By: Claude` or any other AI co-author trailer to commits in this repo.

## Pull requests

- One logical change per PR. If a refactor and a feature ride together, split them.
- Fill in the PR template (`.github/PULL_REQUEST_TEMPLATE.md`) — Overview, Proposed Changes, Testing, Note to Reviewers.
- PR body is **English only**.
- Always assign `cloneisme`.
- One approval required to merge.

## Releases

| Surface | How |
|---|---|
| Web / Server / Docs | Auto-deploy on merge to `main`. |
| Desktop | Code-signed and notarized release; cadence varies. |
| Mobile (`android`, `ios`) | Cut a `release/` branch from `main`. SemVer tags (`v1.0.0`). |

## Reviewing

- Verify the PR's `Testing` section matches the diff. If a surface changed, the matching command should appear in the test plan.
- For schema changes, confirm `packages/schema/events.ts` and `packages/schema/events.schema.json` were both updated and that the relevant `apps/server/recording/tests.py` cases still pass.
- For UI changes, ask for a screenshot or short video — the PR template's `Testing` section is the place for it.

---

## Local setup

This page walks through bringing up Clone end-to-end on one machine. Each surface uses the package manager already in its directory — `npm` for the JS surfaces, `uv` for the Python server.

## 1. Clone

```bash
git clone https://github.com/cloneisyou/clone.git
cd clone
cp .env.example .env  # Fill in ANTHROPIC_API_KEY, ELEVENLABS_API_KEY, etc.
```

## 2. Server

```bash
cd apps/server
uv sync
uv run python manage.py migrate
uv run python manage.py createsuperuser  # optional
uv run python manage.py runserver 0.0.0.0:8001
```

The dev server uses SQLite by default (`db.sqlite3`). Production uses Postgres via `DATABASE_URL`.

## 3. Web

```bash
cd apps/web
npm install
npm run dev  # http://localhost:5173/
```

Vite proxies `/api` to `localhost:8001` (see `apps/web/vite.config.ts`); do not change that target without coordinating with the server side.

## 4. MCP

```bash
cd apps/mcp
npm install
# Either issue an API key from the running server, or grab a JWT from POST /api/auth/login/
export CLONE_API_URL=http://localhost:8001
export CLONE_API_TOKEN=
npm run dev                               # stdio (matches Claude Code launch)
# OR
MCP_TRANSPORT=http PORT=3000 npm run dev  # http (matches production)
```

To hook the local MCP into Claude Code (note the `-e` env flags belong to `claude mcp add` and must come before the `--` that introduces the command):

```bash
claude mcp add clone-dev \
  -e CLONE_API_URL=http://localhost:8001 \
  -e CLONE_API_TOKEN= \
  -- node $(pwd)/apps/mcp/dist/index.js
```

Run `npm run build` after editing MCP source so the dist bundle is fresh.

## 5. Desktop

```bash
cd apps/desktop
git submodule update --init --recursive
uv venv --python 3.13
uv pip install -e ".[windows]"  # or .[macos] depending on platform
uv run python -m desktop        # tray loads
uv run python -m pytest tests/  # unit + 5 MB byte-equal mock-server loopback
```

## 6. Docs (this site)

```bash
cd apps/docs
npm install
npm run dev    # http://localhost:3000/docs/
npm run build  # writes to apps/docs/build/
```

## End-to-end smoke test

With the server running on `localhost:8001` and an API key exported as `CLONE_API_TOKEN`:

```bash
# Record an event.
curl -sS -X POST http://localhost:8001/api/recording/events/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" -H "Content-Type: application/json" \
  -d '[{"id":"evt-smoke","session_id":"smoke","occurred_at":"2026-05-05T12:00:00Z","source":"agent","source_detail":"claude-code","type":"agent.prompt","agent":"Claude Code","prompt":"hello"}]'

# Predict.
curl -sS -X POST http://localhost:8001/api/predictions/predict/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" -H "Content-Type: application/json" \
  -d '{"agent":"Claude Code","agent_input":"hello","k":3,"threshold":0.8}'
```

Both calls should return `200`/`201` with sensible JSON.

---

## Development

This section covers everything a contributor needs to set up Clone locally and ship changes.

## In this section

- [Local setup](./local-setup) — clone, install, run each app, end-to-end smoke test.
- [Testing](./testing) — what to run before opening a PR. Each surface owns its own suite. - [Contributing](./contributing) — branch naming, commit style, PR template, and the trunk-based workflow. ## Repo conventions at a glance - **Trunk-based.** `main` is always deployable. Branches live for at most 2–3 days; large features are split into multiple PRs. - **Branch format.** `/-` (e.g. `feat/server-prediction`, `docs/site-bootstrap`). `type` is one of `feat`, `fix`, `chore`, `docs`, `refactor`. - **Commit format.** `: `. See `CONTRIBUTING.md`. - **PR template** lives at `.github/PULL_REQUEST_TEMPLATE.md`. PRs are English-only and always assigned to `cloneisme`. - **AGENTS.md ecosystem.** Each major directory has its own `AGENTS.md` describing what the directory is for, how to work in it, and how to test it. Treat them as the first place to look when entering a new part of the tree. ## Where logic lives - **Cross-stack contracts** → `packages/schema`. Both the Python validator and the TypeScript types live here. - **Business logic** → `apps/server`. Every threshold, prompt, and memory shape is owned by one of the Django apps. - **Client code** → `apps/{web,desktop,mcp,cli}`. These surfaces hold no business state; they are thin views over the server. --- ## Testing There is no monorepo test runner. Each surface owns its own suite, and `CONTRIBUTING.md` lists exactly what must pass before merge. ## Required before merge | Surface | Command | Notes | |---|---|---| | `apps/server` | `uv run python manage.py test` | Django + DRF unit tests across all apps. Must pass on the PR branch. | | `apps/web` | `npm run lint && npm run build` | ESLint + `tsc -b` + `vite build`. | | `apps/desktop` | `uv run python -m pytest tests/` | Python + ocap stack. Includes a 5 MB byte-equal loopback through a FastAPI mock server. | | `apps/mcp` | `npm test` | 16 unit tests across api / server / http transports. | | `apps/cli` | `npm run build` | TypeScript build only for now. 
| | `apps/docs` | `npm run build` | Docusaurus build. Strict mode (`onBrokenLinks: 'throw'`) — broken internal links fail the build. |
| `packages/schema` | _no tests_ | Validation lives in the consumers (`apps/server/recording/tests.py`, TypeScript compile in clients). |

Cross-surface schema changes touch `packages/schema/` **and** every consumer. After editing `events.ts` / `events.schema.json`, verify:

- `apps/server` tests still pass (the schema is loaded at import time).
- TypeScript surfaces (`apps/web`, `apps/mcp`, `apps/cli`) still typecheck.

`apps/desktop` is Python — its consumer of `packages/schema` is the server contract spec at `apps/desktop/docs/internals/upload.md`, not a typed import.

## End-to-end MCP smoke

`apps/mcp` ships two end-to-end smoke tests that exercise a real Anthropic call:

```bash
# stdio transport — spawns dist/index.js as a child process.
export CLONE_API_URL=http://localhost:8001
export CLONE_API_TOKEN=
npm run build
npm run test:e2e

# Streamable HTTP transport — points at any deployed URL.
export CLONE_MCP_URL=https://clone.is/mcp
export CLONE_API_TOKEN=
npm run test:e2e:http
```

These cost real Anthropic tokens, so run them locally or from a release-only CI lane — not on every PR.

## CI

GitHub Actions in `.github/workflows/` runs the per-surface checks above on every PR. Web and Server deploy automatically on merge to `main`.

## Adding a test

- **Server:** add a `test*.py` to the relevant app (`apps/server//tests.py`) and call it from `manage.py test`.
- **MCP:** add a file under `apps/mcp/tests/` and use the existing `node:test` harness.
- **Web / Desktop:** runtime tests don't exist yet; tighten the type system or add a Playwright pass when the change warrants it.

---

## FAQ & Troubleshooting

## "MCP fails with `network error talking to https://api.clone.is/...: fetch failed`"

The MCP server can connect to itself but not to the upstream Django API. Three usual causes, in order of likelihood: 1.
**DNS NXDOMAIN cached on your local resolver.** Even after the registrar's A record is correct and `dig @8.8.8.8 api.clone.is` returns the right IP, your ISP's resolver may still be serving the cached negative answer. Wait out the SOA TTL (≤ 1 hour for Namecheap) or pin the host in `/etc/hosts`. See [DNS & TLS](./self-hosting/dns-tls).
2. **Wrong `CLONE_API_URL`.** `apps/mcp/src/index.ts` defaults to `https://api.clone.is`. Override via the env variable when running locally (`CLONE_API_URL=http://localhost:8001`).
3. **Server is down.** `curl -sS -X POST $CLONE_API_URL/api/recording/events/` should at least return a 401, not a connection error.

## "Recording helpers return `is not valid under any of the given schemas`"

The most common cause is a `source` value that isn't in the enum. `source` must be one of `desktop`, `cli`, `mobile`, `smartglass`, `agent`, `integration`. Free-form values like `"claude-code"` belong in `source_detail`, not `source`. The `record_agent_prompt` / `record_agent_response` / `start_session` / `stop_session` helpers all share the same enum. The other gotchas:

- `record_agent_prompt` takes `prompt`, not `text`.
- `record_agent_response` takes `response`, not `text`.
- `integration` events **require** `source_detail` (the upstream provider name).

See [Schema](./schema) for the full validation rules.

## "Predictions return 504 Gateway Time-out under load"

`/api/predictions/predict/` performs a real Anthropic call (3–6 seconds is typical). Under load, concurrent calls queue behind the upstream workers, and individual requests can exceed nginx's `proxy_read_timeout`. Mitigations:

- Issue calls serially, or in batches of ≤ 4 concurrent requests.
- Use the batch endpoint (`POST /api/predictions/predict/batch/`) — it shares one Memory bundle across up to 100 prompts.
- Tune nginx: increase `proxy_read_timeout` for `/api/predictions/` if your traffic pattern is consistently slow upstream.
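The first mitigation — capping client-side concurrency at ≤ 4 — can be sketched in plain Python. The helper and the stand-in `fake_predict` below are illustrative, not part of the Clone codebase; swap in a real HTTP call to `POST /api/predictions/predict/`.

```python
import concurrent.futures

def predict_many(call_predict, payloads, max_concurrency=4):
    """Fan prediction calls out with a hard concurrency cap.

    call_predict: a function that takes one request payload and returns the
    decoded JSON response. Keeping max_concurrency <= 4 avoids piling 3-6 s
    upstream LLM calls up behind nginx's proxy_read_timeout.
    """
    results = [None] * len(payloads)
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = {pool.submit(call_predict, p): i for i, p in enumerate(payloads)}
        for fut in concurrent.futures.as_completed(futures):
            results[futures[fut]] = fut.result()  # re-raises any transport error
    return results

# Stand-in for a real call to POST /api/predictions/predict/.
def fake_predict(payload):
    return {"agent_input": payload["agent_input"], "status": "escalated"}

out = predict_many(fake_predict, [{"agent_input": p} for p in ("a", "b", "c")])
```

Results come back in input order regardless of completion order, so callers can zip them back onto their prompts.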
## "All my predictions come back with `confidence` around 0.4 and `status: escalated`" You're cold-start. The Memory layer hasn't seen enough of your real history yet. Two ways to seed it: 1. Push real events through `record_agent_prompt` / `record_agent_response` — actual question/answer pairs from how you really work. 2. Run promotion: `POST /api/memory/promote/episodes/` then `POST /api/memory/promote/facts/` (LLM-driven, costs Anthropic tokens). Calibration improves once `POST /api/memory/context/` returns a non-empty bundle. ## "I edited the prediction. How do I tell Clone?" `POST /api/predictions//feedback/` with `status: "edited"` and the actual reply. The Prediction layer uses these to compute precision in `/predictions/stats/` (`(auto + accepted) / answered`). ## "My `apps/web` build fails with `Cannot find module '@clone/design'`" The Vite alias `@clone/design` resolves to `../../packages/design`. From a fresh clone, make sure the symlinks under `apps/web/public/` are intact (they point into `packages/design/assets/`). The `apps/web/Dockerfile` mirrors that layout in the build container; if you're building outside Docker, do `git config core.symlinks true` on Windows checkouts. ## "How do I reset my Memory layer for testing?" There's no public "wipe all memory" endpoint by design — accidental resets are catastrophic. Either: - Delete rows by ID via the per-tier `DELETE /api/memory/{raw,episodes,facts}//` endpoints. - Or, in development, run `manage.py shell` and use the ORM directly. ## "Where do I file a bug?" Open an issue or PR at [github.com/cloneisyou/clone](https://github.com/cloneisyou/clone). PR template lives at `.github/PULL_REQUEST_TEMPLATE.md`; issues are free-form. 
--- ## Clone — a persistent user model that predicts what you will type import Link from '@docusaurus/Link'; # Clone > A **persistent user model** that sits between humans and AI agents — predicting what the user would type so agents can talk to *Clone* instead of interrupting the human. Get Started → View on GitHub Use as MCP server ## What is Clone? Today every prompt requires the human in the loop. Clone breaks that loop: agents address a calibrated user model — your **Clone** — and the human is only paged when Clone isn't confident. The system is three layers stacked under one account: 1. **Recording** — every event the user generates (computer-use frames, terminal turns, agent prompts and responses, integration webhooks) lands here as an idempotent `CloneEvent`. 2. **Memory** — the Recording stream is distilled into a four-tier store: a singleton `UserProfile`, atomic `SemanticMemory` facts, time-bounded `EpisodicMemory` summaries, and `RawMemory` rows. Promotion between tiers is LLM-driven and runs against the user's own data. 3. **Prediction** — given an agent's prompt, Clone assembles a context bundle from Memory, calls Anthropic with a prediction-shaped system prompt, and returns top-K candidate replies with calibrated `confidence`. The server marks the result `auto` if the top candidate clears the caller's `threshold`, otherwise `escalated`. The endpoint shape is deliberately small. There is exactly one headline call — `POST /api/predictions/predict/` — and the rest of the surface is supporting CRUD. See [Architecture](./architecture) for the full picture. 
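The `auto` / `escalated` rule in layer 3 is a one-line comparison. A minimal sketch of the decision as described — field names follow the documented response shape; this is not the server code:

```python
def decide_status(candidates, threshold=0.8):
    """'auto' if the calibrated top candidate clears the caller's threshold,
    otherwise 'escalated' -- i.e. the human gets paged."""
    top = max(candidates, key=lambda c: c["confidence"])
    return "auto" if top["confidence"] >= threshold else "escalated"

candidates = [
    {"response": "Run the tests again.", "confidence": 0.42},
    {"response": "Ship it.", "confidence": 0.28},
]
print(decide_status(candidates, threshold=0.8))  # → escalated
```

Lowering `threshold` trades escalations for auto-responses; the server makes this call so every client sees the same decision for the same inputs.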
## Quick Links

| | |
|---|---|
| 🚀 **[Quickstart](./quickstart)** | Get your first prediction in 5 minutes |
| 🏗️ **[Architecture](./architecture)** | The Recording → Memory → Prediction layer model |
| 🔌 **[MCP Server](./mcp-reference/overview)** | Drop Clone into Claude Code, Cursor, Claude Desktop, or any MCP-aware client |
| 📚 **[API Reference](./api-reference/overview)** | REST endpoints, auth, error codes |
| 🧬 **[Schema](./schema)** | `CloneEvent` JSON Schema and validation rules |
| 🛠️ **[Self-hosting](./self-hosting/overview)** | docker-compose, DNS, TLS, Smithery publish |
| 🧑‍💻 **[Development](./development/overview)** | Local setup, testing, contributing |
| ❓ **[FAQ & Troubleshooting](./faq)** | DNS NXDOMAIN, schema validation, prediction 504s, cold-start confidence |

## Key features

- **Calibrated, not bolted-on.** Predictions ship with a numeric `confidence` (post-Platt scaling) plus the original `raw_confidence` and an `auto`/`escalated` decision driven by the caller's `threshold`. The daily `fit_calibration` cron retrains the per-user sigmoid on actual `accepted` / `rejected` / `edited` outcomes; behavioral decay (atomic, flip-aware) updates fact importance on every feedback. Clients build automation (auto-respond when confident) or autocomplete (rank suggestions) on the same primitive.
- **One identity boundary, many surfaces.** A single Django service (`apps/server`) owns Recording, Memory, Prediction, Voice, Sources, and Access. Clients — `apps/web`, `apps/desktop`, `apps/mcp`, `apps/cli` — are thin views; no business state lives outside the server.
- **Idempotent ingest.** The Recording layer is keyed by per-event `id`. Re-posting an event with an existing id is a no-op, so producers retry freely.
- **One schema, two languages.** `packages/schema/events.ts` (TypeScript) and `packages/schema/events.schema.json` (JSON Schema, validated against by the Python server) describe the same `CloneEvent` shape and move together.
- **MCP-native.** The MCP server (`apps/mcp`) exposes 7 tools — `predict_next_prompt`, `predict_continuation`, `submit_feedback`, `start_session`, `stop_session`, `record_agent_prompt`, `record_agent_response` — over both stdio (single-tenant local installs) and Streamable HTTP (multi-tenant production at `clone.is/mcp`).
- **Self-hostable end-to-end.** One `docker-compose.yml` brings up Postgres, the Django API, the MCP server, and the nginx-fronted web container that also serves these docs at `/docs/`.

## Trying Clone in 30 seconds

```bash
# Issue an API key from the Clone dashboard, then:
export CLONE_API_TOKEN="clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export CLONE_API_URL="https://api.clone.is"

curl -sS -X POST "$CLONE_API_URL/api/predictions/predict/" \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"agent":"Claude Code","agent_input":"Test finished. What next?","k":3,"threshold":0.8}'
```

The response carries `predicted_response`, ranked `candidates`, calibrated `confidence`, and `status` (`auto` or `escalated`). Cold-start predictions land in the 0.3–0.5 band; confidence rises as Memory accumulates real history.

The full walkthrough — including how to seed Recording data, install the MCP server in Claude Code, and call the API from any language — lives in the [Quickstart](./quickstart).
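The same call works from any language with an HTTP client. A stdlib-only Python sketch — `build_predict_request` is a hypothetical helper, not part of any Clone SDK; it assembles the exact request the curl example sends:

```python
import json
import urllib.request

API_URL = "https://api.clone.is"
API_TOKEN = "clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx"  # placeholder — use your own key

def build_predict_request(agent, agent_input, k=3, threshold=0.8):
    """Build (but don't send) a POST /api/predictions/predict/ request."""
    body = json.dumps(
        {"agent": agent, "agent_input": agent_input, "k": k, "threshold": threshold}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}/api/predictions/predict/",
        data=body,
        headers={"X-Clone-API-Key": API_TOKEN, "Content-Type": "application/json"},
        method="POST",
    )

req = build_predict_request("Claude Code", "Test finished. What next?", k=3)
# Send with urllib.request.urlopen(req) once a real key is in place.
```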
## Where Clone lives | Surface | Path | Role | |---|---|---| | Web — marketing + dashboard | [`apps/web`](./apps/web) | Edit Memory, view analytics, manage API keys | | Server — Django + DRF + Anthropic | [`apps/server`](./apps/server) | Recording, Memory, Prediction, Voice, Access | | MCP — `@modelcontextprotocol/sdk` facade | [`apps/mcp`](./apps/mcp) | Predict-next-prompt for any MCP-aware client | | Desktop — Python + ocap recorder | [`apps/desktop`](./apps/desktop) | Capture events, stream MCAP/MKV append-slabs to the server | | CLI — Python + `prompt_toolkit` | [`apps/cli`](./apps/cli) | Terminal client and recording producer | Cross-stack contracts live in [`packages/schema`](./packages/schema) (the SSOT) and brand assets in [`packages/design`](./packages/design). ## For LLMs and coding agents Machine-readable entry points to this documentation, generated fresh on every deploy: - **[`/docs/llms.txt`](pathname:///docs/llms.txt)** — curated index of every page with short descriptions. Safe to load wholesale into an LLM context. - **[`/docs/llms-full.txt`](pathname:///docs/llms-full.txt)** — every page concatenated into a single markdown stream for one-shot ingestion. If your agent is calling Clone over MCP rather than over REST, see [MCP Server reference](./mcp-reference/overview) — it's the canonical guide for `predict_next_prompt`, `submit_feedback`, and the session / recording helpers. --- ## MCP Server reference This is the tool-by-tool reference for the Clone MCP server. The server is a thin wrapper over the REST API documented under [API Reference](../api-reference/overview); for the install / transport story see [Apps → MCP](../apps/mcp). ## Tool index | Tool | Auth | Purpose | |---|---|---| | [`predict_next_prompt`](./predict-next-prompt) | bearer | Predict the human's reply to an agent prompt; returns top-K candidates with calibrated confidence and an `auto`/`escalated` decision. 
| | `predict_continuation` | bearer | Personalized loop-termination decision for ralph-style self-correcting agents — returns `should_continue: bool` + calibrated confidence so the plugin can stop when the user would already be satisfied. | | `submit_feedback` | bearer | Close the prediction loop — report `accepted` / `edited` / `rejected` so Platt calibration and fact decay learn from the outcome. | | `start_session` / `stop_session` | bearer | Open / close a recording session; `start_session` returns a fresh `session_id` (emits `session.started` / `session.stopped`). | | `record_agent_prompt` / `record_agent_response` | bearer | Push `agent.prompt` / `agent.response` events so the User Model has the conversational substrate for future predictions. | ## Calling pattern Every tool is a thin pass-through to a Django endpoint: | MCP tool | Server endpoint | |---|---| | `predict_next_prompt` | `POST /api/predictions/predict/` | | `predict_continuation` | `POST /api/predictions/continuation/` | | `submit_feedback` | `POST /api/predictions//feedback/` | | `start_session` / `stop_session` / `record_agent_prompt` / `record_agent_response` | `POST /api/recording/events/` | Errors propagate as MCP tool errors with the upstream HTTP status visible in the message — for example, a `503` from the prediction LLM-key path arrives at the client as `network error talking to https://api.clone.is/api/predictions/predict/: …`. ## Auth modes (recap) - **stdio**: single-tenant. The server reads `CLONE_API_TOKEN` once at start-up; every tool call uses that token. - **http**: multi-tenant. The server reads the per-request `Authorization: Bearer ` (or falls back to `CLONE_API_TOKEN` if set) and forwards it upstream so one MCP instance can serve many users. Token format detection is automatic — tokens that start with `clone_` are sent as `X-Clone-API-Key`; anything else is treated as a JWT and sent as `Authorization: Bearer …`. 
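The token-shape detection described above is a prefix check. An illustrative Python equivalent of what `apps/mcp` does in TypeScript (not the actual source):

```python
def auth_header(token: str) -> dict:
    """Route a credential to the header the Clone API expects: long-lived
    API keys are prefixed clone_; anything else is treated as a JWT."""
    if token.startswith("clone_"):
        return {"X-Clone-API-Key": token}
    return {"Authorization": f"Bearer {token}"}

print(auth_header("clone_abc123"))  # → {'X-Clone-API-Key': 'clone_abc123'}
```

Because the check is purely syntactic, a malformed key that happens to start with `clone_` still reaches the server and fails there with a 401 rather than locally.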
---

## predict_next_prompt

# `predict_next_prompt`

Predict what the human would type in response to an AI agent's prompt, using their Clone (personalized user model). Returns top-K ranked candidates with calibrated probabilities. Clients should pick the top candidate when its confidence ≥ threshold (auto-respond) and otherwise escalate to the human (autocomplete suggestion).

## Input

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `agent` | string | yes | — | Name of the AI agent asking the human a question (`"Claude Code"`, `"Codex"`, `"Cursor"`). |
| `agent_input` | string | yes | — | The exact prompt the agent is sending to the human. |
| `k` | integer (1–10) | no | `1` | Number of candidate replies to generate. |
| `threshold` | number (0–1) | no | `0.8` | Confidence threshold for `auto` vs `escalated` status. |
| `session_id` | string | no | — | Optional grouping for later prediction history queries. |

## Output

```json
{
  "id": "",
  "predicted_response": "",
  "confidence": 0.42,
  "reasoning": "",
  "candidates": [
    { "response": "...", "confidence": 0.42, "reasoning": "..." },
    { "response": "...", "confidence": 0.28, "reasoning": "..." },
    { "response": "...", "confidence": 0.15, "reasoning": "..." }
  ],
  "k": 3,
  "status": "escalated",
  "threshold": 0.8,
  "model": "claude-sonnet-4-6",
  "latency_ms": 4827
}
```

`status` is server-decided: `"auto"` if `candidates[0].confidence ≥ threshold`, else `"escalated"`.

## Example

```bash
curl -sS -X POST https://api.clone.is/api/predictions/predict/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "Claude Code",
    "agent_input": "Test finished. What next?",
    "k": 3,
    "threshold": 0.8
  }'
```

## What good calibration looks like

In a cold-start state (no Memory data accumulated for the user yet), expect `confidence` in the 0.3–0.5 band and `status: "escalated"`.
As the Memory layer accumulates real history, top-1 confidence on familiar agent prompts moves toward 0.8+ and `status: "auto"` becomes common.

## Errors

| Upstream | What you'll see at the MCP client |
|---|---|
| `503` (LLM key missing or upstream unreachable) | `network error … 503` |
| `429` | `network error … 429` |
| `502` (Anthropic upstream non-200) | `network error … 502` |
| `400` (`agent` or `agent_input` missing) | `network error … 400` |

Underlying server logic lives in `apps/server/predictions/views.py:predict_view` and `apps/server/predictions/llm.py`.

---

## Memory layer

The memory layer is what turns the recording moat into agent-usable signal. It exists between the raw event stream (`apps/server/recording`) and the prediction surface (`apps/server/predictions`), and its job is twofold: keep the user's full signal recoverable, and keep the LLM's prompt small enough to be useful. This page is the design contract.

## The four layers

```
L3  MODELS    compiled artifacts: profile, calibrated thresholds, soul card
              daily recompile, single-source-of-truth lives in the server
              ← cites top L2 facts
L2  FACTS     atomic semantic claims: preference / habit / skill / artifact
              importance + confidence + tags
              ← cites L1 episodes (why we believe this)
L1  EPISODES  coherent sessions extracted from raw signal
              start/end timestamps, summary, source(s)
              ← cites L0 raw rows
L0  RAW       append-only event stream
              timestamps, source, payload, never rewritten
```

Every layer is markdown-shaped (currently rendered from Django ORM rows; we keep the markdown view as the user-facing source of truth). Provenance links go upward — a model cites the facts that compiled into it; a fact cites the episodes that produced it; an episode cites the raw rows it was derived from. Following any chain to its root yields a complete audit trail of every prediction.

## Why markdown is the substrate

Storage efficiency is not the constraint. Trust is.
The user must be able to read, edit, and delete what their Clone knows about them, or every layer above breaks down. Markdown is diffable, versionable, exportable, and an LLM reads it natively. The compiled forms (embeddings, indices, the daily-recompiled soul card) are caches; the markdown is the ground truth from which any cache can be rebuilt. A consequence: hierarchy is an emergent **view**, never the **storage primitive**. A flat namespace plus tags plus bidirectional links carries the structure. Real life is multi-categorical and strict directory hierarchies break inside a year. The directory tree is one of several ways to look at the corpus, not the corpus itself. ## Compaction is the system The point of the four layers is that compaction is **reversible**. If a fact turns out wrong, deleting it triggers a re-compile from the episodes; if an episode summary was bad, deleting it triggers a re-summary from raw. L0 is never rewritten, so the system can always rewind to ground truth and re-derive. This is what gives the layer the property people usually call "no catastrophic forgetting" — the lower layers always survive. Three compaction loops, three different cadences: | | Source | Target | Cadence | Gating | |---|---|---|---|---| | **L0 → L1** | raw events for one closed session | episode summary | near-real-time, on `session.stopped` | none — auto | | **L1 → L2** | accumulated episodes (last N days) | atomic facts | daily batch | none on default flow — auto with confidence tiers (see below) | | **L2 → L3** | all facts ranked by importance | profile, soul card, calibrated thresholds | daily | none — auto | ## No hard approval gate (and why) The earliest version of this design proposed a weekly user approval step on L1 → L2 promotion. That is wrong. A 7-day delay between observed behavior and prediction-grade signal breaks the "always-on, immediately personalized" promise of the product, and a weekly review ritual is friction, not stickiness. 
The first hour matters more than the seventh day. Auto-promotion is the default flow. Trust is preserved by other mechanisms. ### Confidence-tiered auto-promotion Each candidate fact lands in one of three states: - **Strong** — pattern repeats 5+ times across 2+ sessions with no contradicting signal. Promote to L2 immediately, importance 0.9+, full weight in prediction. - **Tentative** — 1–2 session pattern or weak signal. Promote to L2 with importance 0.4–0.6. Still feeds prediction, but at reduced weight; flagged in the user-facing view. - **Candidate** — single occurrence. Held in a staging queue. If more evidence accumulates within 14 days, promote; otherwise discard. The agent calling the prediction surface always sees the tier (via importance) and can route accordingly. ### Behavioral decay A fact's confidence is not static. Whenever a prediction grounded on a fact is contradicted by the user's actual response, the cited fact takes a confidence hit. Whenever it is confirmed, it gains. Below a threshold, facts are auto-archived without user intervention. This is the silent garbage collector that makes auto-promotion safe — the system keeps grooming itself against ground truth. ### Read-and-correct, not review-and-approve The user can open the markdown view at any time and delete or edit anything. One click to remove a wrong fact triggers a cascade re-compile. The user's role is **agency at the edges**, not **gatekeeper of the flow**. They are not required to act for the system to work; they retain full power to correct it when they choose to. ### Hard gates only at three edges Three narrow cases keep their explicit gate: 1. **Sensitive categories** — health, relationships, finances, religion. Auto-promotion is disabled; candidates stay candidates until the user explicitly promotes them. 2. **Direct contradictions of high-importance facts.** A new candidate that contradicts an existing fact at importance ≥ 0.85 cannot auto-resolve. 
The user is prompted to pick which is true. 3. **Action-grounding facts.** Facts whose primary use is to authorize Clone-mediated action on the user's behalf (e.g. "the user is OK with auto-replying to internal emails") require explicit confirmation regardless of confidence. Prediction-only facts do not. These three together cover well under 10% of fact volume; the other 90%+ flows automatically. ## Retrieval: stratified, not similarity-only The prediction prompt has a finite token budget. A pure similarity-search retrieval ("nearest 10 facts to this prompt") fails the moment the user asks something off-pattern: every relevant fact gets crowded out by a single dominant theme, and the user's identity skeleton vanishes from the prompt. The architect-test for "does this system suffer from catastrophic forgetting?" is really asking "does retrieval silently drop the baseline?" The answer is **stratified retrieval**: - **Always-on baseline** — the L3 soul card, plus the top N L2 facts by importance, regardless of the current query. This is the user's identity skeleton. - **Dynamic block** — the rest of the token budget filled by similarity-ranked L1 episodes and L0 raw rows within a recency window. Token budget allocation goes baseline-first, dynamic-last. The baseline is not flexible; the dynamic block shrinks before the baseline does. This is what gives the layer its no-forgetting property in practice — the user's core is always present in every prediction, no matter what the user asks. --- ## @clone/design # `@clone/design` Brand assets shared across surfaces — logos, icons, social cards. The package is mostly static files; there is no code to import. ## How surfaces consume it `apps/web/vite.config.ts` aliases `@clone/design` to `../../packages/design`. Public assets like favicons are exposed via symlinks under each surface's `public/` directory, e.g. `apps/web/public/*.{png,jpg,mp4,svg}` symlink into `packages/design/assets/`. 
`apps/desktop` is Python — its assets live under `apps/desktop/assets/{icon.png,CloneDesktop.ico,CloneDesktop.icns}` and are bundled by Nuitka, separate from the JS surfaces. The `apps/web/Dockerfile` mirrors that layout in the build container — see the comment in the Dockerfile for the exact reasoning. New surfaces that need brand assets follow the same pattern: add a Vite alias, mirror it in any container build context. ## Updating an asset 1. Drop the new file under `packages/design/assets/`. 2. If a surface needs to reference it through `public/`, update the symlink there (do **not** copy). 3. Run `apps/web` build to confirm the bundler resolves the alias. `apps/desktop` consumes its own assets directory directly; if you replace the Clone Desktop icon set, regenerate `assets/CloneDesktop.{ico,icns}` and rerun the Nuitka build. The docs site (this repository) does **not** consume `@clone/design` yet — the favicon and social card are still Docusaurus defaults. --- ## Packages Two cross-surface packages live under `packages/`. They are intentionally tiny — most code in the monorepo belongs to one app, not to a shared package — and they exist only when something must be referenced from both client and server. | Package | What it is | |---|---| | [`@clone/schema`](./schema) | Single source of truth for `CloneEvent` and other cross-stack wire shapes. TypeScript types and JSON Schema move together. | | [`@clone/design`](./design) | Brand assets — logos, icons, social cards. Consumed via Vite alias by Desktop and Web. | ## When to add a new package Don't, unless: 1. The same code must be referenced verbatim from **two or more** active surfaces (e.g. Web and Desktop), and 2. Inlining it in `apps//` would force consumers to drift over time. If only one surface owns the concept, keep it in that surface's tree. Cross-stack abstractions invented before they're needed are a known smell in this codebase — see `apps/AGENTS.md`. 
--- ## @clone/schema # `@clone/schema` Single source of truth for cross-stack Clone schemas. Every Clone surface — Desktop, CLI, Mobile, the Django API server, the MCP server — reads from this package. There is **no other authoritative place for an event shape**. ## Layout ``` packages/schema/ ├── events.ts # canonical TypeScript types for the Recording layer ├── events.schema.json # JSON Schema mirroring events.ts (for non-TS validators) ├── index.ts # re-exports └── package.json # @clone/schema ``` `events.ts` and `events.schema.json` describe the same wire shape and move together. The Django server (`apps/server/recording/views.py`) loads `events.schema.json` at import time and validates every ingested event against it; TypeScript clients import directly from `events.ts`. ## `CloneEvent` `CloneEvent` is a discriminated union over `type`. Nine event types are defined today: | `type` | Extra fields | |---|---| | `session.started` | _(none)_ | | `session.stopped` | _(none)_ | | `app.focused` | `app: string`, `window_title: string \| null` | | `capture.frame` | `uri: string`, `content_hash: string \| null`, `width: number`, `height: number` | | `input.keystroke` | `event_type: "press" \| "release"`, `modifiers: { shift, ctrl, alt, meta }` | | `input.click` | `x: number`, `y: number`, `button: number`, `modifiers` | | `input.scroll` | `x: number`, `y: number`, `delta_x: number`, `delta_y: number`, `modifiers` | | `agent.prompt` | `agent: string`, `prompt: string` | | `agent.response` | `agent: string`, `response: string`, `in_response_to: string \| null` | Every variant inherits five base fields: `id`, `session_id`, `occurred_at` (ISO-8601), `source`, `type`, plus an optional `source_detail`. ## `source` enum | Value | Modality | |---|---| | `desktop` | Computer-use trajectory on macOS / Windows / Linux | | `cli` | Terminal sessions | | `mobile` | Smartphone-use trajectory (iOS / Android) | | `smartglass` | Smart-glass-use trajectory (Meta Ray-Ban etc.)
| | `agent` | Server-side agent events (Anthropic, OpenAI, …) | | `integration` | Third-party services — `source_detail` names the provider, and is **required** for this source | For `integration` events, `source_detail` is required and identifies the upstream provider — `'slack'`, `'notion'`, `'github'`, `'linear'`, `'drive'`, etc. New integrations add a new `source_detail` value, not a new `source`. ## Adding a new event type 1. Add the new interface in `packages/schema/events.ts` and append it to `CloneEvent` / `CLONE_EVENT_TYPES`. 2. Mirror the change in `packages/schema/events.schema.json` (new `oneOf` entry). 3. Add the matching DRF serializer in `apps/server/recording/serializers.py` and the type's payload fields to `RECORDING_PAYLOAD_FIELDS`. 4. Update the Django ingest test in `apps/server/recording/tests.py`. For full validation rules and ready-to-paste JSON examples, see [Schema](../schema). --- ## Product thesis Clone is the user model that agents call to act on a human's behalf with calibrated reliability. The MCP is where the product touches the agent ecosystem; the recording stack is the moat; the memory layer is the transformation; the human surfaces exist to consent, correct, and observe — not to be the product. ## Two customers, two contracts The product has one user (the human being modeled) and two customers, with different demands: | | **Integration customer** | **Purchase / consent customer** | |---|---|---| | Who | The agent calling Clone (Claude Code, Codex, future agents) | The human being modeled | | What they need | Latency, calibration, schema stability, cost predictability, capability negotiation | Visibility, correction UI, trust receipts, sensitive-category control | | Where they touch the product | `apps/mcp` and the prediction surface | `apps/web/you`, `apps/cli`, weekly digests, inline correction | Both must be satisfied. The agent surface is the **hot path**; the human surface is the **legitimacy infrastructure**. 
Optimizing only the hot path kills consent and privacy, and ultimately drives churn. Optimizing only legitimacy ships a product no agent can integrate with. They are different surfaces with different SLOs and different success metrics; do not collapse them in roadmap discussions. ## What we're actually selling The deliverable is **calibrated automation**, not raw automation. Two systems can both report 90% automation rate; the one that's wrong 1% of the time is strictly worse than no automation, because every wrong auto-response forces the human into review-and-correct mode and burns more attention than they would have spent answering the agent themselves. The metric is therefore not "% of agent turns auto-handled" but **% correctly auto-handled, with the rest reliably escalated** — captured by the Personalized Automation Score (β = 0.3), which biases toward precision over coverage. Three axioms follow: 1. A confidence number that an agent cannot route on is worthless. Calibration is a hard product requirement, not polish. 2. Wrong auto-responses are strictly worse than escalations. The threshold should err high, never low. 3. The agent must be able to ask Clone *why* a prediction was made and inspect the receipts. Provenance is part of the contract. ## The moat is recording, not the MCP Anyone can ship an MCP that exposes `predict_next_prompt`. The signature is mimicable in an afternoon. What is not mimicable is a multi-month, omni-source, ground-truthed corpus per user. In product vocabulary: - **MCP** is the **distribution surface** — how the product reaches the agent ecosystem. - **Recording stack** (CLI, Desktop, Mobile, Smartglasses) is the **moat** — why no one catches up to the prediction quality even with the same MCP signature. - **Memory layer** is the **transformation** — how raw moat is turned into agent-usable signal (see [Memory layer](./memory-layer.md)).
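The Personalized Automation Score's precision bias (β = 0.3, above) can be made concrete with an F-β-style calculation. This is a minimal sketch of the idea; the production formula and its exact inputs are not documented here:

```python
def automation_score(precision: float, coverage: float, beta: float = 0.3) -> float:
    """F-beta-style score: beta < 1 weights precision over coverage.

    Sketch only; the production Personalized Automation Score may differ.
    """
    if precision == 0.0 and coverage == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * coverage / (b2 * precision + coverage)

# Illustrative numbers: aggressive auto-handling that is wrong 10% of the
# time scores below cautious auto-handling that is almost never wrong.
print(round(automation_score(0.90, 0.90), 3))  # 0.9
print(round(automation_score(0.99, 0.50), 3))  # 0.916
```

With β = 0.3, a system at 99% precision and 50% coverage outscores one at 90% precision and 90% coverage, which is exactly the "precision over coverage" bias the metric is meant to encode.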
It is tempting to call the recording surfaces "auxiliary" because the agent doesn't talk to them directly; that framing is misleading. Recording quality determines prediction quality determines agent retention. Treat capture engineering as first-class product engineering. ## Honest contract: bounded imitation, honest escalation Clone does not "perfectly model" the user. Humans are not deterministic; promising perfect modeling commits us to something we cannot deliver, and to something the agent doesn't actually want. The honest contract is: > Clone reliably approximates the user's response on **bounded tasks** within a **calibrated confidence band**, and **honestly escalates** every other case. This is a stronger commercial pitch than "perfect." It is measurable, defensible, falsifiable, and exactly the contract the agent's routing logic needs to do its job. "I don't know — ask the human" is a feature, not a failure. --- ## Quickstart This page takes you from "no Clone account" to "first prediction" in roughly five minutes. Pick one path: - **[REST](#path-a-rest)** — `curl` the public API directly. Best for languages without an MCP client and for understanding the underlying surface. - **[MCP](#path-b-mcp)** — drop Clone into Claude Code, Cursor, or Claude Desktop and let your agent call it. Best for everyday use. Both paths assume you already have an account on `clone.is` (or a self-hosted instance — see [Self-hosting](./self-hosting/overview)). --- ## Path A — REST {#path-a-rest} ### 1. Issue an API key API keys are long-lived bearer credentials starting with `clone_…`. Issue one from `POST /api/auth/keys/` (see [Authentication](./api-reference/authentication)) or from the **API Keys** page in the web dashboard. Treat it like a password — anyone holding the key can act as you. ```bash export CLONE_API_TOKEN="clone_xxxxxxxxxxxxxxxxxxxxxxxxxxxx" export CLONE_API_URL="https://api.clone.is" ``` ### 2.
Push a few events into Recording Predictions get markedly better once Clone has seen how you actually reply, so seed the Recording layer first. The wire shape is `CloneEvent` (see [Schema](./schema)): ```bash curl -sS -X POST "$CLONE_API_URL/api/recording/events/" \ -H "X-Clone-API-Key: $CLONE_API_TOKEN" \ -H "Content-Type: application/json" \ -d '[ { "id": "evt-1", "session_id": "demo-session", "occurred_at": "2026-05-05T12:00:00Z", "source": "agent", "source_detail": "claude-code", "type": "agent.prompt", "agent": "Claude Code", "prompt": "Test finished. What next?" } ]' ``` Expected response: `{ "accepted": 1, "duplicates": 0, "invalid": [] }`. The endpoint is idempotent on `id`, so re-sending returns `duplicates: 1` rather than recording it twice. :::tip Common ingest mistake `source` is an **enum** — `desktop | cli | mobile | smartglass | agent | integration`. Free-form strings like `"claude-code"` go in `source_detail`, not `source`. See [FAQ](./faq) and [Schema](./schema). ::: ### 3. Ask Clone to predict ```bash curl -sS -X POST "$CLONE_API_URL/api/predictions/predict/" \ -H "X-Clone-API-Key: $CLONE_API_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "agent": "Claude Code", "agent_input": "Test finished. What next?", "k": 3, "threshold": 0.8 }' ``` The response includes `predicted_response`, calibrated `confidence`, the model's `reasoning`, and a ranked `candidates` list. `status` is server-decided: - `auto` — top candidate's `confidence ≥ threshold`. Your client should act on the prediction without paging the human. - `escalated` — confidence below threshold. Route to the human. Cold-start predictions usually return `escalated` with mid-band confidence (~0.3–0.5); confidence rises as Memory accumulates real history. ### 4. 
Mark the actual reply (optional but recommended) Once you know what the user actually typed, write it back so Clone can compute precision and improve calibration: ```bash curl -sS -X POST "$CLONE_API_URL/api/predictions//feedback/" \ -H "X-Clone-API-Key: $CLONE_API_TOKEN" \ -H "Content-Type: application/json" \ -d '{"status":"edited","actual_response":"…what they actually typed…"}' ``` `status` is one of `accepted` (used Clone's prediction as-is), `edited` (typed something different — `actual_response` required), or `rejected` (discarded). --- ## Path B — MCP {#path-b-mcp} ### 1. Issue an API key Same as [Path A step 1](#1-issue-an-api-key) — either via the dashboard or `POST /api/auth/keys/`. ### 2. Install the MCP server From a Claude Code session: ```bash claude mcp add clone \ -e CLONE_API_URL=https://api.clone.is \ -e CLONE_API_TOKEN="$CLONE_API_TOKEN" \ -- npx -y @clone/mcp ``` (The `-e` flags are options to `claude mcp add` and belong before the `--` that introduces the server command.) Cursor, Claude Desktop, and other MCP-aware clients use the same shape — point them at the `@clone/mcp` package and pass the two env vars. Tokens that start with `clone_` are sent as `X-Clone-API-Key` automatically; anything else is treated as a JWT and sent as `Authorization: Bearer …`. If you'd rather use the multi-tenant hosted endpoint, install it via Smithery — see [Self-hosting → Smithery](./self-hosting/smithery). ### 3. Verify the install In your agent, ask it to call the `predict_next_prompt` tool with any agent prompt. The tool returns the same shape as `POST /api/predictions/predict/` — see [`predict_next_prompt`](./mcp-reference/predict-next-prompt) for the full input/output reference. --- ## Where to go next - **[Architecture](./architecture)** — the layered model behind these endpoints. - **[MCP Server reference](./mcp-reference/overview)** — tool-by-tool detail with verified payloads. - **[API Reference](./api-reference/overview)** — full REST surface, auth, error codes.
- **[FAQ](./faq)** — every gotcha new self-hosters trip over (DNS, schema enums, 504s, cold-start confidence). --- ## Schema `packages/schema/events.schema.json` is the canonical wire-shape definition for events flowing into the Recording layer. The Django server loads it once at import time (`apps/server/recording/views.py`) and validates every ingested event against it. TypeScript clients import the matching types from `packages/schema/events.ts`. Both files describe the same shape and move together. ## `CloneEvent` base fields Every event has the same five required base fields plus an optional `source_detail`: | Field | Type | Notes | |---|---|---| | `id` | string (≥ 1 char) | Idempotency key. | | `session_id` | string (≥ 1 char) | Owning session. Auto-creates the session on first event. | | `occurred_at` | ISO-8601 datetime | Producer-side wall-clock time. | | `source` | enum | `desktop \| cli \| mobile \| smartglass \| agent \| integration` | | `source_detail` | string \| null | **Required** when `source = "integration"`; optional otherwise. | | `type` | enum | One of the nine event types below. | ## Variants (the `oneOf`) The schema is a `oneOf` over `type`. Each variant declares `additionalProperties: false`, so unknown fields are rejected. ### `session.started` / `session.stopped` No extra fields. Bookend events for a `RecordingSession`. The first `session.started` re-anchors the session's `started_at`, `source`, and `source_detail`; a later `session.stopped` sets `ended_at`. 
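As an illustration, the base-field rules above can be approximated with a plain-Python producer-side pre-check. This is a simplified stand-in for the server's JSON-Schema validation, not the real `events.schema.json` logic:

```python
SOURCES = {"desktop", "cli", "mobile", "smartglass", "agent", "integration"}
BASE_REQUIRED = ("id", "session_id", "occurred_at", "source", "type")

def precheck(event: dict) -> list:
    """Return a list of problems with the base fields; empty means OK."""
    errors = [f"missing required field: {field}"
              for field in BASE_REQUIRED if not event.get(field)]
    if event.get("source") and event["source"] not in SOURCES:
        errors.append(f"bad source {event['source']!r}: not in enum")
    # integration events must name their provider in source_detail
    if event.get("source") == "integration" and not event.get("source_detail"):
        errors.append("source_detail is required when source is 'integration'")
    return errors

bad = {"id": "evt-1", "session_id": "demo",
       "occurred_at": "2026-05-05T12:00:00Z",
       "source": "claude-code",  # agent name does not belong in `source`
       "type": "agent.prompt"}
print(precheck(bad))  # flags the bad enum value
```

Catching the `source` enum mistake on the producer saves a round-trip; the server remains the authority and will still reject anything this sketch misses.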
### `app.focused` | Field | Type | Required | |---|---|---| | `app` | string (≥ 1 char) | yes | | `window_title` | string \| null | yes (`null` is allowed) | ### `capture.frame` | Field | Type | Required | |---|---|---| | `uri` | string (≥ 1 char) | yes | | `content_hash` | string \| null | yes | | `width` | integer (≥ 1) | yes | | `height` | integer (≥ 1) | yes | ### `input.keystroke` | Field | Type | Required | |---|---|---| | `event_type` | enum (`press` / `release`) | yes | | `modifiers` | object (`shift` / `ctrl` / `alt` / `meta`, all booleans) | yes | PII redaction is the producer's job. Desktop today streams raw MCAP/MKV append-slabs (`apps/desktop/docs/internals/upload.md`), so for that path the contract carries a `keychar`/`keycode` strip step server-side; producers that write typed `CloneEvent` JSON (e.g. `apps/cli`) drop those fields before the wire shape leaves the device. ### `input.click` | Field | Type | Required | |---|---|---| | `x` | integer | yes | | `y` | integer | yes | | `button` | integer (≥ 1) | yes | | `modifiers` | object | yes | ### `input.scroll` | Field | Type | Required | |---|---|---| | `x` | integer | yes | | `y` | integer | yes | | `delta_x` | integer | yes | | `delta_y` | integer | yes | | `modifiers` | object | yes | ### `agent.prompt` | Field | Type | Required | |---|---|---| | `agent` | string (≥ 1 char) | yes | | `prompt` | string | yes | ### `agent.response` | Field | Type | Required | |---|---|---| | `agent` | string (≥ 1 char) | yes | | `response` | string | yes | | `in_response_to` | string \| null | yes (`null` allowed when no prior `agent.prompt` is known) | ## Validation behavior - The server uses `jsonschema.Draft202012Validator`. Errors are sorted by JSON pointer path; the first three are surfaced per row. - One bad row does **not** fail a batch. The endpoint returns `accepted`, `duplicates`, and `invalid: [{ index, errors }]` so producers can retry the failures alone.
- `source_detail` is conditionally required — the schema's top-level `allOf` enforces "if `source = "integration"`, then `source_detail` is a non-empty string." ## Examples ### Valid `agent.prompt` ```json { "id": "evt-001", "session_id": "claude-code-2026-05-05", "occurred_at": "2026-05-05T12:34:56Z", "source": "agent", "source_detail": "claude-code", "type": "agent.prompt", "agent": "Claude Code", "prompt": "Test finished. What next?" } ``` ### Valid `app.focused` ```json { "id": "evt-002", "session_id": "desktop-2026-05-05", "occurred_at": "2026-05-05T12:35:00Z", "source": "desktop", "type": "app.focused", "app": "Cursor", "window_title": "apps/server/predictions/views.py" } ``` ### Invalid — wrong `source` enum ```json { "source": "claude-code", // ← rejected; "claude-code" is not a source enum value. ... } ``` The fix is `source: "agent"` with `source_detail: "claude-code"`. This is one of the most common ingest errors and was the root cause of the `invalid` response observed in the [Quickstart](./quickstart) before the schema rules click. ## Adding a new type Steps live in [`@clone/schema`](./packages/schema). --- ## DNS & TLS The web container's nginx terminates TLS for three hostnames out of the box: the apex, `www.`, and `api.`. All three must resolve to the host IP and all three must have valid certificates before traffic flows. ## DNS records Assuming your host's public IP is ``: | Record | Type | Host | Value | |---|---|---|---| | Apex | A | `@` | `` | | Subdomain | A | `www` | `` | | Subdomain | A | `api` | `` | CNAMEs from `www` / `api` to the apex also work, but A records are the simplest and avoid CNAME-flattening edge cases. After updating DNS, verify from outside your network: ```bash dig @8.8.8.8 clone.is A +short dig @8.8.8.8 www.clone.is A +short dig @8.8.8.8 api.clone.is A +short ``` All three should print ``. ## TLS certificates The web container mounts `/etc/letsencrypt` from the host read-only. 
The simplest way to issue certificates is on the host with `certbot` against the running nginx: ```bash sudo apt install certbot sudo certbot certonly --webroot -w /var/www/letsencrypt \ -d clone.is -d www.clone.is -d api.clone.is ``` Or run certbot's standalone mode while web is briefly down. After issuance, the cert lives at `/etc/letsencrypt/live//{fullchain,privkey}.pem`, and `apps/web/nginx.conf` references the path directly: ```nginx ssl_certificate /etc/letsencrypt/live/clone.is/fullchain.pem; ssl_certificate_key /etc/letsencrypt/live/clone.is/privkey.pem; ``` Renewal is automatic via certbot's systemd timer; nginx picks up the rotated certificate at its next reload (`docker compose exec web nginx -s reload`). ## The `api.clone.is` NXDOMAIN gotcha A subdomain can be live in the registrar's nameservers and on Google/Cloudflare resolvers (`8.8.8.8`, `1.1.1.1`) while a specific ISP's resolver still serves a cached `NXDOMAIN` from the previous unconfigured state. The negative TTL is whatever the zone's SOA record specifies — for Namecheap-hosted DNS the default is roughly an hour. Symptoms: - The MCP server fails with `network error talking to https://api.clone.is/...: fetch failed`. - `dig @8.8.8.8 api.clone.is` returns the right IP. - `dig api.clone.is` (your default resolver) returns only an SOA / `NXDOMAIN`. Fixes, in increasing order of reach: 1. **Wait the negative TTL out** (typically ≤ 1 hour). 2. **Switch the affected client to a public resolver** (`8.8.8.8` / `1.1.1.1`) — fastest unblock without touching DNS. 3. **Pin the host with `/etc/hosts` on the affected machine** — a one-line ` api.` is a clean local override and harmless to remove later. Do **not** ship this in a Dockerfile. ## Updating server names If you fork the repo and rename, edit: - `apps/web/nginx.conf` — `server_name`, `ssl_certificate*` paths, and the `301` redirect server. - `docker-compose.yml` — `ALLOWED_HOSTS` for the `api` service.
- `apps/docs/docusaurus.config.ts` — `url`, `editUrl`, footer hrefs. - `apps/mcp/src/index.ts` — `DEFAULT_BASE_URL` (only if running stdio mode against the public API by default). --- ## Docker Compose The repo's root `docker-compose.yml` is the single entry point for self-hosting Clone. Four services, one Postgres volume, and a handful of environment variables. ## Bring it up ```bash cp .env.example .env # Edit .env: POSTGRES_PASSWORD, ANTHROPIC_API_KEY, ELEVENLABS_API_KEY, etc. docker compose -f docker-compose.local.yml up --build ``` `docker-compose.local.yml` builds every service from source on each `up` and uses HTTP / non-privileged ports (`http://localhost:8080`). See the repo root README for details. Postgres data persists in the named volume `pgdata` and survives container recreates. ## Service reference ```yaml db: postgres:16-alpine # internal port 5432 api: build apps/server/Dockerfile # internal port 8000 mcp: build apps/mcp/Dockerfile # internal port 3000 web: build apps/web/Dockerfile # exposed 80, 443 ``` ### `db` Postgres 16. Healthchecked with `pg_isready -U clone`. Uses `POSTGRES_PASSWORD` from `.env`. Database name `clone`, user `clone`. ### `api` Builds from the repo root so `packages/schema/events.schema.json` is reachable by `recording.views` (which loads it via `parents[3]`). Key env vars: | Var | Purpose | |---|---| | `DATABASE_URL` | `postgresql://clone:${POSTGRES_PASSWORD}@db:5432/clone` | | `DEBUG` | `"False"` in production. | | `ALLOWED_HOSTS` | `clone.is,www.clone.is,api.clone.is,api,localhost` — `api` and `localhost` are required for in-container service-to-service calls (e.g. `mcp → http://api:8000`) since Django's `DisallowedHost` check matches the `Host` header verbatim. | | `ANTHROPIC_API_KEY` | Required for Prediction and memory promotion. 
| ### `mcp` ```yaml environment: MCP_TRANSPORT: http PORT: 3000 CLONE_API_URL: http://api:8000 # CLONE_API_TOKEN intentionally unset — HTTP mode reads the bearer # off each request so one MCP instance serves many users. depends_on: - api ``` ### `web` ```yaml ports: - "80:80" - "443:443" volumes: - /etc/letsencrypt:/etc/letsencrypt:ro depends_on: - api - mcp ``` The build context is the repo root because `apps/web/public/*` symlinks into `packages/design/assets/`, and the Vite alias `@clone/design` resolves to `../../packages/design`. The same image bundles the docs build (`apps/docs/build/` → `/usr/share/nginx/html/docs/`) so `/docs/` works without a separate hosting target. ## Migrations and admin ```bash # One-shot Django commands run in the api service. docker compose run --rm api python manage.py migrate docker compose run --rm api python manage.py createsuperuser ``` ## Persistence - **`pgdata` named volume** — Postgres data. Back this up; nothing else in the stack is stateful on disk. - **`/etc/letsencrypt` (host)** — TLS certs, mounted read-only into the web container. ## Health and logs ```bash docker compose ps # which services are up docker compose logs -f api # follow Django logs docker compose logs -f web # follow nginx + access logs docker compose exec db psql -U clone # open a psql session ``` ## When something breaks - **API responds with `400 DisallowedHost`** — `ALLOWED_HOSTS` doesn't include the `Host` header value the request arrived with. Add the host (often the internal one used by service-to-service calls). - **MCP returns 502 from `web`** — the `mcp` service is not listening on 3000 yet (still booting), or `nginx.conf`'s `/mcp` block lost its `proxy_buffering off` setting (Streamable HTTP requires it). - **Predictions return 503** — `ANTHROPIC_API_KEY` is missing, invalid, or rate-limited (the server collapses these into 503 / 429 — see `apps/server/predictions/views.py`). 
DNS-specific gotchas — including the `api.clone.is` `NXDOMAIN` issue that bit the live deploy — live in [DNS & TLS](./dns-tls). --- ## Self-hosting Clone is small enough to self-host. The whole stack runs from one `docker-compose.yml` at the repo root: Postgres, the Django API, the MCP server, and the nginx-fronted web container that also serves these docs and the public marketing site. ## What you'll need - A host with Docker and docker-compose v2. - Postgres 16 (containerized) or your own managed Postgres at `DATABASE_URL`. - An Anthropic API key (`ANTHROPIC_API_KEY`) — Prediction and memory promotion require it. - A domain name with HTTPS certificates (Let's Encrypt is the assumed default). ## Stack at a glance | Service | Image / build context | Role | |---|---|---| | `db` | `postgres:16-alpine` | Primary database. | | `api` | `apps/server` | Django + DRF backend. | | `mcp` | `apps/mcp` | MCP server (HTTP transport). | | `web` | `apps/web` (build context: repo root) | nginx — serves the SPA, reverse-proxies `/api`, `/admin`, `/static`, `/mcp`, and `/docs`. Also where the docs build is bundled. | In production the only port exposed to the internet is the web container's `80/443`; everything else talks over the internal compose network. ## Pages in this section - [Docker Compose](./docker-compose) — bring the stack up, environment variables, persistence, where each service lives. - [DNS & TLS](./dns-tls) — required DNS records (including the `api.clone.is` subdomain), Let's Encrypt setup, and the most common DNS gotcha. - [Smithery](./smithery) — publishing the MCP server URL through Smithery so end users can install it without editing config. ## Once it's up - Public site is at `https:///`. - Documentation is at `https:///docs/` (this site, deployed from your fork). - API is at `https:///api/...` and at `https://api./...`. - MCP server is at `https:///mcp` (Streamable HTTP). 
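The "once it's up" checklist lends itself to a small smoke check. A sketch that derives the expected public URLs from your domain (the helper name and the choice of `/api/auth/me/` as the API probe are illustrative, assuming the DNS layout from this section):

```python
def public_endpoints(domain: str) -> dict:
    """URLs each surface should answer on once the stack is up.

    Assumes the apex + `api.` subdomain layout described in this guide;
    adjust for a fork with different server names.
    """
    return {
        "site": f"https://{domain}/",
        "docs": f"https://{domain}/docs/",
        "api":  f"https://api.{domain}/api/auth/me/",
        "mcp":  f"https://{domain}/mcp",
    }

for name, url in public_endpoints("clone.is").items():
    print(f"{name:5} {url}")
```

Feed each URL to `curl -sS -o /dev/null -w '%{http_code}'` (the API probe returns `401` without credentials, which still proves routing and TLS work).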
## Operational tips

- The Django container expects `ALLOWED_HOSTS` to include both your public domains and the internal hostnames `api`, `localhost` — see `docker-compose.yml` for the exact value used by `clone.is`.
- nginx mounts `/etc/letsencrypt` read-only from the host. Renew certificates on the host (e.g. with certbot's cron job); the container will pick them up at the next reload.
- The MCP server intentionally has `CLONE_API_TOKEN` unset in `http` mode so it can serve many users in parallel, each authenticated by their own bearer token.

---

## Smithery

[Smithery](https://smithery.ai) is a hosted gateway and registry for MCP servers. Once Clone's MCP HTTP endpoint is live at `https://<your-domain>/mcp`, you can publish the URL to Smithery and end users can install it from any MCP-aware client (Claude Code, Cursor, Claude Desktop, etc.) without touching their local config.

## Publish

```bash
npx -y @smithery/cli mcp publish "https://<your-domain>/mcp" -n <namespace>/clone
```

Smithery then scans the endpoint, generates a server page, and exposes the install flow. Each end user provides their own `CLONE_API_TOKEN` in Smithery's config UI, and that token is forwarded as `Authorization: Bearer …` on every upstream request — so one Clone MCP instance serves many users without holding a shared key.

## Per-user token forwarding

The Clone MCP server in HTTP mode (`MCP_TRANSPORT=http`) is multi-tenant by design. `apps/mcp/src/http.ts` reads the bearer off each request's `Authorization` header (or the configured `CLONE_API_TOKEN` env var as a fallback for single-tenant deployments) and forwards it upstream.
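Per the Authentication section, clients pick the credential header by token shape: `clone_`-prefixed values travel in `X-Clone-API-Key`, anything else is sent as a bearer. A minimal sketch of that rule (the `auth_header` helper is ours, not a Clone API):

```python
def auth_header(token: str) -> dict[str, str]:
    """Pick the credential header by token shape, per the Authentication
    section: API keys are `clone_`-prefixed; anything else is treated as
    a JWT access token and sent as a bearer."""
    if token.startswith("clone_"):
        return {"X-Clone-API-Key": token}
    return {"Authorization": f"Bearer {token}"}
```

This is the same shape-detection the MCP server applies when choosing how to forward a credential upstream, which is why either token shape works in Smithery's config UI.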
nginx's `/mcp` location explicitly forwards both `Authorization` and `mcp-session-id`:

```nginx
proxy_set_header Authorization $http_authorization;
proxy_set_header mcp-session-id $http_mcp_session_id;
```

## Static MCP card

`apps/web/public/.well-known/mcp/server-card.json` is a static MCP server card that Smithery can scan without authentication, so the listing renders even when the upstream API is briefly down. Update it whenever you change the server's name, description, or tool list.

## Verifying the publish

```bash
curl -sS https://<your-domain>/mcp/health                # → 200 OK from the MCP server
curl -sS https://smithery.ai/api/mcp/<namespace>/clone   # check Smithery has indexed it
```

If Smithery's listing shows tools but install fails for end users, the most common cause is that `proxy_buffering` is on for `/mcp` — Streamable HTTP requires it disabled for the long-poll connection.

---

## Warm-start (transcripts + GitHub)

Warm-start lets a brand-new Clone user import existing memory on day one from two places:

1. **Local agent transcripts** — JSONL files written by Claude Code, Codex CLI, or Gemini CLI on the user's machine. The web app reads them in-browser via a folder picker, parses them locally, and POSTs only the per-turn rows.
2. **GitHub repos** — fetched server-side via OAuth (preferred) or a one-shot PAT/username (fallback).

This page covers what you need to set up the GitHub side, since transcripts require zero server-side configuration.

## 1. GitHub OAuth App registration

You need **one OAuth App per environment** (e.g. local dev, production). They have different callback URLs and can't be shared.

**Per environment:**

1. Visit [github.com/settings/applications/new](https://github.com/settings/applications/new) while signed in as the account that should own the app (org-owned apps are fine — set the org in the dropdown).
2. Fill in:
   - **Application name** — e.g. `Clone`. End users see this on the GitHub authorize page.
   - **Homepage URL** — the public web URL of your deployment (e.g. `http://localhost:5173` in dev, `https://<your-domain>` in production).
   - **Authorization callback URL** — `<homepage URL>/api/warmstart/github/oauth/callback/`. The trailing slash is required, and the host should be same-origin with the web app (nginx path-routes `/api/` to the Django container) so the redirect isn't cross-origin. In dev, with Vite proxying `/api`, point this directly at Django (e.g. `http://localhost:8001/api/warmstart/github/oauth/callback/`).
3. Click **Register application**. On the next page:
   - Copy the **Client ID** (visible immediately).
   - Click **Generate a new client secret** → copy it. **Only shown once.**

## 2. Server environment variables

Set the following on the API container in each environment:

```bash
GITHUB_OAUTH_CLIENT_ID=<client ID>
GITHUB_OAUTH_CLIENT_SECRET=<client secret>
GITHUB_TOKEN_KEY=<32-byte base64 Fernet key — see below>
CLONE_WEB_URL=<public web URL>   # e.g. https://clone.is
```

The OAuth callback redirects the browser back to `${CLONE_WEB_URL}/console?warmstart=github-connected` on success, so this value must point at the user-facing web app, not the API.

### Generating `GITHUB_TOKEN_KEY`

It's a Fernet key — the URL-safe base64 encoding of 32 random bytes (a 44-character string ending in `=`). Generate one with:

```bash
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
```

**Treat it like a password.** If the key is lost, every stored OAuth token in the DB becomes undecryptable, and every user with a connection has to reconnect. Store it in your secret manager (AWS Secrets Manager, Vault, Doppler, etc.), not in `.env` files in the repo.

## 3. Token-key rotation

`GITHUB_TOKEN_KEY` accepts a comma-separated list of keys, oldest last. The first (newest) key is used for all new encrypts; subsequent keys are tried in order during decrypt. This makes rotation non-destructive — you can roll a fresh key without invalidating any stored tokens.

**Procedure:**

1. Generate a new Fernet key.
2. Update the env var to put the new key first:

   ```
   GITHUB_TOKEN_KEY=<new key>,<old key>
   ```

3. Restart the API container. New encrypts (any user reconnect, OAuth callback, etc.) now use the new key. Existing rows still decrypt using the old key.
4. Optionally, run a one-time backfill to re-encrypt every stored token with the new primary:

   ```python
   # Django shell
   from warmstart.services import rotate_user_token
   from django.contrib.auth import get_user_model

   for user in get_user_model().objects.iterator():
       rotate_user_token(user)
   ```

5. Once every row has been re-encrypted (verify by spot-checking that `GithubConnection.updated_at` is recent on every row), drop the old key from the env var:

   ```
   GITHUB_TOKEN_KEY=<new key>
   ```

6. Restart again.

If you skip step 4 and just remove the old key, any user whose token was last written with the old key will be unable to sync until they reconnect. The web surfaces this as a "Reconnect required" banner (see §5).

## 4. Reconnect lifecycle (when GitHub revokes a token)

If a user revokes the OAuth grant on github.com, or GitHub itself invalidates the token (scope changes, account suspension, etc.), the next sync returns 401. The server reacts by:

1. Catching the 401 in `services.run_github_ingest` (which raises `GithubAuthError`).
2. Flipping `GithubConnection.needs_reauth = True` for that user.
3. Surfacing this on the `/api/warmstart/sources/` endpoint as `oauth.needs_reauth: true`.

The web shows a yellow "Reconnect required" banner in the GitHub Repos panel, with a Connect-with-GitHub button. A successful re-OAuth clears the flag (via `services.store_github_token`). No manual action is required — this is fully self-healing, but worth knowing about so you can interpret the banner if a user reports it.

## 5. Troubleshooting

| Symptom (in `/console?warmstart=github-error&msg=...` or runserver log) | Cause | Fix |
| --- | --- | --- |
| `exchange_failed: GitHub OAuth token exchange 404: Not Found` | Callback URL on the OAuth App registration doesn't match what the server sent. | Edit the OAuth App on github.com to match the exact callback URL (trailing slash matters). |
| `exchange_failed: GitHub OAuth response had no access_token: bad_verification_code` | Auth code was already consumed or expired (>10 min). | User clicks Connect again — GitHub issues a fresh code. |
| `exchange_failed: incorrect_client_credentials` | `GITHUB_OAUTH_CLIENT_SECRET` doesn't match the one on github.com. | Generate a new client secret on github.com, update the env var, restart. |
| `token_store_failed: GITHUB_TOKEN_KEY is not configured.` | Env var missing or empty. | Set it (see §2), restart. |
| `token_store_failed: Stored GitHub token cannot be decrypted with any key in GITHUB_TOKEN_KEY.` | The Fernet key was changed without keeping the old one in the rotation list. | Add the previous key back to `GITHUB_TOKEN_KEY` (oldest last), restart, re-run the rotation procedure (§3). |
| User's panel shows the "Reconnect required" banner | GitHub returned 401 to a sync — token revoked / scope changed. | User clicks Connect with GitHub; the flag clears automatically. No operator action needed. |
| State validation failures (`state_invalid` / `state_expired`) | Browser took >10 minutes between the Connect click and GitHub redirecting back, or the state cookie was tampered with. | User clicks Connect again. If persistent, check that `SECRET_KEY` (Django) is consistent across API replicas — state is signed with it. |
| User reports stale repos | Auto-sync runs once per panel-open, and only if the last sync is >1h old; otherwise a manual sync is needed. | Direct them to the Sync button at the top-right of the panel. |

## 6. Data model reference

Three tables manage warm-start state. None of them needs direct manipulation in normal operation — every state transition flows through the API.

- `warmstart_warmstartsourcestate` — per-(user, source_id) toggle and `last_synced_at`. One row per source per user.
- `warmstart_githubconnection` — one-to-one with `User`.
Stores the encrypted access token, granted scope, GitHub login, and the `needs_reauth` flag.
- `warmstart_warmstarttombstone` — per-item "don't re-ingest" markers. Keyed by `(user, source, source_detail, key)`, where `key` is a `::`-joined identifier for transcripts or `<owner>/<repo>` for repos. Created on per-item ✕; cleared on family-level wipe or full-source disconnect.

Raw memory rows themselves live in `memories_rawmemory` (the bottom of the 4-layer hierarchy) — warm-start writes there directly through the `TranscriptRawMemory` and `GithubRawMemory` proxy models.
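The encrypt-with-newest / decrypt-with-any key-ring behavior described in §3 can be sketched as follows. This is an illustration only: the XOR keystream is a stdlib stand-in for Fernet so the sketch runs anywhere, and none of these helpers exist in Clone's codebase. In real code, `cryptography.fernet.MultiFernet` provides exactly this encrypt-with-first, decrypt-try-each behavior.

```python
import base64
import hashlib
import hmac
from itertools import cycle


def _xor(data: bytes, key: bytes) -> bytes:
    # Toy keystream — a stand-in for Fernet, NOT real crypto.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def parse_key_ring(env_value: str) -> list[str]:
    """GITHUB_TOKEN_KEY is a comma-separated list, newest key first."""
    return [k.strip() for k in env_value.split(",") if k.strip()]


def encrypt(ring: list[str], token: str) -> str:
    """New encrypts always use the first (newest) key. An HMAC tag lets
    decrypt() later tell which key a ciphertext belongs to, the way a
    Fernet token's HMAC does."""
    key = ring[0].encode()
    body = _xor(token.encode(), key)
    tag = hmac.new(key, body, hashlib.sha256).digest()[:8]
    return base64.urlsafe_b64encode(tag + body).decode()


def decrypt(ring: list[str], blob: str) -> str:
    """Try every key in the ring, newest first; fail only if none verifies.
    This try-each-key step is what makes rotation non-destructive."""
    raw = base64.urlsafe_b64decode(blob.encode())
    tag, body = raw[:8], raw[8:]
    for candidate in ring:
        key = candidate.encode()
        if hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest()[:8]):
            return _xor(body, key).decode()
    raise ValueError("token cannot be decrypted with any key in the ring")


# Rotation: a blob written under the old key still decrypts after the
# env var is updated to put a fresh key first.
old_ring = parse_key_ring("old-key")
blob = encrypt(old_ring, "gho_example_token")
new_ring = parse_key_ring("new-key,old-key")
assert decrypt(new_ring, blob) == "gho_example_token"
```

Dropping `old-key` from the ring before re-encrypting (skipping step 4 of the procedure) makes `decrypt` raise for that blob, which is exactly the `token_store_failed: ... cannot be decrypted with any key` row in the troubleshooting table.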