Predictions

/api/predictions/* is Clone's prediction surface. The headline call (POST /predict/) takes an agent prompt, assembles a Memory context bundle in-process, calls Anthropic with the prediction system prompt, and returns top-K candidates with calibrated confidence and an auto/escalated decision.

Endpoints

Method	Path	Purpose
POST	`/api/predictions/predict/`	Predict the user's reply for a single agent prompt.
POST	`/api/predictions/predict/batch/`	Predict for up to 100 prompts in one call (shares one Memory bundle).
POST	`/api/predictions/<uuid>/feedback/`	Record the actual reply + final status (`accepted` / `edited` / `rejected`).
GET	`/api/predictions/<uuid>/`	Fetch one prediction with full payload (candidates, usage, context snapshot).
GET	`/api/predictions/`	List recent predictions. Filter by `status`, `agent`, `session_id`, `since`, `until`. Paginate with `limit` (default 50, max 200) + `offset`.
GET	`/api/predictions/stats/`	Aggregate `automation_rate` + `precision`.

Predict — request

curl -sS -X POST https://api.clone.is/api/predictions/predict/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "Claude Code",
    "agent_input": "Test finished. What next?",
    "k": 3,
    "threshold": 0.8,
    "recency_minutes": 60,
    "session_id": "demo-1"
  }'

Field	Type	Required	Default	Range
`agent`	string	yes	—	—
`agent_input`	string	yes	—	—
`k`	integer	no	`1`	1–`MAX_K`
`threshold`	number	no	`0.8`	0–1
`model`	string	no	`claude-sonnet-4-6`	One of `SUPPORTED_MODELS`
`recency_minutes`	integer	no	`60`	1–10080
`session_id`	string	no	—	—

Predict — response

{
  "id": "<uuid>",
  "predicted_response": "...",
  "confidence": 0.42,
  "reasoning": "...",
  "candidates": [ {"response": "...", "confidence": 0.42, "reasoning": "..."}, ... ],
  "k": 3,
  "status": "escalated",
  "threshold": 0.8,
  "model": "claude-sonnet-4-6",
  "latency_ms": 4827,
  "usage": { "input_tokens": 1234, "output_tokens": 56 }
}

status is server-decided from candidates[0].confidence ≥ threshold (auto) or otherwise (escalated). On auto, the server also sets acted_at so the prediction is treated as having been honored.

Feedback

curl -sS -X POST https://api.clone.is/api/predictions/<id>/feedback/ \
  -H "X-Clone-API-Key: $CLONE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status":"edited","actual_response":"…what the user actually typed…"}'

status must be one of accepted, edited, rejected.

accepted: user used Clone's prediction as-is. actual_response defaults to the prediction.
edited: user typed something different. actual_response is required.
rejected: user discarded without replacement. actual_response optional.

Stats

{
  "total":            1234,
  "by_status":        { "pending": 0, "auto": 700, "escalated": 200, "accepted": 100, "edited": 50, "rejected": 184 },
  "precision":        0.84,        // (auto + accepted) / answered
  "automation_rate":  0.84         // answered / total
}

Optional ?since= and ?until= ISO-8601 query params restrict the window — useful for week-over-week dashboards. precision and automation_rate are null when their denominator is zero rather than dividing by zero.

Endpoints​

Predict — request​

Predict — response​

Feedback​

Stats​

Endpoints

Predict — request

Predict — response

Feedback

Stats