Predictions
/api/predictions/* is Clone's prediction surface. The headline call (POST /predict/) takes an agent prompt, assembles a Memory context bundle in-process, calls Anthropic with the prediction system prompt, and returns top-K candidates with calibrated confidence and an auto/escalated decision.
Endpoints
| Method | Path | Purpose |
|---|---|---|
| POST | /api/predictions/predict/ | Predict the user's reply for a single agent prompt. |
| POST | /api/predictions/predict/batch/ | Predict for up to 100 prompts in one call (shares one Memory bundle). |
| POST | /api/predictions/<uuid>/feedback/ | Record the actual reply + final status (accepted / edited / rejected). |
| GET | /api/predictions/<uuid>/ | Fetch one prediction with full payload (candidates, usage, context snapshot). |
| GET | /api/predictions/ | List recent predictions. Filter by status, agent, session_id, since, until. Paginate with limit (default 50, max 200) + offset. |
| GET | /api/predictions/stats/ | Aggregate automation_rate + precision. |
Predict — request
curl -sS -X POST https://api.clone.is/api/predictions/predict/ \
-H "X-Clone-API-Key: $CLONE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "Claude Code",
"agent_input": "Test finished. What next?",
"k": 3,
"threshold": 0.8,
"recency_minutes": 60,
"session_id": "demo-1"
}'
| Field | Type | Required | Default | Range |
|---|---|---|---|---|
agent | string | yes | — | — |
agent_input | string | yes | — | — |
k | integer | no | 1 | 1–MAX_K |
threshold | number | no | 0.8 | 0–1 |
model | string | no | claude-sonnet-4-6 | One of SUPPORTED_MODELS |
recency_minutes | integer | no | 60 | 1–10080 |
session_id | string | no | — | — |
Predict — response
{
"id": "<uuid>",
"predicted_response": "...",
"confidence": 0.42,
"reasoning": "...",
"candidates": [ {"response": "...", "confidence": 0.42, "reasoning": "..."}, ... ],
"k": 3,
"status": "escalated",
"threshold": 0.8,
"model": "claude-sonnet-4-6",
"latency_ms": 4827,
"usage": { "input_tokens": 1234, "output_tokens": 56 }
}
status is server-decided from candidates[0].confidence ≥ threshold (auto) or otherwise (escalated). On auto, the server also sets acted_at so the prediction is treated as having been honored.
Feedback
curl -sS -X POST https://api.clone.is/api/predictions/<id>/feedback/ \
-H "X-Clone-API-Key: $CLONE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"status":"edited","actual_response":"…what the user actually typed…"}'
status must be one of accepted, edited, rejected.
accepted: user used Clone's prediction as-is.actual_responsedefaults to the prediction.edited: user typed something different.actual_responseis required.rejected: user discarded without replacement.actual_responseoptional.
Stats
{
"total": 1234,
"by_status": { "pending": 0, "auto": 700, "escalated": 200, "accepted": 100, "edited": 50, "rejected": 184 },
"precision": 0.84, // (auto + accepted) / answered
"automation_rate": 0.84 // answered / total
}
Optional ?since= and ?until= ISO-8601 query params restrict the window — useful for week-over-week dashboards. precision and automation_rate are null when their denominator is zero rather than dividing by zero.