Skip to main content

Predictions

/api/predictions/* is Clone's prediction surface. The headline call (POST /predict/) takes an agent prompt, assembles a Memory context bundle in-process, calls Anthropic with the prediction system prompt, and returns top-K candidates with calibrated confidence and an auto/escalated decision.

Endpoints

MethodPathPurpose
POST/api/predictions/predict/Predict the user's reply for a single agent prompt.
POST/api/predictions/predict/batch/Predict for up to 100 prompts in one call (shares one Memory bundle).
POST/api/predictions/<uuid>/feedback/Record the actual reply + final status (accepted / edited / rejected).
GET/api/predictions/<uuid>/Fetch one prediction with full payload (candidates, usage, context snapshot).
GET/api/predictions/List recent predictions. Filter by status, agent, session_id, since, until. Paginate with limit (default 50, max 200) + offset.
GET/api/predictions/stats/Aggregate automation_rate + precision.

Predict — request

curl -sS -X POST https://api.clone.is/api/predictions/predict/ \
-H "X-Clone-API-Key: $CLONE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "Claude Code",
"agent_input": "Test finished. What next?",
"k": 3,
"threshold": 0.8,
"recency_minutes": 60,
"session_id": "demo-1"
}'
FieldTypeRequiredDefaultRange
agentstringyes
agent_inputstringyes
kintegerno11–MAX_K
thresholdnumberno0.80–1
modelstringnoclaude-sonnet-4-6One of SUPPORTED_MODELS
recency_minutesintegerno601–10080
session_idstringno

Predict — response

{
"id": "<uuid>",
"predicted_response": "...",
"confidence": 0.42,
"reasoning": "...",
"candidates": [ {"response": "...", "confidence": 0.42, "reasoning": "..."}, ... ],
"k": 3,
"status": "escalated",
"threshold": 0.8,
"model": "claude-sonnet-4-6",
"latency_ms": 4827,
"usage": { "input_tokens": 1234, "output_tokens": 56 }
}

status is server-decided from candidates[0].confidence ≥ threshold (auto) or otherwise (escalated). On auto, the server also sets acted_at so the prediction is treated as having been honored.

Feedback

curl -sS -X POST https://api.clone.is/api/predictions/<id>/feedback/ \
-H "X-Clone-API-Key: $CLONE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"status":"edited","actual_response":"…what the user actually typed…"}'

status must be one of accepted, edited, rejected.

  • accepted: user used Clone's prediction as-is. actual_response defaults to the prediction.
  • edited: user typed something different. actual_response is required.
  • rejected: user discarded without replacement. actual_response optional.

Stats

{
"total": 1234,
"by_status": { "pending": 0, "auto": 700, "escalated": 200, "accepted": 100, "edited": 50, "rejected": 184 },
"precision": 0.84, // (auto + accepted) / answered
"automation_rate": 0.84 // answered / total
}

Optional ?since= and ?until= ISO-8601 query params restrict the window — useful for week-over-week dashboards. precision and automation_rate are null when their denominator is zero rather than dividing by zero.