Warm-start (transcripts + GitHub)
Warm-start lets a brand-new Clone user import existing memory on day one from two places:
- Local agent transcripts: JSONL files written by Claude Code, Codex CLI, and Gemini CLI on the user's machine. The web app reads them in-browser via a folder picker, parses them locally, and POSTs only the per-turn rows.
- GitHub repos: fetched server-side via OAuth (preferred) or a one-shot PAT/username (fallback).
This page covers what you need to set up the GitHub side, since transcripts have zero server-side configuration.
1. GitHub OAuth App registration
You need one OAuth App per environment (e.g. local dev, production). They have different callback URLs and can't be shared.
Per environment:
- Visit github.com/settings/applications/new while signed in as the account that should own the app (org-owned apps are fine; set the org in the dropdown).
- Fill in:
  - Application name: e.g. Clone. End-users see this on the GitHub authorize page.
  - Homepage URL: the public web URL of your deployment (e.g. http://localhost:5173 in dev, https://<your-domain> in production).
  - Authorization callback URL: <WEB_HOST>/api/warmstart/github/oauth/callback/. The trailing slash is required, and the host should be same-origin with the web app (nginx path-routes /api/ to the Django container) so the redirect doesn't cross origins. In dev with Vite proxying /api, point this directly at Django (e.g. http://localhost:8001/api/warmstart/github/oauth/callback/).
- Click Register application. On the next page:
  - Copy the Client ID (visible immediately).
  - Click Generate a new client secret → copy it. Only shown once.
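As a sanity check on the callback URL rules above, here is a minimal sketch that builds the callback URL from a host and then the GitHub authorize URL a Connect button might redirect to. The helper names `build_callback_url` and `build_authorize_url` are hypothetical and not taken from the Clone codebase; only the callback path and GitHub's standard authorize endpoint come from known sources.

```python
from urllib.parse import urlencode, urlsplit

# Callback path from the registration steps above (note the trailing slash).
CALLBACK_PATH = "/api/warmstart/github/oauth/callback/"
# GitHub's standard OAuth authorize endpoint.
GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_callback_url(web_host: str) -> str:
    """Join <WEB_HOST> and the callback path, keeping the trailing slash."""
    parts = urlsplit(web_host)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"web_host must be an absolute http(s) URL, got {web_host!r}")
    return f"{parts.scheme}://{parts.netloc}{CALLBACK_PATH}"


def build_authorize_url(client_id: str, web_host: str, state: str) -> str:
    """Hypothetical helper: the URL the browser is sent to on Connect."""
    query = urlencode(
        {
            "client_id": client_id,
            "redirect_uri": build_callback_url(web_host),
            "state": state,
        }
    )
    return f"{GITHUB_AUTHORIZE_URL}?{query}"
```

Comparing the output of `build_callback_url` character by character with the value registered on github.com is a quick way to rule out a trailing-slash mismatch.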
2. Server environment variables
Set the following on the API container in each environment:
GITHUB_OAUTH_CLIENT_ID=<client id from step 1>
GITHUB_OAUTH_CLIENT_SECRET=<client secret from step 1>
GITHUB_TOKEN_KEY=<32-byte base64 Fernet key — see below>
CLONE_WEB_URL=<public web URL> # e.g. https://clone.is
The OAuth callback redirects the browser back to ${CLONE_WEB_URL}/console?warmstart=github-connected on success, so this value must point at the user-facing web app, not the API.
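A small preflight check can catch a missing variable before the first OAuth attempt. The sketch below is illustrative only: `missing_vars` and `success_redirect` are hypothetical helpers, and stripping a trailing slash from CLONE_WEB_URL is an assumption about how the server joins the URL.

```python
import os

REQUIRED_VARS = (
    "GITHUB_OAUTH_CLIENT_ID",
    "GITHUB_OAUTH_CLIENT_SECRET",
    "GITHUB_TOKEN_KEY",
    "CLONE_WEB_URL",
)


def missing_vars(env=os.environ) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def success_redirect(clone_web_url: str) -> str:
    """Build the post-OAuth landing URL described above.

    Assumption: a trailing slash on CLONE_WEB_URL is tolerated and removed.
    """
    return f"{clone_web_url.rstrip('/')}/console?warmstart=github-connected"
```

If `success_redirect` would yield an API host rather than the web app, CLONE_WEB_URL is pointing at the wrong service.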
Generating GITHUB_TOKEN_KEY
It's a Fernet key: 32 random bytes encoded as URL-safe base64, which gives a 44-character string ending in =. Generate one with:
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
Treat it like a password. If the key is lost, every stored OAuth token in the DB becomes undecryptable and every user with a connection has to reconnect. Store it in your secret manager (AWS Secrets Manager, Vault, Doppler, etc.), not in .env files in the repo.
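Before deploying a key pulled from a secret manager, it can help to confirm it survived copy-paste intact. The following is a stdlib-only shape check (the name `looks_like_fernet_key` is hypothetical). It verifies length and encoding only, and does not prove the key matches the one used to encrypt existing tokens.

```python
import base64
import binascii


def looks_like_fernet_key(candidate: str) -> bool:
    """True if candidate is 44 chars of URL-safe base64 decoding to 32 bytes."""
    if len(candidate) != 44:
        return False
    try:
        raw = base64.urlsafe_b64decode(candidate.encode("ascii"))
    except (binascii.Error, UnicodeEncodeError, ValueError):
        # Bad padding, non-ASCII characters, or otherwise undecodable input.
        return False
    return len(raw) == 32
```

A value that fails this check was most likely truncated or wrapped in stray quotes or whitespace on its way into the environment.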
3. Token-key rotation
GITHUB_TOKEN_KEY accepts a comma-separated list of keys, oldest last. The first (newest) key is used for all new encrypts; subsequent keys are tried in order during decrypt. This means rotation is non-destructive — you can roll a fresh key without invalidating any stored tokens.
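The ordering rule can be pictured with a short parsing sketch. `parse_token_keys` is a hypothetical name, and trimming whitespace around each key is an assumption rather than documented server behaviour.

```python
def parse_token_keys(raw: str) -> tuple[str, list[str]]:
    """Split a comma-separated GITHUB_TOKEN_KEY value into (primary, fallbacks).

    The first key encrypts new tokens; the remaining keys are only tried,
    in order, when decrypting.
    """
    keys = [part.strip() for part in raw.split(",") if part.strip()]
    if not keys:
        raise ValueError("GITHUB_TOKEN_KEY is not configured.")
    return keys[0], keys[1:]


primary, fallbacks = parse_token_keys("<new key>,<old key>")
# primary is "<new key>"; fallbacks is ["<old key>"]
```

The `cryptography` package's `MultiFernet` has the same semantics (it encrypts with the first key and tries each key in turn on decrypt), so the server may wrap the parsed keys in one. That is a guess, not something this page states.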
Procedure:
1. Generate a new Fernet key.
2. Update the env var to put the new key first:
GITHUB_TOKEN_KEY=<new key>,<old key>
3. Restart the API container. New encrypts (any user reconnect, OAuth callback, etc.) now use the new key. Existing rows still decrypt using the old key.
4. Optionally, run a one-time backfill to re-encrypt every stored token with the new primary:
# Django shell
from django.contrib.auth import get_user_model
from warmstart.services import rotate_user_token

for user in get_user_model().objects.iterator():
    rotate_user_token(user)
5. Once every row has been re-encrypted (verify by spot-checking that GithubConnection.updated_at is recent on every row), drop the old key from the env var:
GITHUB_TOKEN_KEY=<new key>
6. Restart again.
If you skip step 4 and just remove the old key, any user whose token was last written with the old key will be unable to sync until they reconnect. The web surfaces this as a "Reconnect required" banner (see §5).
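Spot-checking updated_at by eye gets tedious on a large table. Here is a minimal sketch of the same check as a function; it assumes that the backfill bumps GithubConnection.updated_at (which the verification advice above implies) and that you noted the time the rotation started. The name `rows_not_yet_rotated` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def rows_not_yet_rotated(rows, rotated_at):
    """Return user ids whose updated_at predates the key rotation.

    rows: iterable of (user_id, updated_at) pairs with timezone-aware datetimes.
    rotated_at: when the new primary key went live.
    """
    return [user_id for user_id, updated_at in rows if updated_at < rotated_at]


# In a Django shell the pairs could come from something like (field names assumed):
#   rows = GithubConnection.objects.values_list("user_id", "updated_at")
rotated_at = datetime(2024, 6, 1, tzinfo=timezone.utc)
sample = [
    (1, rotated_at - timedelta(days=3)),     # last written with the old key
    (2, rotated_at + timedelta(minutes=5)),  # re-encrypted by the backfill
]
stale = rows_not_yet_rotated(sample, rotated_at)
```

An empty result is the signal that the old key can be dropped. Any ids returned belong to users who would otherwise see the reconnect banner.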
4. Reconnect lifecycle (when GitHub revokes a token)
If a user revokes the OAuth grant on github.com, or GitHub itself invalidates the token (scope changes, account suspension, etc.), the next sync returns 401. The server reacts by:
- Catching the 401 in services.run_github_ingest (raises GithubAuthError).
- Flipping GithubConnection.needs_reauth = True for that user.
- The /api/warmstart/sources/ endpoint surfaces this as oauth.needs_reauth: true.
The web shows a yellow "Reconnect required" banner in the GitHub Repos panel, with a Connect-with-GitHub button. A successful re-OAuth clears the flag (via services.store_github_token).
No manual action required — this is fully self-healing. Worth knowing about so you can interpret the banner if a user reports it.
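The lifecycle above can be modelled as a tiny state machine. The sketch below is a toy simulation, not the real services module: `Connection`, `run_ingest`, `store_token`, `sources_payload`, and `fake_fetch` are stand-ins, and where exactly GithubAuthError is raised in the real code is an assumption.

```python
from dataclasses import dataclass


class GithubAuthError(Exception):
    """Stand-in for the error associated with a 401 from GitHub."""


@dataclass
class Connection:
    """Toy stand-in for a GithubConnection row."""
    token: str
    needs_reauth: bool = False


def run_ingest(conn: Connection, fetch) -> bool:
    """Run one sync; on a 401, flag the connection instead of crashing."""
    try:
        fetch(conn.token)
    except GithubAuthError:
        conn.needs_reauth = True
        return False
    return True


def store_token(conn: Connection, new_token: str) -> None:
    """A successful re-OAuth stores the new token and clears the flag."""
    conn.token = new_token
    conn.needs_reauth = False


def sources_payload(conn: Connection) -> dict:
    """The flag in the shape the sources endpoint is described as exposing."""
    return {"oauth": {"needs_reauth": conn.needs_reauth}}


def fake_fetch(token: str) -> None:
    """Demo fetcher: only the token "good" is accepted."""
    if token != "good":
        raise GithubAuthError("401")
```

Running `run_ingest` with a revoked token sets the flag, and a later `store_token` call clears it, which mirrors the banner appearing and disappearing without operator involvement.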
5. Troubleshooting
| Symptom (in /console?warmstart=github-error&msg=... or runserver log) | Cause | Fix |
|---|---|---|
| exchange_failed: GitHub OAuth token exchange 404: Not Found | Callback URL on the OAuth App registration doesn't match what the server sent. | Edit the OAuth App on github.com to match the exact callback URL (trailing slash matters). |
| exchange_failed: GitHub OAuth response had no access_token: bad_verification_code | Auth code was already consumed or expired (>10 min). | User clicks Connect again; GitHub issues a fresh code. |
| exchange_failed: incorrect_client_credentials | GITHUB_OAUTH_CLIENT_SECRET doesn't match the one on github.com. | Generate a new client secret on github.com, update env, restart. |
| token_store_failed: GITHUB_TOKEN_KEY is not configured. | Env var missing or empty. | Set it (see §2), restart. |
| token_store_failed: Stored GitHub token cannot be decrypted with any key in GITHUB_TOKEN_KEY. | The Fernet key was changed without keeping the old one in the rotation list. | Add the previous key back to GITHUB_TOKEN_KEY (oldest last), restart, re-run rotation procedure (§3). |
| User's panel shows "Reconnect required" banner | GitHub returned 401 to a sync: token revoked / scope changed. | User clicks Connect with GitHub; flag clears automatically. No operator action needed. |
| State validation failures (state_invalid / state_expired) | Browser took >10 minutes between Connect click and GitHub redirecting back, or the state cookie was tampered with. | User clicks Connect again. If persistent, check that SECRET_KEY (Django) is consistent across API replicas, since state is signed with it. |
| User reports stale repos | Auto-sync runs once per panel-open if the last sync is >1h old. Otherwise they hit the manual Sync button. | Direct them to the Sync button at the top-right of the panel. |
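The auto-sync rule in the last row reduces to a simple staleness test. The following sketch uses the hypothetical name `should_auto_sync`; treating a never-synced source as stale is an assumption, since the table only covers sources that already have a last sync time.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

AUTO_SYNC_AFTER = timedelta(hours=1)  # the ">1h" threshold from the table


def should_auto_sync(last_synced_at: Optional[datetime], now: datetime) -> bool:
    """True when opening the panel should trigger a sync."""
    if last_synced_at is None:
        # Assumption: a source that has never synced counts as stale.
        return True
    return now - last_synced_at > AUTO_SYNC_AFTER


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
opened_after_two_hours = should_auto_sync(now - timedelta(hours=2), now)
opened_after_thirty_minutes = should_auto_sync(now - timedelta(minutes=30), now)
```

A user who reopens the panel within the hour gets no automatic refresh under this rule, which is why the manual Sync button is the right pointer for "stale repos" reports.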
6. Data model reference
Three tables manage warm-start state. None of them need direct manipulation in normal operation — every state transition flows through the API.
- warmstart_warmstartsourcestate: per-(user, source_id) toggle and last_synced_at. One row per source per user.
- warmstart_githubconnection: one-to-one with User. Stores the encrypted access token, granted scope, GitHub login, needs_reauth flag.
- warmstart_warmstarttombstone: per-item "don't re-ingest" markers. Keyed by (user, source, source_detail, key) where key is <project>::<session> for transcripts or <owner>/<name> for repos. Created on per-item ✕; cleared on family-level wipe or full-source disconnect.
Raw memory rows themselves live in memories_rawmemory (the bottom of the 4-layer hierarchy) — warm-start writes there directly through the TranscriptRawMemory and GithubRawMemory proxy models.
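To make the tombstone key formats concrete, here is a minimal sketch of how keys could be built and checked before re-ingesting an item. The function names and the example source and source_detail values ("github", "repos") are hypothetical; only the two key formats and the four-part lookup tuple come from the table description above.

```python
def transcript_key(project: str, session: str) -> str:
    """Tombstone key for a transcript item: <project>::<session>."""
    return f"{project}::{session}"


def repo_key(owner: str, name: str) -> str:
    """Tombstone key for a repo item: <owner>/<name>."""
    return f"{owner}/{name}"


def is_tombstoned(tombstones: set, user_id: int, source: str,
                  source_detail: str, key: str) -> bool:
    """True if the item carries a "don't re-ingest" marker for this user."""
    return (user_id, source, source_detail, key) in tombstones


# Example: user 42 removed one repo with the per-item ✕.
tombstones = {(42, "github", "repos", repo_key("octocat", "hello-world"))}
skip = is_tombstoned(tombstones, 42, "github", "repos", "octocat/hello-world")
```

An ingest loop would consult a check like this per item and skip anything marked, which is what keeps a deleted item from reappearing on the next sync.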