Warm-start (transcripts + GitHub)

Warm-start lets a brand-new Clone user import existing memory on day one from two places:

  1. Local agent transcripts: JSONL files written by Claude Code, Codex CLI, or Gemini CLI on the user's machine. The web reads them in-browser via a folder picker, parses them locally, and POSTs only the per-turn rows.
  2. GitHub repos: fetched server-side via OAuth (preferred) or a one-shot PAT/username (fallback).

This page covers what you need to set up the GitHub side, since transcripts have zero server-side configuration.

1. GitHub OAuth App registration

You need one OAuth App per environment (e.g. local dev, production). They have different callback URLs and can't be shared.

Per environment:

  1. Visit github.com/settings/applications/new while signed in as the account that should own the app (org-owned apps are fine — set the org in the dropdown).

  2. Fill in:

    • Application name — e.g. Clone. End-users see this on the GitHub authorize page.
    • Homepage URL — the public web URL of your deployment (e.g. http://localhost:5173 in dev, https://<your-domain> in production).
    • Authorization callback URL: <WEB_HOST>/api/warmstart/github/oauth/callback/. The trailing slash is required, and the host should be same-origin with the web app (nginx path-routes /api/ to the Django container) so the redirect doesn't cross origins. In dev with Vite proxying /api, point this directly at Django (e.g. http://localhost:8001/api/warmstart/github/oauth/callback/).
  3. Click Register application. On the next page:

    • Copy the Client ID (visible immediately).
    • Click Generate a new client secret → copy it. Only shown once.
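
As a sanity check on the callback URL rule above, here is a minimal sketch that derives the value to paste into the OAuth App form from a host URL. The helper name github_callback_url is hypothetical and not part of Clone; only the callback path is taken from this page.

```python
from urllib.parse import urlsplit

# Callback path as documented on this page, including the required trailing slash.
CALLBACK_PATH = "/api/warmstart/github/oauth/callback/"


def github_callback_url(web_host: str) -> str:
    """Build the Authorization callback URL for a host (hypothetical helper)."""
    parts = urlsplit(web_host)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {web_host!r}")
    # Ignore any path on the host and append the fixed callback path.
    return f"{parts.scheme}://{parts.netloc}{CALLBACK_PATH}"


print(github_callback_url("http://localhost:8001"))
print(github_callback_url("https://clone.is/"))
```

Comparing this string character for character with the value registered on github.com is a quick way to rule out a missing trailing slash.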

2. Server environment variables

Set the following on the API container in each environment:

GITHUB_OAUTH_CLIENT_ID=<client id from step 1>
GITHUB_OAUTH_CLIENT_SECRET=<client secret from step 1>
GITHUB_TOKEN_KEY=<32-byte base64 Fernet key — see below>
CLONE_WEB_URL=<public web URL> # e.g. https://clone.is

The OAuth callback redirects the browser back to ${CLONE_WEB_URL}/console?warmstart=github-connected on success, so this value must point at the user-facing web app, not the API.
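
A small pre-flight check can catch a missing variable before the first OAuth attempt. The sketch below is illustrative only: missing_vars and success_redirect are hypothetical helpers, and stripping a trailing slash from CLONE_WEB_URL is an assumption, not documented server behaviour.

```python
import os
from typing import Mapping

# The four variables listed in this section.
REQUIRED_VARS = (
    "GITHUB_OAUTH_CLIENT_ID",
    "GITHUB_OAUTH_CLIENT_SECRET",
    "GITHUB_TOKEN_KEY",
    "CLONE_WEB_URL",
)


def missing_vars(env: Mapping[str, str]) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def success_redirect(env: Mapping[str, str]) -> str:
    """URL the browser lands on after a successful OAuth callback."""
    # Assumption: a trailing slash on CLONE_WEB_URL is tolerated and removed.
    return env["CLONE_WEB_URL"].rstrip("/") + "/console?warmstart=github-connected"


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("redirects to:", success_redirect(os.environ))
```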

Generating GITHUB_TOKEN_KEY

It's a Fernet key: 32 random bytes encoded as URL-safe base64, which gives a 44-character string ending in =. Generate one with:

python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

Treat it like a password. If the key is lost, every stored OAuth token in the DB becomes undecryptable and every user with a connection has to reconnect. Store it in your secret manager (AWS Secrets Manager, Vault, Doppler, etc.), not in .env files in the repo.
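
To confirm that a value pulled from the secret manager has the right shape before deploying it, a standard-library check along these lines may help. It is a sketch: looks_like_fernet_key is a hypothetical helper, and it validates only the encoding and length, not whether the key matches any stored token.

```python
import base64
import binascii
import os


def looks_like_fernet_key(value: str) -> bool:
    """True if value is URL-safe base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except (binascii.Error, UnicodeEncodeError, ValueError):
        return False
    return len(raw) == 32


# Stand-in for Fernet.generate_key(): 32 random bytes, URL-safe base64-encoded.
sample = base64.urlsafe_b64encode(os.urandom(32)).decode()
print(len(sample), sample.endswith("="), looks_like_fernet_key(sample))
```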

3. Token-key rotation

GITHUB_TOKEN_KEY accepts a comma-separated list of keys, oldest last. The first (newest) key is used for all new encrypts; subsequent keys are tried in order during decrypt. This means rotation is non-destructive — you can roll a fresh key without invalidating any stored tokens.
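
The "first key encrypts, every key decrypts" behaviour matches what the cryptography library's MultiFernet provides. Whether Clone uses MultiFernet internally is not stated here, so treat the following as a sketch of the idea rather than the server's code; parse_token_keys and build_cipher are hypothetical names.

```python
def parse_token_keys(raw: str) -> list[str]:
    """Split a GITHUB_TOKEN_KEY value into keys, newest first, ignoring blanks."""
    keys = [part.strip() for part in raw.split(",") if part.strip()]
    if not keys:
        raise ValueError("GITHUB_TOKEN_KEY is not configured.")
    return keys


def build_cipher(raw: str):
    """Return a MultiFernet that encrypts with the first key and decrypts with any."""
    # Imported lazily so parse_token_keys stays usable without the dependency.
    from cryptography.fernet import Fernet, MultiFernet

    return MultiFernet([Fernet(key) for key in parse_token_keys(raw)])
```

With a cipher built from "<new key>,<old key>", decrypt() accepts tokens written under either key, and rotate() re-encrypts a token under the first key, which is the operation a backfill needs.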

Procedure:

  1. Generate a new Fernet key.
  2. Update the env var to put the new key first:
    GITHUB_TOKEN_KEY=<new key>,<old key>
  3. Restart the API container. New encrypts (any user reconnect, OAuth callback, etc.) now use the new key. Existing rows still decrypt using the old key.
  4. Optionally, run a one-time backfill to re-encrypt every stored token with the new primary:
    # Django shell
    from django.contrib.auth import get_user_model
    from warmstart.services import rotate_user_token

    # Re-encrypt each user's stored token with the current primary key.
    for user in get_user_model().objects.iterator():
        rotate_user_token(user)
  5. Once every row has been re-encrypted (verify that GithubConnection.updated_at is recent on every row), drop the old key from the env var:
    GITHUB_TOKEN_KEY=<new key>
  6. Restart again.

If you skip step 4 and just remove the old key, any user whose token was last written with the old key will be unable to sync until they reconnect. The web surfaces this as a "Reconnect required" banner (see §4).

4. Reconnect lifecycle (when GitHub revokes a token)

If a user revokes the OAuth grant on github.com, or GitHub itself invalidates the token (scope changes, account suspension, etc.), the next sync returns 401. The server reacts by:

  1. Catching the 401 in services.run_github_ingest (raises GithubAuthError).
  2. Flipping GithubConnection.needs_reauth = True for that user.
  3. Surfacing the flag through the /api/warmstart/sources/ endpoint as oauth.needs_reauth: true.

The web shows a yellow "Reconnect required" banner in the GitHub Repos panel, with a Connect-with-GitHub button. A successful re-OAuth clears the flag (via services.store_github_token).
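
The lifecycle above can be sketched with plain Python stand-ins. Nothing here is the real warmstart code: Connection, run_sync, and store_token are hypothetical simplifications of GithubConnection, services.run_github_ingest, and services.store_github_token.

```python
from dataclasses import dataclass


class GithubAuthError(Exception):
    """Stand-in for the error raised when GitHub answers 401."""


@dataclass
class Connection:
    """Stand-in for a GithubConnection row."""
    login: str
    needs_reauth: bool = False


def run_sync(conn: Connection, fetch) -> bool:
    """Run one sync; flag the connection if GitHub rejects the token."""
    try:
        fetch(conn)
    except GithubAuthError:
        conn.needs_reauth = True  # what the web reads as oauth.needs_reauth
        return False
    return True


def store_token(conn: Connection) -> None:
    """A successful re-OAuth stores a fresh token and clears the flag."""
    conn.needs_reauth = False


def revoked(conn: Connection) -> None:
    """Simulate GitHub rejecting a revoked token."""
    raise GithubAuthError("401 from GitHub")


conn = Connection(login="octocat")
run_sync(conn, revoked)
print(conn.needs_reauth)  # flag set after the failed sync
store_token(conn)
print(conn.needs_reauth)  # flag cleared after reconnecting
```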

No manual action is required; this is fully self-healing. It is still worth knowing about so you can interpret the banner if a user reports it.

5. Troubleshooting

| Symptom (in /console?warmstart=github-error&msg=... or runserver log) | Cause | Fix |
| --- | --- | --- |
| exchange_failed: GitHub OAuth token exchange 404: Not Found | Callback URL on the OAuth App registration doesn't match what the server sent. | Edit the OAuth App on github.com to match the exact callback URL (trailing slash matters). |
| exchange_failed: GitHub OAuth response had no access_token: bad_verification_code | Auth code was already consumed or expired (>10 min). | User clicks Connect again; GitHub issues a fresh code. |
| exchange_failed: incorrect_client_credentials | GITHUB_OAUTH_CLIENT_SECRET doesn't match the one on github.com. | Generate a new client secret on github.com, update env, restart. |
| token_store_failed: GITHUB_TOKEN_KEY is not configured. | Env var missing or empty. | Set it (see §2), restart. |
| token_store_failed: Stored GitHub token cannot be decrypted with any key in GITHUB_TOKEN_KEY. | The Fernet key was changed without keeping the old one in the rotation list. | Add the previous key back to GITHUB_TOKEN_KEY (oldest last), restart, re-run rotation procedure (§3). |
| User's panel shows "Reconnect required" banner | GitHub returned 401 to a sync: token revoked or scope changed. | User clicks Connect with GitHub; flag clears automatically. No operator action needed. |
| State validation failures (state_invalid / state_expired) | Browser took >10 minutes between Connect click and GitHub redirecting back, or the state cookie was tampered with. | User clicks Connect again. If persistent, check that SECRET_KEY (Django) is consistent across API replicas, since state is signed with it. |
| User reports stale repos | Auto-sync runs once per panel-open if the last sync is >1h old. Otherwise they hit the manual Sync button. | Direct them to the Sync button at the top-right of the panel. |
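
The state_invalid / state_expired row is easier to reason about with a concrete model of signed state. The sketch below uses only the standard library and is an assumption about the general technique, not the server's implementation (which may well rely on Django's signing utilities); sign_state and check_state are hypothetical names. It shows why two replicas with different SECRET_KEY values reject each other's state.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_AGE_SECONDS = 600  # the 10-minute window from the table above


def _mac(payload: str, secret: str) -> str:
    return hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()


def sign_state(nonce: str, secret: str, now: Optional[float] = None) -> str:
    """Return 'nonce.issued_at.mac', signed with the given secret."""
    issued = int(now if now is not None else time.time())
    payload = f"{nonce}.{issued}"
    return f"{payload}.{_mac(payload, secret)}"


def check_state(state: str, secret: str, now: Optional[float] = None) -> str:
    """Return 'ok', 'state_invalid', or 'state_expired'."""
    try:
        nonce, issued, mac = state.rsplit(".", 2)
        issued_at = int(issued)
    except ValueError:
        return "state_invalid"
    expected = _mac(f"{nonce}.{issued}", secret)
    # Constant-time comparison; a different secret yields a different MAC.
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return "state_invalid"
    current = now if now is not None else time.time()
    if current - issued_at > MAX_AGE_SECONDS:
        return "state_expired"
    return "ok"


state = sign_state("abc123", "replica-a-secret", now=1_000)
print(check_state(state, "replica-a-secret", now=1_100))
print(check_state(state, "replica-b-secret", now=1_100))
print(check_state(state, "replica-a-secret", now=1_700))
```

In this model a callback that lands on a replica with a different secret fails every time, which is the persistent failure pattern the table's fix column points at.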

6. Data model reference

Three tables manage warm-start state. None of them need direct manipulation in normal operation — every state transition flows through the API.

  • warmstart_warmstartsourcestate — per-(user, source_id) toggle and last_synced_at. One row per source per user.
  • warmstart_githubconnection — one-to-one with User. Stores the encrypted access token, granted scope, GitHub login, needs_reauth flag.
  • warmstart_warmstarttombstone — per-item "don't re-ingest" markers. Keyed by (user, source, source_detail, key) where key is <project>::<session> for transcripts or <owner>/<name> for repos. Created on per-item ✕; cleared on family-level wipe or full-source disconnect.
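
The tombstone key formats described above can be illustrated as follows. This is a sketch, not the model definition: the Tombstone tuple mirrors the documented (user, source, source_detail, key) columns, while the example values for source and source_detail ("transcripts", "claude-code", "github", "oauth") are assumptions made for illustration.

```python
from typing import NamedTuple


class Tombstone(NamedTuple):
    """Stand-in for one warmstart_warmstarttombstone row."""
    user_id: int
    source: str
    source_detail: str
    key: str


def transcript_key(project: str, session: str) -> str:
    """Key format for transcripts: <project>::<session>."""
    return f"{project}::{session}"


def repo_key(owner: str, name: str) -> str:
    """Key format for repos: <owner>/<name>."""
    return f"{owner}/{name}"


def should_ingest(candidate: Tombstone, tombstones: set[Tombstone]) -> bool:
    """An item is skipped when a matching tombstone exists."""
    return candidate not in tombstones


# Hypothetical source / source_detail values, for illustration only.
removed = {Tombstone(1, "github", "oauth", repo_key("octocat", "hello-world"))}
again = Tombstone(1, "github", "oauth", repo_key("octocat", "hello-world"))
fresh = Tombstone(1, "transcripts", "claude-code", transcript_key("clone", "session-42"))
print(should_ingest(again, removed), should_ingest(fresh, removed))
```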

Raw memory rows themselves live in memories_rawmemory (the bottom of the 4-layer hierarchy) — warm-start writes there directly through the TranscriptRawMemory and GithubRawMemory proxy models.