Content Tower (Tier 1: frozen embeddings + calibrated head) — build plan#
Self-contained build doc. You can clear the conversation and hand this to a fresh agent. It captures the plan and the facts discovered while exploring the live data, so nothing here depends on chat history.
0. What this is#
The PRD fuses two independent signals through a monotone gate (not an average):
- Tower A, identity trust (per-DID, sybil-resistant, load-bearing): EigenTrust over the vouch graph. Already built (`eigentrust.py`).
- Tower B, content risk (per-PR, identity-blind): how risky this diff is, judged with no knowledge of the author. Today this is only Claude (`review.py`) at the gate, plus an advisory slop-kNN. This doc builds the learned Tower B.
Tier 1 = run each diff through the already-wired embedding transformer
(Featherless Qwen3-Embedding) and train a small calibrated head on clean_merge,
using only the diff. Transformer representation power, no fine-tuning, leakage-free by
construction (the model never sees identity).
Tier 2 (fine-tuning a code transformer) is deferred until there are ~10³–10⁴ labeled diffs — below that it loses to frozen-embeddings + a linear head. Not in scope here.
Non-negotiable constraints (PRD) — must hold for every phase#
- Content models judge content, never identity: no author handle/DID/history/aggregates feed Tower B. Diff + PR-intrinsic stats (size, files, discussion length) only.
- Structural signal (Tower A / EigenTrust) stays load-bearing and sybil-resistant.
- The gate is not an average: content can only penalize, never lift an untrusted DID into the fast lane.
- Calibrated + explainable. Serve a learned model only if it beats its baseline on a proper holdout (same rule the GNN already follows).
- Serviceless: single DuckDB file, large artifacts under `DATA_ROOT`.
1. Critical path#
Phase 0: fetch diffs (patchBlobs) ─┐
├─► Phase 2 embed ─► Phase 3 head ─► Phase 4 fuse ─► Phase 5 eval-gate
Phase 1: merged labels ────────────┘
Phase 0 + Phase 1 are prerequisites for ANY content model (transformer or not) and for Claude review and slop-kNN — all three are dead without diffs. Start at Phase 0.
2. Live data facts (as observed)#
- Live DB: `/Volumes/spectrofi-rec/tangled-data/duckdb/trust.duckdb` (`DATA_ROOT=/Volumes/spectrofi-rec/tangled-data`). The repo-local `.data/…` is a stale dev DB; ignore it.
- Backfill is rich but the derived/label layer is stale (it ran before some `derive()` branches existed). Snapshot:
  - events 83,991 · contributors 10,848 · vouches 2,029 (+) / 37 (−) · pulls 5,768
  - `seeds` = 0, `pull_status` = 0 (collection was not in the old backfill, not even archived), `stars` = 0 (14,409 `feed.star` events archived but not re-derived), `diff_text` = 0 (patchBlobs never fetched), 0 positive `clean_merge` labels, no trained model.
- Read the live DB read-only with retry (single-writer; a held lock blocks every open):

      import time
      import duckdb

      DB_PATH = "/Volumes/spectrofi-rec/tangled-data/duckdb/trust.duckdb"
      con = None
      for _ in range(80):  # up to 80 attempts, 0.3 s apart
          try:
              con = duckdb.connect(DB_PATH, read_only=True)
              break
          except duckdb.IOException:  # lock is held by the single writer
              time.sleep(0.3)

- Pause `ingest`/`api`/`backfill` before writing, or writes crawl on the lock.
Record shapes you'll need (confirmed from the network)#
sh.tangled.repo.pull (the diff is a gzipped blob, NOT inline):
{ "rounds": [ { "createdAt": "...",
"patchBlob": { "$type": "blob", "ref": { "$link": "<CID>" },
"mimeType": "application/gzip", "size": 49502 } } ],
"source": { "branch": "..." },
"target": { "branch": "...", "repo": "did:plc:…", "repoDid": "did:plc:…" } }
- `pr_id` convention (set in `ingest.derive`): `f"{author_did}/{collection}/{rkey}"`, e.g. `did:plc:X/sh.tangled.repo.pull/3mp…`.
- The latest round (`rounds[-1]`) is the final proposed change; embed/review that.
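A minimal sketch of these two conventions, assuming the archived record has already been parsed into a `dict`. The helper names `make_pr_id` and `latest_patch_cid` are hypothetical and are not functions in the repo.

```python
from typing import Optional


def make_pr_id(author_did: str, collection: str, rkey: str) -> str:
    """Build a pr_id following the author_did/collection/rkey convention."""
    return f"{author_did}/{collection}/{rkey}"


def latest_patch_cid(record: dict) -> Optional[str]:
    """Return the blob CID of the latest round, or None if no round carries one."""
    rounds = record.get("rounds") or []
    if not rounds:
        return None
    blob = rounds[-1].get("patchBlob") or {}
    return (blob.get("ref") or {}).get("$link")
```

Returning `None` instead of raising lets a bulk job skip pulls that have no usable round without aborting.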
sh.tangled.repo.pull.status (authoritative outcome, public; sparse):
{ "pull": "at://did:plc:X/sh.tangled.repo.pull/<rkey>",
"status": "sh.tangled.repo.pull.status.merged" } // .merged / .closed / .open
- Status author may differ from the pull owner: parse `pr_id` from the `pull` field (`uri[len("at://"):]`), never from the status record's own did/rkey. (`derive()` already does this.)
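The parsing rule can be sketched as follows. This is an illustration only: `pr_id_from_status` and `status_kind` are hypothetical names, and the actual logic lives in `derive()`.

```python
AT_PREFIX = "at://"


def pr_id_from_status(status_record: dict) -> str:
    """Derive pr_id from the status record's `pull` AT-URI, ignoring the status author."""
    uri = status_record["pull"]
    if not uri.startswith(AT_PREFIX):
        raise ValueError(f"not an at:// URI: {uri!r}")
    return uri[len(AT_PREFIX):]


def status_kind(status_record: dict) -> str:
    """Return the last dotted segment of the status value: merged, closed, or open."""
    return status_record["status"].rsplit(".", 1)[-1]
```

Because the `pr_id` comes from the `pull` field, a status written by a maintainer still attaches to the contributor's pull.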
Knot git clone URL (for the Phase-1 label backstop, git-on-knots): https://{knot}/{owner_did}/{repo}
(https, no auth for public repos; git ls-remote returns refs/heads/main).
Existing code to build on#
- `src/trust/embed.py`: `index_diffs(con, limit=256)` already embeds every `pull_requests.diff_text` into `diff_vectors(pr_id, label, embedding DOUBLE[])`, idempotent/resumable; `embed()` returns `None` without `FEATHERLESS_API_KEY`; `slop_score()` is cosine-kNN vs `clean_merge=0`.
- `src/trust/backfill.py`: reuse `_pds(did)`, `_get(url)`, `_records(pds,did,coll)`, the `ThreadPoolExecutor` fan-out pattern, `_archive_and_derive`.
- `src/trust/db.py`: `pull_requests.diff_text`, `pull_status`, `diff_vectors`, `pr_labels` view (`clean_merge`), `connection(read_only=…)`, `ensure_schema()`.
- `src/trust/ingest.py`: `derive()` (pull / pull_status / star branches).
- `src/trust/learned.py`: copy its shape: `FEATURE_COLS`, `_vec`, `train(split)`, `LearnedScorer`, isotonic calibration, `_reliability`, `MODEL_PATH = MODEL_DIR/…`.
- `src/trust/fusion.py`: `score_pr`, `decide`, `should_review`, `_features_for`.
- `src/trust/config.py`: `CFG.embed` (Featherless), `CFG.review`, `MODEL_DIR`.
3. Phases#
Phase 0 — Fetch the diffs (new src/trust/diffs.py)#
The highest-leverage unblock: lights up the content head, Claude review, and slop-kNN.
Steps:
- Select pulls needing a diff: `SELECT pr_id, author_did, record(from events) FROM pull_requests WHERE diff_text IS NULL`. The CID lives in the archived `events.record` JSON (`rounds[-1].patchBlob.ref.$link`); join `events` on `(did, collection, rkey)` or re-read it.
- For each: resolve `_pds(author_did)`, then `GET {pds}/xrpc/com.atproto.sync.getBlob?did={author_did}&cid={cid}` → bytes.
- `gzip.decompress(bytes).decode("utf-8", "replace")` → unified-diff text. Cap stored length (~50 KB; embeddings/Claude truncate anyway).
- `UPDATE pull_requests SET diff_text=? WHERE pr_id=?`.
- Parallelize like `backfill`: network fetch in a 12-thread pool, DB writes in chunks (single writer). Skip missing/oversized blobs gracefully (never abort the run).
Deliverable: pull_requests.diff_text populated for ~5,768 PRs (minutes of network).
Self-check: a demo() that fetches one known blob and asserts it gunzips to text containing diff/@@.
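The fetch-and-decode steps of this phase might look like the sketch below. It makes several assumptions: the helper names are hypothetical, the cap is applied in characters as an approximation of ~50 KB, and `fetch` is an injected callable returning raw bytes (a wrapper around the repo's network helper would go there). DB writes are left out.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional, Tuple
from urllib.parse import urlencode

MAX_DIFF_CHARS = 50_000  # assumed cap, approximating the ~50 KB limit


def blob_url(pds: str, did: str, cid: str) -> str:
    """URL for com.atproto.sync.getBlob on the author's PDS."""
    query = urlencode({"did": did, "cid": cid})
    return f"{pds.rstrip('/')}/xrpc/com.atproto.sync.getBlob?{query}"


def decode_patch(blob: bytes, limit: int = MAX_DIFF_CHARS) -> str:
    """Gunzip a patch blob into unified-diff text, truncated to `limit` characters."""
    return gzip.decompress(blob).decode("utf-8", "replace")[:limit]


def fetch_all(
    jobs: Iterable[Tuple[str, str]],
    fetch: Callable[[str], bytes],
    workers: int = 12,
) -> Dict[str, Optional[str]]:
    """Fetch and decode (pr_id, url) jobs in a thread pool; failed jobs map to None."""

    def one(job: Tuple[str, str]) -> Tuple[str, Optional[str]]:
        pr_id, url = job
        try:
            return pr_id, decode_patch(fetch(url))
        except Exception:  # missing or corrupt blob: skip it, never abort the run
            return pr_id, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(one, jobs))
```

Keeping the network call behind `fetch` lets the self-check run against a locally gzipped sample, and the caller can write the returned texts to DuckDB in chunks from a single thread.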
Phase 1 — Merged labels (you need a positive class)#
- Targeted scrape of `sh.tangled.repo.pull.status` (already mapped in `COLLECTION_KINDS` and handled in `derive()`): `python -m trust.backfill --collection sh.tangled.repo.pull.status` (capped first with `--max-repos`).
- Measure positives before building the head: `SELECT clean_merge, count(*) FROM pr_labels GROUP BY 1`.
- Risk: pull.status is sparse. If positives are only tens, the head is data-starved too. Backstop = git-on-knots `merged` detection (clone default branch via the knot URL above, check whether each pull's patch landed) for broad `merged` coverage + `reverted`/re-patched. Only build the backstop if pull.status coverage proves insufficient.
Deliverable: pr_labels.clean_merge with a real positive class (need ≥ a few hundred ideally;
the trainer requires ≥4 rows spanning both classes as a hard floor).
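One way to turn the label counts into a go/no-go decision is sketched below. The function name and the three outcome strings are hypothetical, and the 200-positive threshold is only an assumed reading of "a few hundred".

```python
from collections import Counter
from typing import Iterable, Optional

HARD_FLOOR_ROWS = 4          # from the plan: at least 4 rows spanning both classes
COMFORTABLE_POSITIVES = 200  # assumption: one reading of "a few hundred"


def label_readiness(labels: Iterable[Optional[int]]) -> str:
    """Classify clean_merge labels as 'blocked', 'starved', or 'ok'.

    NULL labels (None) are ignored, mirroring the non-NULL filter used for training.
    """
    counts = Counter(v for v in labels if v is not None)
    total = sum(counts.values())
    if total < HARD_FLOOR_ROWS or counts[0] == 0 or counts[1] == 0:
        return "blocked"   # below the trainer's hard floor
    if counts[1] < COMFORTABLE_POSITIVES:
        return "starved"   # trainable, but consider the git-on-knots backstop
    return "ok"
```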
Phase 2 — Embed the diffs (frozen transformer)#
- Set `FEATHERLESS_API_KEY`. Run `index_diffs` until it has caught up (loop while it returns > 0):

      from trust.db import connection, ensure_schema
      from trust import embed

      ensure_schema()
      with connection(read_only=False) as con:
          # keep calling until index_diffs reports no more work
          while embed.index_diffs(con, limit=256):
              pass

- Optional GPU: self-host Qwen3-Embedding-4B (fits one GPU) to embed ~6k diffs locally for free instead of the API. The head itself is CPU-trivial.
Deliverable: diff_vectors filled for every PR with a diff.
Phase 3 — The calibrated head (new src/trust/content.py, Tower B)#
- `_xy(con)`: `X` = `diff_vectors.embedding` for PRs that have a non-NULL `clean_merge` (join `pr_labels`); `y` = `clean_merge`. Optionally concat PR-intrinsic scalars (additions, deletions, files_touched, discussion_len) and the slop-kNN similarity. Never identity/author features.
- Model: L2-normalize the embedding → logistic regression (linear probe, L2-reg) → isotonic or Platt calibration. Linear probe is correct for frozen embeddings at low data; LightGBM on raw 2560-dim embeddings overfits, so keep it only as an alt.
- Time-split train/val (order by `opened_at`). Save `content.pkl` under `MODEL_DIR`.
- `ContentScorer.prob(pr_id) -> P(content safe)`; expose `content_risk = 1 - P`.
- Self-check `demo()`: on held-out PRs, a known-bad diff scores higher risk than a clean one; print the reliability curve.
Deliverable: a calibrated content risk for every PR (cheap, no API), not just reviewed ones.
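A dependency-free sketch of the linear probe, assuming rows arrive as `(opened_at, embedding, clean_merge)` tuples. The names `time_split`, `fit_probe`, and `content_risk` are hypothetical, the calibration step is left out for brevity, and a library implementation (for example scikit-learn's logistic regression followed by isotonic regression) would normally replace the hand-rolled gradient descent.

```python
import math
from typing import List, Sequence, Tuple

Row = Tuple[float, Sequence[float], int]  # (opened_at, embedding, clean_merge)


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def time_split(rows: Sequence[Row], val_frac: float = 0.25):
    """Oldest rows train, newest rows validate (ordered by opened_at)."""
    ordered = sorted(rows, key=lambda r: r[0])
    cut = max(1, int(len(ordered) * (1 - val_frac)))
    return ordered[:cut], ordered[cut:]


def fit_probe(X, y, l2: float = 1e-3, lr: float = 1.0, epochs: int = 300):
    """L2-regularized logistic regression by batch gradient descent; returns (w, b)."""
    Xn = [l2_normalize(x) for x in X]
    n, dim = len(Xn), len(Xn[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, t in zip(Xn, y):
            err = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for j, xj in enumerate(x):
                gw[j] += err * xj
            gb += err
        w = [wi - lr * (g / n + l2 * wi) for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def content_risk(w, b, embedding) -> float:
    """content_risk = 1 - P(content safe), before any calibration."""
    x = l2_normalize(embedding)
    return 1.0 - _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

Only the embedding enters `fit_probe`, so the probe is identity-blind by construction; PR-intrinsic scalars could be appended to each vector before fitting.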
Phase 4 — Fuse into the gate (monotone, unchanged)#
- In `fusion.score_pr`: the head supplies `content_risk` for all PRs; Claude (`review_pr`, gated by `should_review`) refines ambiguous/sensitive ones. Combine conservatively: `content_risk = max(model_risk, claude_risk)` so content still only penalizes.
- Win: every PR gets a content signal; today only the Claude-reviewed subset does.
- Keep `decide()` and its thresholds; surface the head's risk in the explanation (`build_reason`) like the other factors.
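The monotone property can be illustrated with a toy gate. This is not the repo's `decide()`: the function names, the boolean `identity_trusted` input, the lane labels, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Optional


def fuse_content_risk(model_risk: float, claude_risk: Optional[float] = None) -> float:
    """Combine conservatively: the riskier of the two content signals wins."""
    if claude_risk is None:  # PR was not selected for Claude review
        return model_risk
    return max(model_risk, claude_risk)


def gate(identity_trusted: bool, content_risk: float, risk_threshold: float = 0.5) -> str:
    """Toy monotone gate: content can hold back a trusted author, never promote an untrusted one."""
    if not identity_trusted:
        return "review"  # low content risk cannot lift an untrusted DID
    return "review" if content_risk >= risk_threshold else "fast_lane"
```

Taking the `max` means a low score from one signal cannot cancel a high score from the other, and an untrusted DID stays out of the fast lane however clean the diff looks.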
Phase 5 — Eval + beat-the-baseline gate#
- Calibration: reliability curve (reuse `learned._reliability`). Ranking: AUC / average precision.
- Serve only if it beats: (a) majority-class, (b) Claude-alone risk where available, (c) slop-kNN alone, on a time-split AND a repo-holdout (generalize to unseen repos).
- Write a verdict (like `gnn` does); `fusion` consults it before using the head.
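A small, self-contained sketch of the evaluation pieces. These are stand-ins written for illustration, not the repo's `learned._reliability` or the `gnn` verdict format; the function names and the dict layout passed to `verdict` are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def reliability(probs: Sequence[float], labels: Sequence[int],
                bins: int = 5) -> List[Tuple[float, float, int]]:
    """Per non-empty bin: (mean predicted prob, observed positive rate, count)."""
    buckets: List[list] = [[] for _ in range(bins)]
    for p, y in zip(probs, labels):
        buckets[min(int(p * bins), bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in buckets if b
    ]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def verdict(head: Dict[str, float], baselines: Dict[str, Dict[str, float]]) -> bool:
    """Serve only if the head beats every baseline on every holdout split."""
    return all(
        head[split] > scores[split]
        for scores in baselines.values()
        for split in head
    )
```

A constant majority-class score ties every pair and therefore lands at an AUC of 0.5, which makes it the floor the head has to clear on both the time split and the repo holdout.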
4. Effort & runtime#
| Phase | Build | Runtime |
|---|---|---|
| 0 diffs (`diffs.py`) | ~1 hr | few min (network) |
| 1 labels (scrape) | wired | ~10 min capped |
| 2 embed (`index_diffs`) | done | few min (API) |
| 3 head (`content.py`) | ~1 hr | seconds |
| 4 fuse (`fusion.py`) | ~30 min | n/a |
| 5 eval-gate | ~30 min | seconds |
≈ half a day of build + minutes of runtime, given FEATHERLESS_API_KEY and enough Phase-1 positives.
5. GPU guidance#
- Tier 1 needs no GPU — embedding runs on Featherless (remote); the head is CPU-trivial.
- Use a GPU now only to self-host Qwen3-Embedding-4B for free bulk embedding of ~6k diffs (skip API cost/limits).
- Save the GPU for Tier 2 (fine-tuning CodeBERT/StarEncoder) — deferred until ~10³–10⁴ labeled diffs exist.
6. Definition of done#
- `diffs.py` populates `diff_text`; `pr_labels` has a positive class; `diff_vectors` filled.
- `content.py` trains a calibrated head, identity-blind, with a reliability curve.
- It beats majority / Claude-alone / slop-kNN on a time + repo holdout, else it doesn't serve.
- `fusion` consumes it monotonically (content only penalizes); explanation shows the content factor.
- Smoke test added (mirror `tests/test_smoke.py` style: `importorskip` the embedding path; assert a bad diff out-risks a clean one).
7. Parallel unblock (not this tower, but the other gating item)#
Structural scoring is still blocked by seeds = 0 + stale derives. Independent of Tower B:
- `--rederive` from archived `events` (no network) → repopulates `stars` (and any archived collections) through the current `derive()`.
- Seed real maintainer DIDs. Top vouch-receivers are the anchors: `did:plc:onu3oqfahfubgbetlr4giknc` (141 in), `did:plc:wshs7t2adsemcrrd4snkeqli` (89), `did:plc:qfpnj4og54vl56wngdriaxug` (56)… → `INSERT INTO seeds …`.
- `trust-train` once labels (Phase 1) exist.

EigenTrust (Tower A) and the content head (Tower B) can be built in either order; the gate needs both.
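Ranking seed candidates by inbound vouches could be sketched as below. The `(from_did, to_did, sign)` tuple shape is an assumption about how vouch edges are exported, counting only positive vouches is a choice made for the example, and `top_vouch_receivers` is a hypothetical helper. The output is a list of candidates for manual review, not an automatic seed set.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_vouch_receivers(
    vouches: Iterable[Tuple[str, str, int]], k: int = 3
) -> List[Tuple[str, int]]:
    """Return the k DIDs with the most positive inbound vouches, highest first."""
    indegree = Counter(to_did for _, to_did, sign in vouches if sign > 0)
    return indegree.most_common(k)
```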