Sunstead trust scoring project

# Content Tower (Tier 1: frozen embeddings + calibrated head): build plan

Self-contained build doc. You can clear the conversation and hand this to a fresh
agent. It captures the plan **and** the facts discovered while exploring the live data,
so nothing here depends on chat history.

---

## 0. What this is

The PRD fuses two independent signals through a **monotone gate** (not an average):

- **Tower A: identity trust** (per-DID, sybil-resistant, **load-bearing**): EigenTrust
  over the vouch graph. Already built (`eigentrust.py`).
- **Tower B: content risk** (per-PR, **identity-blind**): how risky *this diff* is,
  judged with no knowledge of the author. Today this is only Claude (`review.py`) at the
  gate, plus an advisory slop-kNN. **This doc builds the learned Tower B.**

Tier 1 = run each diff through the **already-wired embedding transformer**
(Featherless Qwen3-Embedding) and train a small **calibrated head** on `clean_merge`,
using only the diff. Transformer representation power, no fine-tuning, leakage-free by
construction (the model never sees identity).

Tier 2 (fine-tuning a code transformer) is **deferred** until there are ~10³–10⁴ labeled
diffs; below that it loses to frozen-embeddings + a linear head. Not in scope here.

### Non-negotiable constraints (PRD): must hold for every phase
- Content models judge **content, never identity**: no author handle/DID/history/aggregates
  feed Tower B. Diff + PR-intrinsic stats (size, files, discussion length) only.
- Structural signal (Tower A / EigenTrust) stays load-bearing and sybil-resistant.
- The gate is **not an average**: content can only **penalize**, never lift an untrusted DID
  into the fast lane.
- Calibrated + explainable. Serve a learned model only if it **beats its baseline** on a
  proper holdout (same rule the GNN already follows).
- Serviceless: single DuckDB file, large artifacts under `DATA_ROOT`.

---

## 1. Critical path

```
Phase 0: fetch diffs (patchBlobs) ─┐
                                   ├─► Phase 2 embed ─► Phase 3 head ─► Phase 4 fuse ─► Phase 5 eval-gate
Phase 1: merged labels ────────────┘
```

**Phase 0 + Phase 1 are prerequisites for ANY content model** (transformer or not) and for
Claude review and slop-kNN: all three are dead without diffs. Start at Phase 0.

---

## 2. Live data facts (as observed)

- **Live DB**: `/Volumes/spectrofi-rec/tangled-data/duckdb/trust.duckdb`
  (`DATA_ROOT=/Volumes/spectrofi-rec/tangled-data`). The repo-local `.data/…` is a stale dev DB; ignore it.
- Backfill is rich but the derived/label layer is **stale** (it ran before some `derive()`
  branches existed). Snapshot:
  - events 83,991 · contributors 10,848 · **vouches 2,029 (+) / 37 (−)** · pulls 5,768
  - `seeds` = **0**, `pull_status` = **0** (collection was not in the old backfill, not even archived),
    `stars` = 0 (14,409 `feed.star` events archived but not re-derived), `diff_text` = **0**
    (patchBlobs never fetched), **0 positive `clean_merge` labels**, no trained model.
- **Read the live DB read-only with retry** (single-writer; a held lock blocks every open):
  ```python
  import time
  import duckdb

  con = None
  for _ in range(80):  # retry while another process holds the lock
      try:
          con = duckdb.connect("/Volumes/spectrofi-rec/tangled-data/duckdb/trust.duckdb", read_only=True)
          break
      except duckdb.IOException:
          time.sleep(0.3)
  ```
  Pause `ingest`/`api`/`backfill` before writing, or writes crawl on the lock.

### Record shapes you'll need (confirmed from the network)

`sh.tangled.repo.pull` (the diff is a gzipped blob, NOT inline):
```json
{ "rounds": [ { "createdAt": "...",
                "patchBlob": { "$type": "blob", "ref": { "$link": "<CID>" },
                               "mimeType": "application/gzip", "size": 49502 } } ],
  "source": { "branch": "..." },
  "target": { "branch": "...", "repo": "did:plc:…", "repoDid": "did:plc:…" } }
```
- `pr_id` convention (set in `ingest.derive`): `f"{author_did}/{collection}/{rkey}"`,
  e.g. `did:plc:X/sh.tangled.repo.pull/3mp…`.
- The **latest round** (`rounds[-1]`) is the final proposed change; embed/review that.

`sh.tangled.repo.pull.status` (authoritative outcome, public; sparse):
```json
{ "pull": "at://did:plc:X/sh.tangled.repo.pull/<rkey>",
  "status": "sh.tangled.repo.pull.status.merged" }   // .merged / .closed / .open
```
- Status author may differ from the pull owner: parse `pr_id` from the `pull` field
  (`uri[len("at://"):]`), never from the status record's own did/rkey. (`derive()` already does this.)

Knot git clone URL (for the Phase-1 label backstop, git-on-knots): `https://{knot}/{owner_did}/{repo}`
(https, no auth for public repos; `git ls-remote` returns `refs/heads/main`).

### Existing code to build on
- `src/trust/embed.py`: `index_diffs(con, limit=256)` already embeds every `pull_requests.diff_text`
  into `diff_vectors(pr_id, label, embedding DOUBLE[])`, idempotent/resumable; `embed()` returns
  `None` without `FEATHERLESS_API_KEY`; `slop_score()` cosine-kNN vs `clean_merge=0`.
- `src/trust/backfill.py`: reuse `_pds(did)`, `_get(url)`, `_records(pds,did,coll)`, the
  `ThreadPoolExecutor` fan-out pattern, `_archive_and_derive`.
- `src/trust/db.py`: `pull_requests.diff_text`, `pull_status`, `diff_vectors`, `pr_labels`
  view (`clean_merge`), `connection(read_only=…)`, `ensure_schema()`.
- `src/trust/ingest.py`: `derive()` (pull / pull_status / star branches).
- `src/trust/learned.py`: copy its shape: `FEATURE_COLS`, `_vec`, `train(split)`,
  `LearnedScorer`, isotonic calibration, `_reliability`, `MODEL_PATH = MODEL_DIR/…`.
- `src/trust/fusion.py`: `score_pr`, `decide`, `should_review`, `_features_for`.
- `src/trust/config.py`: `CFG.embed` (Featherless), `CFG.review`, `MODEL_DIR`.
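
The two record-shape rules in section 2 (use the latest round's `patchBlob`, and take `pr_id` from the status record's `pull` URI) can be sketched as small pure functions. This is a minimal illustration, not the code in `ingest.derive()`; the helper names `latest_patch_cid` and `pr_id_from_status` and the sample records are hypothetical.

```python
from typing import Optional

AT_PREFIX = "at://"


def latest_patch_cid(pull_record: dict) -> Optional[str]:
    """Return the CID of the newest round's patchBlob, or None if absent."""
    rounds = pull_record.get("rounds") or []
    if not rounds:
        return None
    blob = rounds[-1].get("patchBlob") or {}
    return (blob.get("ref") or {}).get("$link")


def pr_id_from_status(status_record: dict) -> Optional[str]:
    """Derive pr_id from the `pull` AT-URI, never from the status author."""
    uri = status_record.get("pull", "")
    if not uri.startswith(AT_PREFIX):
        return None
    return uri[len(AT_PREFIX):]


# Hypothetical sample records shaped like the JSON shown in section 2.
pull = {"rounds": [
    {"patchBlob": {"ref": {"$link": "cid-old"}}},
    {"patchBlob": {"ref": {"$link": "cid-new"}}},
]}
status = {"pull": "at://did:plc:X/sh.tangled.repo.pull/abc",
          "status": "sh.tangled.repo.pull.status.merged"}

print(latest_patch_cid(pull))
print(pr_id_from_status(status))
```

Returning `None` for malformed records keeps a batch run from aborting on a single bad event.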

---

## 3. Phases

### Phase 0: Fetch the diffs (new `src/trust/diffs.py`)
The highest-leverage unblock: lights up the content head, Claude review, **and** slop-kNN.

Steps:
1. Select pulls needing a diff: `SELECT pr_id, author_did, record(from events) FROM pull_requests WHERE diff_text IS NULL`.
   The CID lives in the archived `events.record` JSON (`rounds[-1].patchBlob.ref.$link`); join
   `events` on `(did, collection, rkey)` or re-read it.
2. For each: resolve `_pds(author_did)`, then
   `GET {pds}/xrpc/com.atproto.sync.getBlob?did={author_did}&cid={cid}` → bytes.
3. `gzip.decompress(bytes).decode("utf-8", "replace")` → unified-diff text. Cap stored length
   (~50 KB; embeddings/Claude truncate anyway). `UPDATE pull_requests SET diff_text=? WHERE pr_id=?`.
4. Parallelize like `backfill`: network fetch in a 12-thread pool, DB writes in chunks (single writer).
   Skip missing/oversized blobs gracefully (never abort the run).

Deliverable: `pull_requests.diff_text` populated for ~5,768 PRs (minutes of network).
Self-check: a `demo()` that fetches one known blob and asserts it gunzips to text containing `diff`/`@@`.

### Phase 1: Merged labels (you need a positive class)
- Targeted scrape of `sh.tangled.repo.pull.status` (already mapped in `COLLECTION_KINDS` and handled
  in `derive()`): `python -m trust.backfill --collection sh.tangled.repo.pull.status` (capped first
  with `--max-repos`).
- **Measure positives** before building the head:
  `SELECT clean_merge, count(*) FROM pr_labels GROUP BY 1`.
- **Risk:** pull.status is sparse. If positives are only tens, the head is data-starved too.
  Backstop = **git-on-knots `merged` detection** (clone default branch via the knot URL above,
  check whether each pull's patch landed) for broad `merged` coverage + `reverted`/`re-patched`.
  Only build the backstop if pull.status coverage proves insufficient.
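
For Phase 0, steps 2 and 3 (build the `getBlob` URL, gunzip the blob, cap the stored text) can be sketched with the standard library alone. This is an assumption-laden outline, not the eventual `diffs.py`: `blob_url`, `decode_patch`, and `MAX_DIFF_CHARS` are hypothetical names, the cap is counted in characters as an approximation of the ~50 KB figure, and the network fetch itself is left out.

```python
import gzip
import zlib
from typing import Optional
from urllib.parse import urlencode

MAX_DIFF_CHARS = 50_000  # assumed cap, approximating the ~50 KB in step 3


def blob_url(pds: str, did: str, cid: str) -> str:
    """Build the com.atproto.sync.getBlob URL for one patch blob."""
    query = urlencode({"did": did, "cid": cid})
    return f"{pds.rstrip('/')}/xrpc/com.atproto.sync.getBlob?{query}"


def decode_patch(blob: bytes, max_chars: int = MAX_DIFF_CHARS) -> Optional[str]:
    """Gunzip a patch blob into diff text; return None instead of raising."""
    try:
        text = gzip.decompress(blob).decode("utf-8", "replace")
    except (OSError, EOFError, zlib.error):
        return None  # missing or corrupt blob: skip it, never abort the run
    return text[:max_chars]


# Self-check in the spirit of the Phase 0 demo(), using a locally built blob.
sample = gzip.compress(b"diff --git a/x b/x\n@@ -1 +1 @@\n-a\n+b\n")
decoded = decode_patch(sample)
print(blob_url("https://pds.example", "did:plc:abc", "bafy123"))
print(decoded is not None and "@@" in decoded)
```

Keeping the decode step free of network and database calls makes it easy to run in the thread pool from step 4 while the single writer applies the `UPDATE`s in chunks.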

Deliverable: `pr_labels.clean_merge` with a real positive class (need ≥ a few hundred ideally;
the trainer requires ≥4 rows spanning both classes as a hard floor).

### Phase 2: Embed the diffs (frozen transformer)
- Set `FEATHERLESS_API_KEY`. Run `index_diffs` until it is caught up (loop while it returns > 0):
  ```python
  from trust.db import connection, ensure_schema
  from trust import embed

  ensure_schema()
  with connection(read_only=False) as con:
      # index_diffs returns the number of diffs embedded in this batch
      while embed.index_diffs(con, limit=256):
          pass
  ```
- Optional GPU: self-host Qwen3-Embedding-4B (fits one GPU) to embed ~6k diffs locally for free
  instead of the API. The head itself is CPU-trivial.

Deliverable: `diff_vectors` filled for every PR with a diff.

### Phase 3: The calibrated head (new `src/trust/content.py`, Tower B)
- `_xy(con)`: `X` = `diff_vectors.embedding` for PRs that have a non-NULL `clean_merge`
  (join `pr_labels`); `y` = `clean_merge`. Optionally concat **PR-intrinsic** scalars
  (`additions, deletions, files_touched, discussion_len`) and the slop-kNN similarity.
  **Never** identity/author features.
- **Model: L2-normalize the embedding → logistic regression (linear probe, L2-reg) → isotonic
  or Platt calibration.** Linear probe is correct for frozen embeddings at low data; LightGBM on
  raw 2560-dim embeddings overfits; keep it only as an alt.
- Time-split train/val (order by `opened_at`). Save `content.pkl` under `MODEL_DIR`.
- `ContentScorer.prob(pr_id) -> P(content safe)`; expose `content_risk = 1 - P`.
- Self-check `demo()`: on held-out PRs, a known-bad diff scores higher risk than a clean one;
  print the reliability curve.

Deliverable: a calibrated content risk for **every** PR (cheap, no API), not just reviewed ones.
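
Two preprocessing pieces of Phase 3, the L2 normalization and the time-ordered split, can be sketched without any ML dependency. This covers only the data preparation, not the logistic head or its calibration; `l2_normalize` and `time_split` are hypothetical helper names, and the split assumes `opened_at` values are ISO-8601 strings in a single timezone so that string order matches time order.

```python
import math
from typing import List, Sequence, Tuple

Row = Tuple[str, str]  # (pr_id, opened_at)


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm; an all-zero vector is returned as-is."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def time_split(rows: Sequence[Row], val_frac: float = 0.2) -> Tuple[List[Row], List[Row]]:
    """Order rows by opened_at and hold out the newest fraction for validation."""
    ordered = sorted(rows, key=lambda r: r[1])
    n_val = max(1, int(len(ordered) * val_frac)) if ordered else 0
    cut = len(ordered) - n_val
    return ordered[:cut], ordered[cut:]


# Hypothetical rows; real ones would come from joining diff_vectors and pr_labels.
rows = [("c", "2025-03-01"), ("a", "2025-01-01"), ("b", "2025-02-01"),
        ("e", "2025-05-01"), ("d", "2025-04-01")]
train, val = time_split(rows, val_frac=0.2)
unit = l2_normalize([3.0, 4.0])
print([r[0] for r in train], [r[0] for r in val])
print(unit)
```

Holding out the newest PRs rather than a random sample keeps the validation score honest about how the head behaves on diffs it has not seen yet.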

### Phase 4: Fuse into the gate (monotone, unchanged)
- In `fusion.score_pr`: the head supplies `content_risk` for all PRs; Claude (`review_pr`, gated by
  `should_review`) refines ambiguous/sensitive ones. Combine conservatively:
  `content_risk = max(model_risk, claude_risk)` so content still only **penalizes**.
- Win: every PR gets a content signal; today only the Claude-reviewed subset does.
- Keep `decide()` and its thresholds; surface the head's risk in the explanation
  (`build_reason`) like the other factors.

### Phase 5: Eval + beat-the-baseline gate
- Calibration: reliability curve (reuse `learned._reliability`). Ranking: AUC / average precision.
- **Serve only if it beats**: (a) majority-class, (b) Claude-alone risk where available,
  (c) slop-kNN alone, on a **time-split AND a repo-holdout** (generalize to unseen repos).
- Write a verdict (like `gnn` does); `fusion` consults it before using the head.

---

## 4. Effort & runtime

| Phase | Build | Runtime |
|---|---|---|
| 0 diffs (`diffs.py`) | ~1 hr | few min (network) |
| 1 labels (scrape) | wired | ~10 min capped |
| 2 embed (`index_diffs`) | done | few min (API) |
| 3 head (`content.py`) | ~1 hr | seconds |
| 4 fuse (`fusion.py`) | ~30 min | n/a |
| 5 eval-gate | ~30 min | seconds |

≈ half a day of build + minutes of runtime, given `FEATHERLESS_API_KEY` and enough Phase-1 positives.

## 5. GPU guidance
- **Tier 1 needs no GPU**: embedding runs on Featherless (remote); the head is CPU-trivial.
- Use a GPU now only to **self-host Qwen3-Embedding-4B** for free bulk embedding of ~6k diffs
  (skip API cost/limits).
- Save the GPU for **Tier 2** (fine-tuning CodeBERT/StarEncoder), deferred until ~10³–10⁴
  labeled diffs exist.

## 6. Definition of done
- `diffs.py` populates `diff_text`; `pr_labels` has a positive class; `diff_vectors` filled.
- `content.py` trains a calibrated head, identity-blind, with a reliability curve.
- It **beats** majority / Claude-alone / slop-kNN on a time + repo holdout, else it doesn't serve.
- `fusion` consumes it monotonically (content only penalizes); explanation shows the content factor.
- Smoke test added (mirror `tests/test_smoke.py` style: `importorskip` the embedding path; assert a
  bad diff out-risks a clean one).

## 7. Parallel unblock (not this tower, but the other gating item)
Structural scoring is still blocked by **`seeds = 0`** + stale derives. Independent of Tower B:
1. `--rederive` from archived `events` (no network) → repopulates `stars` (and any archived
   collections) through the current `derive()`.
2. Seed real maintainer DIDs; top vouch-receivers are the anchors:
   `did:plc:onu3oqfahfubgbetlr4giknc` (141 in), `did:plc:wshs7t2adsemcrrd4snkeqli` (89),
   `did:plc:qfpnj4og54vl56wngdriaxug` (56)… → `INSERT INTO seeds …`.
3. `trust-train` once labels (Phase 1) exist.

EigenTrust (Tower A) and the content head (Tower B) can be built in either order; the gate needs both.