# Tangled contributor trust scoring (EigenTrust + Claude)

Calibrated, explainable, sybil-resistant trust scores that auto-triage Tangled PRs
into **fast-lane / normal-queue / needs-human**. Two independent signals fused by a
gate (not an average): **structural trust** (EigenTrust over the vouch graph) and
**content review** (Claude reading the diff, blind to author identity).

Built per `prd.md` through **M7**: EigenTrust + Claude end to end; LightGBM learned score
with isotonic calibration; GraphSAGE trained offline and compared (not served: it doesn't
beat M5 on this sparse graph, and the PRD says ship it only if it does); the
attestation-gated sensitive-repo tier (6.13); AT-Proto writeback of assessments as records
(6.11); the diff-embedding slop signal (6.12); a spoken `/brief` (ElevenLabs); and the
Tangled browser overlay (7.4).
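
To make "fused by a gate, not an average" concrete, here is a minimal sketch of what such a gate could look like. It is not the code in `fusion.py`: the thresholds, the `Signals` fields, and the lane strings are assumptions chosen for illustration. The attestation check (6.13, described under M7 later in this README) goes first because it overrides the score.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the real tuning lives in config.py.
FAST_LANE_TRUST = 0.80
MAX_FAST_LANE_RISK = 0.30
HIGH_RISK = 0.70


@dataclass
class Signals:
    structural: float            # trust from the vouch graph, in [0, 1]
    content_risk: float          # reviewer's risk estimate for the diff, in [0, 1]
    sensitive_repo: bool = False
    attested: bool = False


def decide(s: Signals) -> str:
    # Attestation overrides everything else, so it is checked first.
    if s.sensitive_repo and not s.attested:
        return "needs_human"
    # An untrusted author is never lifted by a clean-looking diff.
    if s.structural <= 0.0:
        return "needs_human"
    # Assumption: a clearly risky diff escalates regardless of author trust.
    if s.content_risk >= HIGH_RISK:
        return "needs_human"
    # Fast-lane needs both signals to agree; neither compensates for the other.
    if s.structural >= FAST_LANE_TRUST and s.content_risk <= MAX_FAST_LANE_RISK:
        return "fast_lane"
    return "normal_queue"
```

An average would let a spotless diff (`content_risk=0.0`) pull a zero-trust author up to a mid-range score; the gate sends that case to `needs_human` no matter how clean the diff looks.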

## Layout

```
src/trust/
  config.py      env paths (DATA_ROOT fail-fast) + gate/eigen/review tuning
  db.py          DuckDB schema, feature view, clean_merge label SQL
  ingest.py      M1     Jetstream -> events -> derive typed tables (--probe confirms NSIDs)
  eigentrust.py  M3     SciPy power iteration + BFS path explanation (no graph DB)
  review.py      M4     Claude reviewer, verbatim 6.6 prompt, forced-schema tool use
  fusion.py      M4     gate decide() + scoring worker (score_pr); loads M5 model if present
  learned.py     M5     LightGBM + isotonic calibration + TreeSHAP (optional .[learned] extra)
  gnn.py         M6     GraphSAGE, trained offline + compared vs M5; served only if it wins (.[gnn])
  atproto.py     M7     writeback: assessments published as sh.tangled.trust.score records (6.11)
  api.py         M3/M4  FastAPI: /score /review /leaderboard /metrics /triage + pages
  seed.py               synthetic demo data (trusted core + sybil cluster)
  static/               triage / dashboard / leaderboard pages (thin clients of the API)
extension/       M7     Tangled browser overlay (7.4): MV3 content script, UI only
lexicons/               sh.tangled.trust.score lexicon for the writeback (6.11)
```

## Setup

```bash
cp .envrc.example .envrc   # point DATA_ROOT at the external drive; add ANTHROPIC_API_KEY
source .envrc              # in prod: fails fast if the drive is not mounted
uv venv .venv && source .venv/bin/activate && uv pip install -e .
```

If `DATA_ROOT` is unset, paths fall back to a repo-local `.data/` directory for dev (with a
warning). All large artifacts route under `DATA_ROOT` (PRD 4.1).

## Demo (no live data or API key required)

One command brings up the whole stack (seed → score loop → API) in split panes:

```bash
mprocs   # reads mprocs.yaml; open http://127.0.0.1:8000
```

Or run the panes by hand:

```bash
python -m trust.seed           # load the synthetic vouch graph + labelled PRs
python -m trust.score --loop   # poll + score PRs, write decisions (--loop for a daemon)
python -m trust.api            # serve http://127.0.0.1:8000 (triage / dashboard / leaderboard)
```

> DuckDB is single-writer and a held lock blocks every other open, so each process
> opens the file briefly (open → work → close) with retry; that's what lets the
> mprocs panes share one `trust.duckdb`. Don't run `ingest` and `score` as writers
> at the same time.
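
The open → work → close pattern with retry can be sketched as a small context manager. This is an illustration, not the helper in `db.py`: `short_lived`, its parameters, and the backoff schedule are hypothetical, and `connect` stands in for something like `lambda: duckdb.connect(path)`.

```python
import time
from contextlib import contextmanager


@contextmanager
def short_lived(connect, attempts=5, base_delay=0.05, retry_on=(OSError,)):
    """Open a connection, yield it, and always close it afterwards.

    `connect` is any zero-argument callable returning an object with .close().
    `retry_on` should be the exception type raised while another process
    holds the lock (OSError here is only a placeholder).
    """
    conn = None
    for attempt in range(attempts):
        try:
            conn = connect()
            break
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the lock error
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    try:
        yield conn
    finally:
        conn.close()  # release the lock quickly so other panes can open the file
```

Keeping the body of each `with short_lived(...)` block short is what makes this work: the lock is held only for one batch of reads or writes, so a competing process usually succeeds within a retry or two.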

## Learned score (M5, optional)

```bash
uv pip install -e '.[learned]'   # lightgbm + scikit-learn (no shap needed)
python -m trust.seed
trust-train                      # LightGBM on the features, isotonic-calibrated; prints a reliability curve
python -m trust.score            # the gate now uses calibrated P(clean), not raw EigenTrust
```

`trust-train` predicts `clean_merge` from the per-DID features (with `eigentrust_score`
**as a feature**, so the model builds on the graph), splits by time, and fits isotonic
regression so the output is a real probability (PRD 6.5/6.8). The model is saved under
`MODEL_DIR`; `fusion.structural_for` loads it automatically and falls back to raw
EigenTrust when it's absent (so the base install still runs). Explanations gain the top
LightGBM **TreeSHAP** contributions (`merged_pr_count (+1.40)`, …) via LightGBM's native
`pred_contrib`, with no `shap`/`numba` dependency.
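
For readers unfamiliar with isotonic calibration, the sketch below shows the mechanics with a hand-rolled pool-adjacent-violators pass in plain Python. It is only an illustration: the repo installs scikit-learn for this step, and the function names `fit_isotonic` and `calibrate` are made up here.

```python
def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw scores to P(label = 1).

    Returns (uppers, means): each block covers scores up to uppers[i]
    and maps them to the probability means[i].
    """
    # Pool tied scores first so each distinct score has one entry.
    agg = {}
    for x, y in zip(scores, labels):
        total, count = agg.get(x, (0.0, 0))
        agg[x] = (total + y, count + 1)

    blocks = []  # each block: [sum of labels, count, highest score in block]
    for x in sorted(agg):
        total, count = agg[x]
        blocks.append([total, count, x])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, hi = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = hi

    uppers = [b[2] for b in blocks]
    means = [b[0] / b[1] for b in blocks]
    return uppers, means


def calibrate(score, uppers, means):
    # Step lookup: the first block whose upper edge covers the score.
    for hi, p in zip(uppers, means):
        if score <= hi:
            return p
    return means[-1]


uppers, means = fit_isotonic([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 1, 1, 1])
print(calibrate(0.25, uppers, means))  # 0.5: the 0.2 and 0.3 scores were pooled
```

The pooled blocks are also why a tiny dataset yields a reliability curve with very few bins: with little data, most neighbouring scores violate monotonicity and collapse together.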

> On the tiny synthetic data the model is near-degenerate (the reliability curve has two
> bins; one revert sends a contributor to 0). That's expected at N≈22; real history
> smooths it. To use M5 in a running `mprocs` demo: run `trust-train`, then restart the
> `score` and `api` panes so they load the model.

What the demo shows (the PRD deliverable):

- `live/trusted-clean`: authored by **carol**. Trust flows maintainer → alice → carol, so
  the PR goes to **fast-lane** on structural trust alone.
- `live/sybil-buggy`: authored by a throwaway in an isolated mutual-vouch cluster,
  starved to **0.000** → **needs_human**. A clean-looking diff could never lift it
  (constraint 2). With `ANTHROPIC_API_KEY` set, Claude also attaches a concrete reason
  (the diff swaps a constant-time compare for `==`).
- Dashboard: score distribution, fast-lane rate, **0% false-approval** backtest above the
  threshold, vouch-graph stats.
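
Why the isolated cluster ends up at exactly zero is easiest to see in a toy power iteration. The sketch below uses plain dicts instead of the SciPy matrices in `eigentrust.py`, and the damping factor `alpha`, the function signature, and the node names other than maintainer/alice/carol are assumptions for illustration.

```python
def eigentrust(vouches, pretrusted, alpha=0.15, iters=50):
    """Toy EigenTrust: vouches maps voucher -> list of vouchees."""
    nodes = set(vouches) | set(pretrusted)
    for targets in vouches.values():
        nodes.update(targets)
    nodes = sorted(nodes)

    # Pre-trust vector: uniform over the seed set, zero elsewhere.
    p = {n: (1.0 / len(pretrusted) if n in pretrusted else 0.0) for n in nodes}
    t = dict(p)
    for _ in range(iters):
        nxt = {n: alpha * p[n] for n in nodes}
        for src in nodes:
            outs = vouches.get(src, [])
            if outs:
                # Split this node's trust evenly across the DIDs it vouches for.
                share = (1 - alpha) * t[src] / len(outs)
                for dst in outs:
                    nxt[dst] += share
            else:
                # No outgoing vouches: return the trust to the pre-trusted set.
                for n in nodes:
                    nxt[n] += (1 - alpha) * t[src] * p[n]
        t = nxt
    return t


scores = eigentrust(
    vouches={
        "maintainer": ["alice"],
        "alice": ["carol"],
        "sybil1": ["sybil2"],   # mutual-vouch cluster with no inbound edge
        "sybil2": ["sybil1"],
    },
    pretrusted={"maintainer"},
)
print(scores["carol"] > 0, scores["sybil1"], scores["sybil2"])  # True 0.0 0.0
```

Trust only enters the graph through the pre-trusted set, so a cluster that nobody in the trusted region vouches for can pass zero back and forth forever but never accumulate any.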

## Live data

```bash
python -m trust.ingest --probe --max-events 300   # confirm real sh.tangled.* NSIDs first
python -m trust.ingest                            # firehose -> DuckDB, resumable cursor
python -m trust.score                             # score newly-ingested PRs
```

The collection→record map in `config.COLLECTION_KINDS` is best-guess and marked
`CONFIRM`; verify it against the `--probe` output before trusting derived rows.

## Tests

```bash
python -m pytest   # eigentrust starves sybils; gate never lifts untrusted; schema parses
```

## GraphSAGE (M6, optional)

```bash
uv pip install -e '.[gnn]'               # torch + torch-geometric (multi-GB)
trust-seed && trust-train && trust-gnn   # trains GraphSAGE offline, compares vs M5
```

`trust-gnn` builds a PyG graph (positive vouches + co-contribution edges; per-DID feature
vectors as node features; denounce-count rides as a feature, no signed-edge GNN), trains an
inductive 2-layer GraphSAGE on a time split, then writes a **verdict** comparing its holdout
accuracy to M5's. `fusion.structural_for` serves the GNN **only if `gnn_wins`**. On the
synthetic graph it loses to M5, so the system keeps the calibrated baseline. That gate is the
PRD's rule ("ship the GNN only if it beats the baseline and is stable"), enforced in code.
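
The serving order this implies (GNN only on a winning verdict, else the M5 model if present, else raw EigenTrust) can be sketched as follows. This is not `fusion.structural_for` itself: the function name and both file names are hypothetical, and only the `gnn_wins` flag comes from this README.

```python
import json
from pathlib import Path


def pick_structural_source(model_dir: Path) -> str:
    """Return which scorer to serve: 'gnn', 'learned', or 'eigentrust'."""
    verdict_path = model_dir / "gnn_verdict.json"   # hypothetical file name
    learned_path = model_dir / "learned.pkl"        # hypothetical file name

    # The GNN is served only when the offline comparison says it won.
    if verdict_path.exists():
        verdict = json.loads(verdict_path.read_text())
        if verdict.get("gnn_wins") is True:
            return "gnn"

    # Otherwise prefer the calibrated M5 model when it has been trained.
    if learned_path.exists():
        return "learned"

    # Base install: no optional extras, raw EigenTrust still works.
    return "eigentrust"
```

Treating a missing or malformed verdict as "GNN did not win" keeps the default conservative: the system never serves the GNN by accident.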

> lightgbm and torch each bundle `libomp`; loading both in one process hangs on macOS.
> `trust/__init__.py` sets `KMP_DUPLICATE_LIB_OK` / `OMP_NUM_THREADS` before either imports.

## Native + compliance surfaces (M7)

- **Attestation-gated sensitive-repo tier (6.13).** A repo in the `sensitive` tier
  requires a contributor-issued jurisdiction attestation before fast-lane/merge; a
  missing one forces `needs_human` regardless of trust or content risk. It is the only control
  that overrides the score, so it's checked first in `decide()`. The demo seeds a sensitive
  repo where an attested DID fast-lanes and an unattested high-trust DID is blocked at
  `calibrated_prob 1.00`. Only declared/asserted facts are used; nothing is inferred.
- **AT-Proto writeback (6.11).** `trust-publish` emits each assessment as a public
  `sh.tangled.trust.score` record (lexicon in `lexicons/`) on the service's own PDS, so
  verdicts are auditable provenance on the network. No creds → dry-run (prints the records);
  set `ATPROTO_PDS` / `ATPROTO_IDENTIFIER` / `ATPROTO_PASSWORD` to publish for real.
- **Browser overlay (7.4).** `extension/` is a minimal MV3 content script that injects a
  trust hat onto tangled.org from the same `/score` API. Load unpacked; see `extension/README.md`.
  Confirm the DID selector against the real DOM (the UI analog of confirming NSIDs).
- **Diff-embedding slop signal (6.12).** `trust-embed --build` embeds **every** scraped PR diff
  (Featherless / `Qwen3-Embedding-4B`) into the `diff_vectors` table. The build is idempotent and
  resumable (`pr_id NOT IN diff_vectors`), so re-run it as `trust.backfill` keeps filling
  `pull_requests`, or leave `trust-embed --build --watch` running to keep pace. Scoring then
  cosine-k-NNs each new diff against the embeddings of *currently* known-bad PRs (`slop_score`
  joins `pr_labels` `clean_merge=0`, so re-labelling never needs a re-embed) and hands the max
  similarity to Claude as a `machine_findings` hint (advisory: surfaces in the explanation, never
  flips the gate). Vector search stays inside DuckDB; no key → nothing embedded and the signal
  is just absent.
- **Spoken briefing (M7).** `GET /brief/{did}` composes a speakable summary of the decision
  (no DIDs read aloud) and returns `audio/mpeg` when `ELEVENLABS_API_KEY` is set, JSON text
  otherwise. `trust.voice.brief_text` is the composer; reused by the API.
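
The slop signal's lookup can be pictured with a few lines of plain Python. In the project the vector search stays inside DuckDB, so this is only a sketch of the logic; the function names, the `k` default, and the keys of the returned dict are assumptions.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def slop_finding(new_vec, vectors, labels, k=5):
    """vectors: {pr_id: embedding}; labels: {pr_id: clean_merge (0 or 1)}.

    Labels are consulted at query time, so re-labelling a PR changes the
    result without touching its stored embedding.
    """
    bad = [
        (cosine(new_vec, vec), pr_id)
        for pr_id, vec in vectors.items()
        if labels.get(pr_id) == 0  # currently known-bad PRs only
    ]
    if not bad:
        return None  # nothing to compare against: the signal is simply absent
    bad.sort(reverse=True)
    top = bad[:k]
    return {
        "slop_similarity": top[0][0],                 # max similarity, the hint itself
        "nearest_bad_prs": [pr_id for _, pr_id in top],
    }


vectors = {"pr-a": [1.0, 0.0], "pr-b": [0.0, 1.0], "pr-c": [1.0, 1.0]}
labels = {"pr-a": 0, "pr-b": 1, "pr-c": 0}
print(slop_finding([1.0, 0.0], vectors, labels))
```

The returned dict is the kind of value that could be merged into `machine_findings` before the review call; because it is advisory, it shapes the explanation but has no path into the gate.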

## What's skipped (and when to add it)

- **Per-PR writeback subject.** `sh.tangled.trust.score` currently keys on the contributor
  DID; carry `pr_id` on the `scores` table to reference a specific PR's `at://` URI.
- **SvelteKit frontend.** The three surfaces ship as built-in static pages (the PRD blesses
  this for the dashboard); swap to SvelteKit if you need the richer UI kit / native overlay.
- **More external signals (6.12): OSV/secret-scan/SAST.** `review_pr` already accepts
  `machine_findings` (the slop similarity is the first one wired in); add the scanners' output
  to that same dict.