README.md at main · veikka.tngl.sh/sunstead

Sunstead trust scoring project
Veikka Silvekoski Update sunstead: new modules (embed, voice, content, diffs, merged, vouchsafe), web UI, docs, scorer Dockerfile 11hrs ago
3df319f5
# Tangled contributor trust scoring (EigenTrust + Claude)

Calibrated, explainable, sybil-resistant trust scores that auto-triage Tangled PRs
into **fast-lane / normal-queue / needs-human**. Two independent signals are fused by a
gate (not an average): **structural trust** (EigenTrust over the vouch graph) and
**content review** (Claude reading the diff, blind to author identity).

Built per `prd.md` through **M7**: EigenTrust + Claude end to end; LightGBM learned score
with isotonic calibration; GraphSAGE trained offline and compared (not served: it doesn't
beat M5 on this sparse graph, and the PRD says ship it only if it does); the
attestation-gated sensitive-repo tier (6.13); AT-Proto writeback of assessments as records
(6.11); the diff-embedding slop signal (6.12); a spoken `/brief` (ElevenLabs); and the
Tangled browser overlay (7.4).

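The "gate, not an average" fusion described above can be sketched as follows. This is a minimal illustration rather than the project's `fusion.decide()`: the threshold values, the `Signals` class, the lane label strings, and `decide_by_average` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the real tuning lives in config.py and is not shown here.
TRUST_FAST_LANE = 0.70   # structural trust needed for the fast lane
TRUST_FLOOR = 0.05       # below this, a human always looks
RISK_CEILING = 0.50      # content risk above this always goes to a human


@dataclass(frozen=True)
class Signals:
    structural_trust: float   # e.g. EigenTrust or calibrated P(clean), in [0, 1]
    content_risk: float       # reviewer's risk estimate for the diff, in [0, 1]


def decide(s: Signals) -> str:
    """Gate: each signal can block, and neither can compensate for the other."""
    if s.structural_trust < TRUST_FLOOR:
        return "needs_human"      # a clean-looking diff cannot lift an untrusted author
    if s.content_risk > RISK_CEILING:
        return "needs_human"      # a trusted author cannot wave through a risky diff
    if s.structural_trust >= TRUST_FAST_LANE:
        return "fast_lane"
    return "normal_queue"


def decide_by_average(s: Signals) -> str:
    """Counter-example: averaging lets a clean diff launder a zero-trust author."""
    blended = (s.structural_trust + (1.0 - s.content_risk)) / 2
    return "fast_lane" if blended >= 0.5 else "needs_human"
```

With a zero-trust author and a spotless diff, `decide` returns `needs_human` while `decide_by_average` returns `fast_lane`, which is the failure mode the gate is meant to rule out.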
## Layout

```
src/trust/
  config.py      env paths (DATA_ROOT fail-fast) + gate/eigen/review tuning
  db.py          DuckDB schema, feature view, clean_merge label SQL
  ingest.py      M1 Jetstream -> events -> derive typed tables (--probe confirms NSIDs)
  eigentrust.py  M3 SciPy power iteration + BFS path explanation (no graph DB)
  review.py      M4 Claude reviewer, verbatim 6.6 prompt, forced-schema tool use
  fusion.py      M4 gate decide() + scoring worker (score_pr); loads M5 model if present
  learned.py     M5 LightGBM + isotonic calibration + TreeSHAP (optional .[learned] extra)
  gnn.py         M6 GraphSAGE, trained offline + compared vs M5; served only if it wins (.[gnn])
  atproto.py     M7 writeback: assessments published as sh.tangled.trust.score records (6.11)
  api.py         M3/M4 FastAPI: /score /review /leaderboard /metrics /triage + pages
  seed.py        synthetic demo data (trusted core + sybil cluster)
  static/        triage / dashboard / leaderboard pages (thin clients of the API)
extension/         M7 Tangled browser overlay (7.4): MV3 content script, UI only
lexicons/          sh.tangled.trust.score lexicon for the writeback (6.11)
```

## Setup

```bash
cp .envrc.example .envrc      # point DATA_ROOT at the external drive; add ANTHROPIC_API_KEY
source .envrc                 # in prod: fails fast if the drive is not mounted
uv venv .venv && source .venv/bin/activate && uv pip install -e .
```

If `DATA_ROOT` is unset, the code falls back to a repo-local `.data/` directory for
development (with a warning). All large artifacts route under `DATA_ROOT` (PRD 4.1).

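The `DATA_ROOT` behaviour (dev fallback with a warning, fail-fast when the drive is missing) might look roughly like the sketch below. The function name `resolve_data_root` and its parameters are hypothetical; the actual logic in `config.py` and `.envrc` is not shown in this README and may differ.

```python
import os
import warnings
from pathlib import Path


def resolve_data_root(env=None, repo_root=Path(".")) -> Path:
    """Return the directory that large artifacts should live under."""
    env = os.environ if env is None else env
    raw = env.get("DATA_ROOT")
    if not raw:
        # Dev fallback: keep artifacts inside the repo and say so loudly.
        fallback = Path(repo_root) / ".data"
        warnings.warn(f"DATA_ROOT unset; falling back to {fallback}", stacklevel=2)
        fallback.mkdir(parents=True, exist_ok=True)
        return fallback
    root = Path(raw)
    if not root.is_dir():
        # Fail fast rather than silently writing to an unmounted mount point.
        raise RuntimeError(f"DATA_ROOT={raw!r} is not a directory (drive not mounted?)")
    return root
```

Raising on a missing directory, instead of creating it, is the point of the design: an unmounted external drive would otherwise be replaced by an empty directory on the local disk.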
## Demo (no live data or API key required)

One command brings up the whole stack (seed → score loop → API) in split panes:

```bash
mprocs            # reads mprocs.yaml; open http://127.0.0.1:8000
```

Or run the panes by hand:

```bash
python -m trust.seed            # load the synthetic vouch graph + labelled PRs
python -m trust.score --loop    # poll + score PRs, write decisions (--loop for a daemon)
python -m trust.api             # serve http://127.0.0.1:8000  (triage / dashboard / leaderboard)
```

> DuckDB is single-writer and a held lock blocks every other open, so each process
> opens the file briefly (open → work → close) with retry; that's what lets the
> mprocs panes share one `trust.duckdb`. Don't run `ingest` and `score` as writers
> at the same time.

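The open → work → close pattern with retry from the note above can be sketched as a small context manager. This is an illustration under assumptions, not the code in `db.py`: the name `brief_connection`, the retry count, and the delay are made up, and the helper is kept generic so it runs without DuckDB installed.

```python
import time
from contextlib import contextmanager


@contextmanager
def brief_connection(connect, retries=20, delay=0.05, busy=(OSError,)):
    """Open, yield, close; retry the open while another process holds the lock.

    `connect` is any zero-argument callable returning an object with .close().
    With DuckDB it could be `lambda: duckdb.connect("trust.duckdb")` together
    with `busy=(duckdb.IOException,)` (assumed to be what a lock conflict raises).
    """
    for attempt in range(retries):
        try:
            con = connect()
            break
        except busy:
            if attempt == retries - 1:
                raise                  # give up: the lock never cleared
            time.sleep(delay)          # another pane is mid-write; wait and retry
    try:
        yield con
    finally:
        con.close()                    # release the file lock as soon as work is done
```

Keeping the `with` body short is what makes this workable: the lock is only held for the duration of one read or write, so the other panes rarely wait more than a few retries.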
## Learned score (M5, optional)

```bash
uv pip install -e '.[learned]'   # lightgbm + scikit-learn (no shap needed)
python -m trust.seed
trust-train                      # LightGBM on the features, isotonic-calibrated; prints a reliability curve
python -m trust.score            # the gate now uses calibrated P(clean), not raw EigenTrust
```

`trust-train` predicts `clean_merge` from the per-DID features (with `eigentrust_score`
**as a feature**, so the model builds on the graph), splits by time, and fits isotonic
regression so the output is a calibrated probability (PRD 6.5/6.8). The model is saved under
`MODEL_DIR`; `fusion.structural_for` loads it automatically and falls back to raw
EigenTrust when it's absent (so the base install still runs). Explanations gain the top
LightGBM **TreeSHAP** contributions (`merged_pr_count (+1.40)`, …) via LightGBM's native
`pred_contrib`, with no `shap`/`numba` dependency.

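To make the calibration step concrete, here is a dependency-free sketch of isotonic regression using the pool-adjacent-violators algorithm. It is a stand-in for illustration only: `trust-train` uses scikit-learn, and the function names `fit_isotonic` and `calibrate` are hypothetical.

```python
def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing step function score -> P(clean)."""
    blocks = []  # each block: [sum_of_labels, count, max_score_in_block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while two blocks share a score (ties must get one value)
        # or the earlier block's mean exceeds the later one's (monotonicity violated).
        while len(blocks) > 1 and (
            blocks[-2][2] == blocks[-1][2]
            or blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(model, score):
    """Look up the calibrated probability for a raw score (step function, clipped)."""
    for upper, prob in model:
        if score <= upper:
            return prob
    return model[-1][1]


model = fit_isotonic([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 1, 1, 1])
```

In this toy fit the out-of-order pair at raw scores 0.2 and 0.3 is pooled into a single bin with probability 0.5, so the mapping from raw score to probability never decreases.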
> On the tiny synthetic data the model is near-degenerate (the reliability curve has two
> bins; one revert sends a contributor to 0). That's expected at N≈22; real history
> smooths it. To use M5 in a running `mprocs` demo: `trust-train`, then restart the
> `score` and `api` panes so they load the model.

What the demo shows (the PRD deliverable):

- `live/trusted-clean`: authored by **carol**. Trust flows maintainer → alice → carol, so
  the PR goes **fast-lane** on structural trust alone.
- `live/sybil-buggy`: authored by a throwaway account in an isolated mutual-vouch cluster
  whose trust is starved to **0.000**, so the PR goes to **needs_human**. A clean-looking
  diff could never lift it (constraint 2). With `ANTHROPIC_API_KEY` set, Claude also
  attaches a concrete reason (the diff swaps a constant-time compare for `==`).
- Dashboard: score distribution, fast-lane rate, **0% false-approval** backtest above the
  threshold, vouch-graph stats.

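Why the sybil cluster ends at exactly 0.000 can be seen in a small pure-Python power iteration. This is a sketch, not `eigentrust.py` (which uses SciPy): the damping value, the unweighted edges, the dangling-node handling, and the `sybil1`/`sybil2` names are assumptions for the example.

```python
def eigentrust(vouches, pretrusted, alpha=0.15, iters=50):
    """Power iteration for t = (1 - alpha) * C^T t + alpha * p.

    vouches: dict mapping voucher -> list of vouchees (unweighted, for brevity).
    pretrusted: the seed set that the pre-trust vector p spreads uniformly over.
    """
    nodes = sorted(set(vouches) | {v for vs in vouches.values() for v in vs} | set(pretrusted))
    p = {n: (1.0 / len(pretrusted) if n in pretrusted else 0.0) for n in nodes}
    t = dict(p)  # start from p, so trust mass only ever originates at the seeds
    for _ in range(iters):
        nxt = {n: alpha * p[n] for n in nodes}
        for src in nodes:
            out = vouches.get(src, [])
            if out:
                share = (1 - alpha) * t[src] / len(out)   # row-normalised local trust
                for dst in out:
                    nxt[dst] += share
            else:
                # Dangling node: hand its mass back to the seeds instead of leaking it.
                for n in nodes:
                    nxt[n] += (1 - alpha) * t[src] * p[n]
        t = nxt
    return t


graph = {
    "maintainer": ["alice"],
    "alice": ["carol"],
    "sybil1": ["sybil2"],   # isolated mutual-vouch cluster:
    "sybil2": ["sybil1"],   # vouches inside it never connect to the trusted core
}
scores = eigentrust(graph, pretrusted={"maintainer"})
```

Because no edge leads from the trusted core into the cluster and the iteration starts from the pre-trust vector, `sybil1` and `sybil2` never receive any mass, however many times they vouch for each other, while `carol` ends with a positive score through the maintainer → alice → carol path.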
## Live data

```bash
python -m trust.ingest --probe --max-events 300   # confirm real sh.tangled.* NSIDs first
python -m trust.ingest                             # firehose -> DuckDB, resumable cursor
python -m trust.score                              # score newly-ingested PRs
```

The collection→record map in `config.COLLECTION_KINDS` is best-guess and marked
`CONFIRM`; verify it against the `--probe` output before trusting derived rows.

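One way to check the mapping against probe output is to count the collections actually seen and diff them against the map, as in the sketch below. The NSIDs in this example mapping are placeholders, not confirmed Tangled collections, and `audit_collections` is a hypothetical helper; the event shape assumed is the Jetstream commit event (`kind` plus a `commit` object carrying `collection`).

```python
from collections import Counter

# Placeholder mapping: illustrative NSIDs only, not the real config.COLLECTION_KINDS.
EXAMPLE_KINDS = {
    "sh.tangled.example.vouch": "vouch",
    "sh.tangled.example.pull": "pull_request",
}


def audit_collections(events, mapping):
    """Compare collections seen on the wire with the ones the mapping expects."""
    seen = Counter(
        ev["commit"]["collection"]
        for ev in events
        if ev.get("kind") == "commit" and "commit" in ev
    )
    unmapped = {c: n for c, n in seen.items() if c not in mapping}   # seen, not mapped
    never_seen = sorted(c for c in mapping if c not in seen)         # mapped, not seen
    return unmapped, never_seen


probe = [
    {"kind": "commit", "commit": {"collection": "sh.tangled.example.vouch"}},
    {"kind": "commit", "commit": {"collection": "sh.tangled.example.other"}},
    {"kind": "identity"},   # non-commit events carry no collection and are skipped
]
unmapped, never_seen = audit_collections(probe, EXAMPLE_KINDS)
```

A non-empty `never_seen` list after a reasonably long probe suggests a guessed NSID is wrong, and `unmapped` shows candidates for what the real name might be.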
## Tests

```bash
python -m pytest        # eigentrust starves sybils; gate never lifts untrusted; schema parses
```

## GraphSAGE (M6, optional)

```bash
uv pip install -e '.[gnn]'   # torch + torch-geometric (multi-GB)
trust-seed && trust-train && trust-gnn   # trains GraphSAGE offline, compares vs M5
```

`trust-gnn` builds a PyG graph (positive vouches + co-contribution edges; per-DID feature
vectors as node features; denounce-count rides as a feature, no signed-edge GNN), trains an
inductive 2-layer GraphSAGE on a time split, then writes a **verdict** comparing its holdout
accuracy to M5's. `fusion.structural_for` serves the GNN **only if `gnn_wins`**; on the
synthetic graph it loses to M5, so the system keeps the calibrated baseline. That gate is the
PRD's rule ("ship the GNN only if it beats the baseline and is stable"), enforced in code.

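The "beats the baseline and is stable" rule could be encoded along the lines below. This is a guess at the shape of the verdict, not the output of `trust-gnn`: evaluating over several seeds, the margin and spread limits, and every key except `gnn_wins` are assumptions.

```python
from statistics import mean, pstdev


def gnn_verdict(gnn_accs, baseline_acc, min_margin=0.01, max_std=0.02):
    """Decide whether the GNN may be served, given holdout accuracy over several seeds."""
    gnn_mean = mean(gnn_accs)
    gnn_std = pstdev(gnn_accs)
    beats = gnn_mean >= baseline_acc + min_margin   # must beat M5, not merely tie it
    stable = gnn_std <= max_std                     # and do so consistently across seeds
    return {
        "gnn_acc": round(gnn_mean, 4),
        "baseline_acc": baseline_acc,
        "gnn_std": round(gnn_std, 4),
        "gnn_wins": beats and stable,
    }
```

Requiring both conditions means a single lucky training run cannot flip the serving path: a high mean with a wide spread across seeds still leaves `gnn_wins` false.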
> lightgbm and torch each bundle `libomp`; loading both in one process hangs on macOS.
> `trust/__init__.py` sets `KMP_DUPLICATE_LIB_OK` / `OMP_NUM_THREADS` before either is imported.

## Native + compliance surfaces (M7)

- **Attestation-gated sensitive-repo tier (6.13).** A repo in the `sensitive` tier
  requires a contributor-issued jurisdiction attestation before fast-lane/merge; a
  missing one forces `needs_human` regardless of trust or content risk. It is the only
  control that overrides the score, so it's checked first in `decide()`. The demo seeds a
  sensitive repo where an attested DID fast-lanes and an unattested high-trust DID is
  blocked even at `calibrated_prob 1.00`. Only declared/asserted facts are used; nothing
  is inferred.
- **AT-Proto writeback (6.11).** `trust-publish` emits each assessment as a public
  `sh.tangled.trust.score` record (lexicon in `lexicons/`) on the service's own PDS, so
  verdicts are auditable provenance on the network. No creds → dry-run (prints the records);
  set `ATPROTO_PDS` / `ATPROTO_IDENTIFIER` / `ATPROTO_PASSWORD` to publish for real.
- **Browser overlay (7.4).** `extension/` is a minimal MV3 content script that injects a
  trust hat onto tangled.org from the same `/score` API. Load unpacked; see `extension/README.md`.
  Confirm the DID selector against the real DOM (the UI analog of confirming NSIDs).
- **Diff-embedding slop signal (6.12).** `trust-embed --build` embeds **every** scraped PR diff
  (Featherless / `Qwen3-Embedding-4B`) into the `diff_vectors` table. The build is idempotent
  and resumable (`pr_id NOT IN diff_vectors`), so re-run it while `trust.backfill` keeps
  filling `pull_requests`, or leave `trust-embed --build --watch` running to keep pace.
  Scoring then cosine-k-NNs each new diff against the embeddings of *currently* known-bad PRs
  (`slop_score` joins `pr_labels` on `clean_merge=0`, so re-labelling never needs a re-embed)
  and hands the max similarity to Claude as a `machine_findings` hint (advisory: it surfaces
  in the explanation and never flips the gate). Vector search stays inside DuckDB; with no
  key, nothing is embedded and the signal is simply absent.
- **Spoken briefing (M7).** `GET /brief/{did}` composes a speakable summary of the decision
  (no DIDs read aloud) and returns `audio/mpeg` when `ELEVENLABS_API_KEY` is set, JSON text
  otherwise. `trust.voice.brief_text` is the composer; it is reused by the API.

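The slop signal's "max cosine similarity against known-bad diffs" step can be illustrated in plain Python. The project keeps vector search inside DuckDB, so this is a swapped-in, in-memory equivalent for reading purposes only; the `slop_similarity` key, the `k` default, and the `machine_findings` function shown here are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def slop_score(new_vec, known_bad_vecs, k=5):
    """Take the k nearest known-bad diffs by cosine similarity; report the maximum."""
    sims = sorted((cosine(new_vec, v) for v in known_bad_vecs), reverse=True)[:k]
    return max(sims) if sims else None      # None: signal absent, not "zero risk"


def machine_findings(new_vec, known_bad_vecs):
    """Build the advisory hint dict for the reviewer; the gate never reads it."""
    score = slop_score(new_vec, known_bad_vecs)
    return {} if score is None else {"slop_similarity": round(score, 3)}
```

Returning an empty dict when there are no known-bad embeddings mirrors the behaviour described above: without an embedding key the signal is missing, which is different from reporting a similarity of zero.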
## What's skipped (and when to add it)

- **Per-PR writeback subject.** `sh.tangled.trust.score` currently keys on the contributor
  DID; carry `pr_id` on the `scores` table to reference a specific PR's `at://` URI.
- **SvelteKit frontend.** The three surfaces ship as built-in static pages (the PRD blesses
  this for the dashboard); swap to SvelteKit if you need the richer UI kit / native overlay.
- **More external signals (6.12): OSV/secret-scan/SAST.** `review_pr` already accepts
  `machine_findings` (the slop similarity is the first one wired in); add the scanners' output
  to that same dict.