Tangled contributor trust scoring (EigenTrust + Claude)

Calibrated, explainable, sybil-resistant trust scores that auto-triage Tangled PRs into fast-lane / normal-queue / needs-human. Two independent signals fused by a gate (not an average): structural trust (EigenTrust over the vouch graph) and content review (Claude reading the diff, blind to author identity).
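
To make "a gate, not an average" concrete, here is a minimal sketch. The threshold names and values, the decision strings, and the `averaged` helper are all assumptions for illustration; the real logic lives in fusion.py and its tuning in config.py.

```python
from typing import Optional

# Hypothetical thresholds for illustration only.
TRUST_FLOOR = 0.05      # below this, structural trust alone forces human review
FAST_LANE_TRUST = 0.60  # at or above this, a PR without high risk may skip the queue
HIGH_RISK = 0.70        # content risk at or above this always needs a human


def decide(structural: float, content_risk: Optional[float]) -> str:
    """Gate two signals instead of averaging them.

    content_risk is None when no content review ran (for example, no API key).
    """
    # An untrusted author is never lifted by a clean-looking diff.
    if structural < TRUST_FLOOR:
        return "needs_human"
    # Assumed rule: a risky diff is not excused by a trusted author.
    if content_risk is not None and content_risk >= HIGH_RISK:
        return "needs_human"
    if structural >= FAST_LANE_TRUST:
        return "fast_lane"
    return "normal_queue"


def averaged(structural: float, content_risk: float) -> float:
    """The fusion the gate deliberately avoids, shown only for contrast."""
    return (structural + (1.0 - content_risk)) / 2.0


sybil_with_clean_diff = decide(0.0, 0.0)
trusted_no_review = decide(0.8, None)
```

With an average, a zero-trust author and a spotless diff would land at 0.5 and could clear a mid-range threshold; with the gate the same input stays in needs_human.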

Built per prd.md through M7: EigenTrust + Claude end to end; LightGBM learned score with isotonic calibration; GraphSAGE trained offline and compared (not served — it doesn't beat M5 on this sparse graph, and the PRD says ship it only if it does); the attestation-gated sensitive-repo tier (6.13); AT-Proto writeback of assessments as records (6.11); the diff-embedding slop signal (6.12); a spoken /brief (ElevenLabs); and the Tangled browser overlay (7.4).

Layout

src/trust/
  config.py      env paths (DATA_ROOT fail-fast) + gate/eigen/review tuning
  db.py          DuckDB schema, feature view, clean_merge label SQL
  ingest.py      M1 Jetstream -> events -> derive typed tables (--probe confirms NSIDs)
  eigentrust.py  M3 SciPy power iteration + BFS path explanation (no graph DB)
  review.py      M4 Claude reviewer, verbatim 6.6 prompt, forced-schema tool use
  fusion.py      M4 gate decide() + scoring worker (score_pr); loads M5 model if present
  learned.py     M5 LightGBM + isotonic calibration + TreeSHAP (optional .[learned] extra)
  gnn.py         M6 GraphSAGE, trained offline + compared vs M5; served only if it wins (.[gnn])
  atproto.py     M7 writeback: assessments published as sh.tangled.trust.score records (6.11)
  api.py         M3/M4 FastAPI: /score /review /leaderboard /metrics /triage + pages
  seed.py        synthetic demo data (trusted core + sybil cluster)
  static/        triage / dashboard / leaderboard pages (thin clients of the API)
extension/         M7 Tangled browser overlay (7.4): MV3 content script, UI only
lexicons/          sh.tangled.trust.score lexicon for the writeback (6.11)

Setup

cp .envrc.example .envrc      # point DATA_ROOT at the external drive; add ANTHROPIC_API_KEY
source .envrc                 # in prod: fails fast if the drive is not mounted
uv venv .venv && source .venv/bin/activate && uv pip install -e .

If DATA_ROOT is unset, the code falls back to a repo-local .data/ directory for development and emits a warning. All large artifacts route under DATA_ROOT (PRD 4.1).
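
The DATA_ROOT behavior can be sketched as follows. This is not the code in config.py; the function name `resolve_data_root` and the exact error type are assumptions.

```python
import os
import warnings
from pathlib import Path


def resolve_data_root(env=os.environ, repo_root=Path(".")) -> Path:
    """Return the data directory, failing fast when a configured path is missing."""
    raw = env.get("DATA_ROOT")
    if not raw:
        # Dev fallback: keep artifacts inside the repo and say so loudly.
        fallback = Path(repo_root) / ".data"
        warnings.warn(f"DATA_ROOT is unset; using dev fallback {fallback}")
        return fallback
    root = Path(raw)
    if not root.is_dir():
        # Assumed behavior: a configured but unmounted drive is a hard error.
        raise RuntimeError(f"DATA_ROOT={root} is not a mounted directory")
    return root
```

Failing on a configured-but-missing path, rather than silently falling back, keeps large artifacts from landing on the wrong disk.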

Demo (no live data or API key required)

One command brings up the whole stack (seed → score loop → API) in split panes:

mprocs            # reads mprocs.yaml; open http://127.0.0.1:8000

Or run the panes by hand:

python -m trust.seed            # load the synthetic vouch graph + labelled PRs
python -m trust.score --loop    # poll + score PRs, write decisions (--loop for a daemon)
python -m trust.api             # serve http://127.0.0.1:8000  (triage / dashboard / leaderboard)

DuckDB is single-writer and a held lock blocks every other open, so each process opens the file briefly (open → work → close) with retry — that's what lets the mprocs panes share one trust.duckdb. Don't run ingest and score as writers at the same time.
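
A minimal sketch of the open → work → close pattern with retry. The helper name `with_connection` and its parameters are hypothetical, and the connection is injected so the example runs without DuckDB; in the real code the `connect` callable would presumably wrap `duckdb.connect(path)` and the lock error would be DuckDB's own I/O exception.

```python
import time
from contextlib import closing


def with_connection(connect, work, attempts=5, delay=0.2, lock_error=Exception):
    """Open, do the work, close; retry with backoff if the open hits a lock."""
    for attempt in range(attempts):
        try:
            conn = connect()
        except lock_error:
            if attempt == attempts - 1:
                raise  # out of retries: surface the lock error
            time.sleep(delay * (2 ** attempt))
            continue
        # closing() guarantees the lock is released even if work() raises.
        with closing(conn):
            return work(conn)


class FakeConn:
    """Stand-in for a database connection, used only for this demo."""
    closed = False

    def close(self):
        self.closed = True


calls = {"n": 0}


def flaky_connect():
    # Simulate another pane holding the file lock for the first two attempts.
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("lock held by another process")
    return FakeConn()


conn_seen = []
result = with_connection(
    flaky_connect,
    lambda conn: conn_seen.append(conn) or "done",
    delay=0.0,
    lock_error=OSError,
)
```

Holding the connection only for the duration of `work` is what keeps the lock window short enough for other processes to get a turn.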

Learned score (M5, optional)

uv pip install -e '.[learned]'   # lightgbm + scikit-learn (no shap needed)
python -m trust.seed
trust-train                      # LightGBM on the features, isotonic-calibrated; prints a reliability curve
python -m trust.score            # the gate now uses calibrated P(clean), not raw EigenTrust

trust-train predicts clean_merge from the per-DID features (with eigentrust_score as a feature, so the model builds on the graph), splits by time, and fits isotonic regression so the output is a real probability (PRD 6.5/6.8). The model is saved under MODEL_DIR; fusion.structural_for loads it automatically and falls back to raw EigenTrust when it's absent (so the base install still runs). Explanations gain the top LightGBM TreeSHAP contributions (merged_pr_count (+1.40), …) via LightGBM's native pred_contrib — no shap/numba dependency.
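
To show what isotonic calibration does to a raw score, here is a pure-Python pool-adjacent-violators sketch. It is a stand-in for illustration, not the repo's code (which uses scikit-learn); the function names and toy data are assumptions, and this variant yields a step function rather than interpolating between knots.

```python
def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function mapping raw score -> P(clean)."""
    blocks = []  # each block: [sum_of_labels, count, highest_score_in_block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Pool adjacent blocks while the earlier mean is not below the later one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def predict(steps, score):
    """Look up the calibrated probability for a raw score."""
    for upper, prob in steps:
        if score <= upper:
            return prob
    return steps[-1][1]  # scores above the training range take the last step


# Toy data: raw model scores and clean_merge labels (1 = clean).
raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
clean = [0, 1, 0, 1, 1, 1]
steps = fit_isotonic(raw, clean)
```

On the toy data the out-of-order pair at 0.2 and 0.3 is pooled into a single 0.5 bin, which is the same mechanism that collapses the tiny synthetic set into very few bins.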

On the tiny synthetic data the model is near-degenerate (the reliability curve has two bins; one revert sends a contributor to 0). That's expected at N≈22 — real history smooths it. To use M5 in a running mprocs demo: trust-train, then restart the score and api panes so they load the model.

What it shows (the PRD deliverable):

live/trusted-clean — authored by carol, trust flows maintainer → alice → carol → fast-lane on structural trust alone.
live/sybil-buggy — authored by a throwaway in an isolated mutual-vouch cluster, starved to 0.000 → needs_human. A clean-looking diff could never lift it (constraint 2). With ANTHROPIC_API_KEY set, Claude also attaches a concrete reason (the diff swaps a constant-time compare for ==).
Dashboard: score distribution, fast-lane rate, 0% false-approval backtest above the threshold, vouch-graph stats.
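
The two demo outcomes above follow from how EigenTrust propagates trust. A minimal pure-Python sketch, assuming a single pre-trusted maintainer, equal-weight vouches, and a damping factor of 0.15 (the repo uses SciPy; the node names beyond those in the text and all parameters here are illustrative):

```python
def eigentrust(vouches, pretrusted, alpha=0.15, iters=50):
    """Power iteration: t <- (1 - alpha) * C^T t + alpha * p."""
    nodes = sorted({n for edge in vouches for n in edge} | set(pretrusted))
    # p: the pre-trust distribution, uniform over pre-trusted nodes.
    p = {n: (1.0 / len(pretrusted) if n in pretrusted else 0.0) for n in nodes}
    out = {n: [dst for src, dst in vouches if src == n] for n in nodes}
    t = dict(p)  # start from pre-trust so unreachable nodes start at zero
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for src in nodes:
            if out[src]:
                share = t[src] / len(out[src])
                for dst in out[src]:
                    nxt[dst] += share
            else:
                # A node that vouches for no one returns its mass to pre-trust.
                for n in nodes:
                    nxt[n] += t[src] * p[n]
        t = {n: (1 - alpha) * nxt[n] + alpha * p[n] for n in nodes}
    return t


vouches = [
    ("maintainer", "alice"),
    ("alice", "carol"),
    ("sybil1", "sybil2"),  # isolated mutual-vouch cluster
    ("sybil2", "sybil1"),
]
scores = eigentrust(vouches, pretrusted={"maintainer"})
```

Because no edge leads from the trusted core into the sybil cluster and the cluster holds no pre-trust, its members never receive any mass, however many times they vouch for each other.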

Live data

python -m trust.ingest --probe --max-events 300   # confirm real sh.tangled.* NSIDs first
python -m trust.ingest                             # firehose -> DuckDB, resumable cursor
python -m trust.score                              # score newly-ingested PRs

The collection→record map in config.COLLECTION_KINDS is best-guess and marked CONFIRM — verify it against the --probe output before trusting derived rows.
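
One way to do that check is to diff the collections seen in a probe sample against the map. The sketch below assumes Jetstream's commit-event JSON shape (`kind` plus a `commit.collection` field); the NSIDs and the `probe` function are placeholders, not the repo's actual COLLECTION_KINDS or --probe implementation.

```python
import json
from collections import Counter

# Hypothetical entries: placeholders, not the repo's actual map.
COLLECTION_KINDS = {
    "sh.tangled.example.vouch": "vouch",
    "sh.tangled.example.pull": "pull_request",
}


def probe(lines):
    """Count collections in sampled events and diff them against the map."""
    seen = Counter()
    for line in lines:
        event = json.loads(line)
        commit = event.get("commit")
        if event.get("kind") == "commit" and commit:
            seen[commit["collection"]] += 1
    unmapped = sorted(set(seen) - set(COLLECTION_KINDS))  # on the wire, not in the map
    unseen = sorted(set(COLLECTION_KINDS) - set(seen))    # in the map, never observed
    return seen, unmapped, unseen


sample = [
    json.dumps({"kind": "commit", "commit": {"collection": "sh.tangled.example.vouch"}}),
    json.dumps({"kind": "commit", "commit": {"collection": "sh.tangled.example.unknown"}}),
    json.dumps({"kind": "identity"}),
]
seen, unmapped, unseen = probe(sample)
```

An entry in `unseen` suggests a guessed NSID that may be wrong; an entry in `unmapped` is a real collection the derive step would currently ignore.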

Tests

python -m pytest        # eigentrust starves sybils; gate never lifts untrusted; schema parses

GraphSAGE (M6, optional)

uv pip install -e '.[gnn]'   # torch + torch-geometric (multi-GB)
trust-seed && trust-train && trust-gnn   # trains GraphSAGE offline, compares vs M5

trust-gnn builds a PyG graph (positive vouches + co-contribution edges; per-DID feature vectors as node features; denounce-count rides as a feature, no signed-edge GNN), trains an inductive 2-layer GraphSAGE on a time split, then writes a verdict comparing its holdout accuracy to M5's. fusion.structural_for serves the GNN only if gnn_wins — on the synthetic graph it loses to M5, so the system keeps the calibrated baseline. That gate is the PRD's rule ("ship the GNN only if it beats the baseline and is stable"), enforced in code.
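
The serving rule can be sketched as a small predicate. The source only says the verdict compares holdout accuracy; measuring stability as the spread across several training runs, and the `margin` and `max_std` values, are assumptions made for this example.

```python
from statistics import mean, pstdev


def gnn_wins(gnn_runs, baseline_acc, margin=0.01, max_std=0.02):
    """True only if the GNN beats the baseline by a margin and is stable across runs."""
    beats = mean(gnn_runs) > baseline_acc + margin
    stable = pstdev(gnn_runs) <= max_std
    return beats and stable


def served_model(gnn_runs, baseline_acc):
    """Fall back to the calibrated M5 baseline unless the GNN clearly wins."""
    return "gnn" if gnn_wins(gnn_runs, baseline_acc) else "m5"


loses = served_model([0.70, 0.72, 0.71], baseline_acc=0.80)
wins = served_model([0.90, 0.91, 0.90], baseline_acc=0.80)
unstable = served_model([0.95, 0.70, 0.99], baseline_acc=0.80)
```

Defaulting to the baseline means a noisy or marginal GNN result never changes what is served.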

lightgbm and torch each bundle libomp; loading both in one process hangs on macOS. trust/__init__.py sets KMP_DUPLICATE_LIB_OK / OMP_NUM_THREADS before either is imported.
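
The ordering matters more than the values: the variables must be in the environment before the first OpenMP-backed import. A sketch of what such an `__init__.py` preamble might look like; the values shown are assumptions, since the source names the variables but not their settings.

```python
import os

# Assumed values. setdefault leaves any value the user already exported untouched.
os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "TRUE")
os.environ.setdefault("OMP_NUM_THREADS", "1")

# Only after the environment is prepared should the OpenMP-backed libraries load:
# import lightgbm
# import torch
```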

Native + compliance surfaces (M7)

Attestation-gated sensitive-repo tier (6.13). A repo in the sensitive tier requires a contributor-issued jurisdiction attestation before fast-lane/merge; a missing one forces needs_human regardless of trust or content risk — the only control that overrides the score, so it's checked first in decide(). The demo seeds a sensitive repo where an attested DID fast-lanes and an unattested high-trust DID is blocked at calibrated_prob 1.00. Only declared/asserted facts are used; nothing is inferred.
AT-Proto writeback (6.11). trust-publish emits each assessment as a public sh.tangled.trust.score record (lexicon in lexicons/) on the service's own PDS, so verdicts are auditable provenance on the network. No creds → dry-run (prints the records); set ATPROTO_PDS / ATPROTO_IDENTIFIER / ATPROTO_PASSWORD to publish for real.
Browser overlay (7.4). extension/ is a minimal MV3 content script that injects a trust hat onto tangled.org from the same /score API. Load unpacked; see extension/README.md. Confirm the DID selector against the real DOM (the UI analog of confirming NSIDs).
Diff-embedding slop signal (6.12). trust-embed --build embeds every scraped PR diff (Featherless / Qwen3-Embedding-4B) into the diff_vectors table. The build is idempotent and resumable (it selects pr_id NOT IN diff_vectors), so you can re-run it while trust.backfill keeps filling pull_requests, or leave trust-embed --build --watch running to keep pace. Scoring then runs a cosine k-NN of each new diff against the embeddings of currently known-bad PRs (slop_score joins pr_labels clean_merge=0, so re-labelling never needs a re-embed) and hands the max similarity to Claude as a machine_findings hint. The hint is advisory: it surfaces in the explanation and never flips the gate. Vector search stays inside DuckDB; with no key set, nothing is embedded and the signal is simply absent.
Spoken briefing (M7). GET /brief/{did} composes a speakable summary of the decision (no DIDs read aloud) and returns audio/mpeg when ELEVENLABS_API_KEY is set, JSON text otherwise. trust.voice.brief_text is the composer; reused by the API.
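
For the diff-embedding slop signal in the list above, the core computation is a max cosine similarity against known-bad embeddings. The repo keeps this search inside DuckDB; the pure-Python version below is a stand-in, and the `slop_similarity` key, the `k` default, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def slop_score(new_vec, bad_vecs, k=5):
    """Max similarity among the k nearest known-bad diffs, or None if there are none."""
    nearest = sorted((cosine(new_vec, v) for v in bad_vecs), reverse=True)[:k]
    return max(nearest) if nearest else None


def machine_findings(new_vec, bad_vecs):
    """Advisory hint for the reviewer; an empty dict means the signal is absent."""
    score = slop_score(new_vec, bad_vecs)
    return {} if score is None else {"slop_similarity": round(score, 3)}


hint = machine_findings([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
absent = machine_findings([1.0, 0.0], [])
```

Returning an empty dict when nothing is embedded matches the "signal is absent" behavior: the reviewer sees no hint rather than a misleading zero.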

What's skipped (and when to add it)

Per-PR writeback subject. sh.tangled.trust.score currently keys on the contributor DID; carry pr_id on the scores table to reference a specific PR's at:// URI.
SvelteKit frontend. The three surfaces ship as built-in static pages (the PRD blesses this for the dashboard); swap to SvelteKit if you need the richer UI kit / native overlay.
More external signals (6.12): OSV/secret-scan/SAST. review_pr already accepts machine_findings (the slop similarity is the first one wired in) — add the scanners' output to that same dict.
