prd.md at main · veikka.tngl.sh/sunstead

Veikka Silvekoski Initial commit: sunstead trust scoring project 14hrs ago
c76a8b80
# PRD: hybrid contributor trust scoring for Tangled (GNN + Claude)

You are building a backend service that scores the trustworthiness of contributors on **Tangled**, a code forge built on the **AT Protocol**. The score auto-triages incoming pull requests so maintainers who approve hundreds of PRs a day only review the ones that need a human. The score must be **calibrated** (a real probability), **explainable** (a maintainer can see why), and **adversarially robust** (resistant to throwaway identities pushing machine-generated low-quality code).

The stack is deliberately lean and self-hosted: a single embedded **DuckDB** store on an external drive, plain Python processes, and no managed cloud data services. Read sections 0, 1, and 2 before writing any code. Build strictly in the order in section 5.

---

## 0. Mission

Produce, per contributor DID, a calibrated probability that their next contribution is safe to fast-lane, plus a short human-readable reason, by fusing two independent signals:

- **Structural trust (the "who"):** the contributor's position in the vouch graph and their historical track record. Sybil-resistant. Built with EigenTrust first, then optionally upgraded with a GNN.
- **Content review (the "what"):** Claude reading the actual diff and discussion of a specific PR to catch problems the graph cannot see.

These are fused by a **policy/gate**, not a naive average. The output drives a decision: fast-lane, normal queue, or route to a human with a reason attached.

---

## 1. Threat model and hard constraints (non-negotiable)

The attacker spins up fresh DIDs and pushes LLM-generated code that looks correct but is subtly wrong, to get it merged with minimal review. Every design choice exists to defeat this:

1. **The structural signal must be load-bearing and sybil-resistant.** Trust must flow from a trusted seed set; a cluster of fake DIDs vouching for each other must be starved. This is why EigenTrust (a trust-flow algorithm) is the core, not a vouch count.
2. **Claude judges content, never identity.** Claude must not see or infer author reputation. A clean-looking diff from an untrusted DID must NOT lift that DID into the fast-lane. Identity is the graph's job; content is Claude's job.
3. **The score must be calibrated.** 0.9 must mean roughly 90% of such contributors produce clean PRs.
4. **The score must be explainable.** Emit a structured explanation (top factors plus Claude's rationale), never a bare number.
5. **Inform, do not enforce.** Tangled's vouching has no punitive consequence; it informs a decision. This system recommends and routes; it does not block users.

---

## 2. Two decisions that shape the whole build

**The stack is serviceless and self-hosted.** All state lives in a single embedded DuckDB file on an external drive. The ingester, scoring worker, and API are plain Python processes. There is no message broker, no separate database server, and no managed cloud data service. Models are trained offline; Anthropic serves the Claude inference call. This keeps the moving parts minimal and the footprint small, which suits both the hardware constraint and a hackathon timeline. The resumability a broker would give is already covered by the AT Protocol firehose itself: the Jetstream cursor lets you replay, and the raw event log in DuckDB is the durable record.

**There is no graph database, and the agent must not add one.** You need graph *computation*, not a graph *engine*, and they are different layers. The vouch graph lives as a plain edge list in a `vouches` table in DuckDB. EigenTrust reads those rows into a SciPy sparse matrix and runs power iteration in memory. GraphSAGE builds a PyTorch Geometric `edge_index` tensor from the same query. At hackathon scale the graph is a few thousand edges and fits in memory many times over, so there is no performance case for Neo4j or any graph DB, and adding one only costs a service to run and a query language to wire up. Path-based explanations ("trust reaches this contributor through maintainers X and Y") are done with a short in-memory breadth-first walk from the seed during the EigenTrust run, not with graph-DB traversal.

---

## 3. Architecture

```
        Jetstream (filtered AT Proto firehose, JSON over WebSocket)
                                |
                                v
                Ingester  (plain Python process)
        (confirm NSIDs; persist cursor; batched appends)
                                |
                                v
            DuckDB file  [on external drive: $DUCKDB_PATH]
        - events          (raw append log)
        - contributors
        - vouches         (edge list)   <- the whole graph; no graph DB
        - pull_requests   (lifecycle)
        - features        (SQL views / tables)
        - scores
        - ingest_state    (cursor)
                                |
            +-------------------+--------------------+
            |                                        |
            v                                        v
   STRUCTURAL SIGNAL                          CONTENT SIGNAL
   (reads the vouches edge list)              (Claude via Anthropic API)
   - EigenTrust (SciPy sparse)                - reviews a PR's diff +
   - LightGBM on features                       discussion; returns
   - GraphSAGE (PyG; trained offline,           structured risk + flags
     inference served in-process)               + rationale
            |                                        |
            +-------------------+--------------------+
                                |
                                v
                   FUSION POLICY / GATE  (section 6.7)
                                |
                                v
        Calibrated score + decision + explanation
                                |
                                v
   FastAPI (plain process)  ->  /score  /review  /leaderboard
                             +  built-in /dashboard (reads DuckDB)
                                |
                                v
   (stretch) write assessment back as an AT Proto record
```

**Stack roles**

- **DuckDB (single embedded store).** Holds everything: the raw event log, the curated tables (contributors, the vouch edge list, PR lifecycle, scores), and the feature views. One file on the external drive. Batch-append from the ingester (single writer); the API and the structural step read from it. Excellent at the analytical aggregations the features need.
- **DuckDB VSS extension or sqlite-vec (optional).** Diff-embedding k-NN for the slop-similarity angle: embed diffs and find near-duplicates of known-bad patterns. Keeps vector search serviceless, no separate search engine.
- **Built-in dashboard (recommended, on-theme).** The challenge is about observability and traceability, so the API serves a small static `/dashboard` page that reads DuckDB aggregates. Low effort and your demo centerpiece. A self-hosted Grafana plus Prometheus is an option for richer charts, but it adds services and disk, so default to the built-in page.
- **Plain Python processes.** The ingester, the scoring worker, and the FastAPI service. Run locally during the hackathon. For a hosted demo, one small VM or a single container; no managed platform required.
- **Offline training and Anthropic.** The GNN is trained offline with checkpoints on the drive and served in-process; Anthropic serves the Claude inference call.

---

## 4. Stack

- **Language:** Python 3.11+ for the whole scoring backend (the GNN forces PyTorch; keep one backend language).
- **Ingest:** `websockets` against a public Jetstream instance; batched appends written directly to DuckDB.
- **Store:** DuckDB, embedded, a single file on the external drive (`$DUCKDB_PATH`), for the event log, curated tables, feature views, and scores. Optional DuckDB VSS extension (or sqlite-vec) for diff-embedding similarity.
- **Structural:** NumPy/SciPy sparse for EigenTrust; PyTorch Geometric for the GNN.
- **Learned baseline:** LightGBM; SHAP for explanations.
- **Content:** Anthropic SDK. Default `claude-sonnet-4-6`; cheap pre-pass `claude-haiku-4-5-20251001`; escalate hard cases to `claude-opus-4-8`. Temperature 0. Force the output schema with tool use / structured outputs.
- **Observability:** a built-in FastAPI `/dashboard` reading DuckDB; self-hosted Grafana plus Prometheus optional if you want richer charts.
- **API and runtime:** FastAPI (Python), co-located with the scorer, run as a plain process. The SvelteKit frontend talks to it directly. If you want the API in your TS house style, a thin Hono (Bun) gateway can front the Python scorer, but the direct path avoids a cross-language hop for the hackathon. For a hosted demo, a single small VM or container.
- **Frontend:** SvelteKit + Svelte 5 (runes), shipped via `@sveltejs/adapter-node`. UI kit bits-ui; styling Lightning CSS with the six-layer cascade; icons unplugin-icons + iconify; charts layerchart; tables tanstack/table-core; toasts svelte-sonner; validation zod. Full screen spec in section 7.
- **Tooling (scoring service):** uv for deps, ruff for lint and format, ty for type checking, pytest for tests.

### 4.1 Local disk: route every large file to the external drive

The development machine is short on space, so all large local artifacts live on a mounted external drive, never on the home or system disk. This is cheap to enforce because the heavy local footprint is small and well contained: the data store is a single DuckDB file, and the rest is the Python and ML toolchain (torch plus torch-geometric are multi-GB), the model and embedding caches, and the transient backfill staging. Route all of it through a single `DATA_ROOT` env var.

Set `DATA_ROOT` to the mounted drive and create the subtree once (macOS shown; on Linux use a path like `/mnt/ext/tangled-trust`):

```bash
# .envrc  (source this before running anything in the project)
export DATA_ROOT="/Volumes/EXT/tangled-trust"   # the external drive
mkdir -p "$DATA_ROOT"/{venv,pip,hf,torch,pyg,staging,diffs,models,duckdb,logs}

# Python toolchain (the single biggest hog: torch + torch-geometric wheels)
export PIP_CACHE_DIR="$DATA_ROOT/pip"
export UV_CACHE_DIR="$DATA_ROOT/pip"            # if using uv
# Create the venv ON the drive, not inside the repo:
#   python -m venv "$DATA_ROOT/venv" && source "$DATA_ROOT/venv/bin/activate"

# Model and embedding caches (GBs if you do local diff embeddings)
export HF_HOME="$DATA_ROOT/hf"
export TRANSFORMERS_CACHE="$DATA_ROOT/hf"
export SENTENCE_TRANSFORMERS_HOME="$DATA_ROOT/hf"
export TORCH_HOME="$DATA_ROOT/torch"

# App paths, read from env, all defaulting under DATA_ROOT
export DUCKDB_PATH="$DATA_ROOT/duckdb/trust.duckdb"   # primary data store
export PYG_ROOT="$DATA_ROOT/pyg"               # PyTorch Geometric processed-dataset cache
export STAGING_DIR="$DATA_ROOT/staging"        # Jetstream backfill dumps (NDJSON/Parquet)
export DIFF_CORPUS_DIR="$DATA_ROOT/diffs"      # cached PR diffs/patches for eval and training
export MODEL_DIR="$DATA_ROOT/models"           # GraphSAGE + LightGBM checkpoints, calibrators
export LOG_DIR="$DATA_ROOT/logs"
```

What this covers, by component:

- **The DuckDB file (`DUCKDB_PATH`):** the entire data store (event log, curated tables, features, scores) is one file on the drive, so the bulk of the data is on the external drive by design.
- **Python venv and pip/uv cache:** the torch and torch-geometric wheels are the largest local cost; both the environment and the download cache live on the drive.
- **PyG dataset cache (`PYG_ROOT`) and checkpoints (`MODEL_DIR`):** the GNN's cached graph tensors and saved weights from offline training.
- **Hugging Face / sentence-transformers / torch-hub caches:** any local embedding model for the diff-similarity path (DuckDB VSS).
- **`STAGING_DIR`:** the raw Jetstream backfill, written as NDJSON or Parquet before it is loaded into DuckDB. Transient but large during a full replay; write it to the drive and delete after load.
- **`DIFF_CORPUS_DIR`:** cached PR patch text for the Claude eval fixture and any training set.

Rules:

- Every component reads these from env and must default its large-output paths under `DATA_ROOT`. Do not hardcode repo-relative or home-relative paths for anything that grows, including the DuckDB file.
- At process startup, assert `DATA_ROOT` exists and is writable, and fail fast with a clear message if the drive is not mounted, so a half-run never scatters files (or the DuckDB file) onto the system disk.
- Only the repo and the small `.env` (the Anthropic API key) stay on the main disk. The data store, the venv, and all caches are on the drive.
- A USB external drive is slower than internal SSD, so DuckDB queries, PyG dataset processing, and disk-heavy steps run somewhat slower. At hackathon-scale data this is fine; keep the drive mounted for the whole run.
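
The startup rule above can be sketched as a small helper that every process calls before touching any path. This is a minimal sketch, not the required implementation: the function names `require_data_root` and `duckdb_path` are hypothetical, and the fallback path simply mirrors the `.envrc` layout shown earlier.

```python
import os
import tempfile
from pathlib import Path


def require_data_root(env=os.environ) -> Path:
    """Return DATA_ROOT as a Path, or raise if it is unset, missing, or read-only."""
    raw = env.get("DATA_ROOT")
    if not raw:
        raise RuntimeError("DATA_ROOT is not set; source .envrc first")
    root = Path(raw)
    if not root.is_dir():
        raise RuntimeError(f"DATA_ROOT {root} does not exist; is the external drive mounted?")
    try:
        # Creating and removing a temp file exercises the real write path,
        # not just the permission bits.
        with tempfile.NamedTemporaryFile(dir=root):
            pass
    except OSError as exc:
        raise RuntimeError(f"DATA_ROOT {root} is not writable: {exc}") from exc
    return root


def duckdb_path(env=os.environ) -> Path:
    """Resolve DUCKDB_PATH, defaulting under DATA_ROOT (hypothetical helper)."""
    root = require_data_root(env)
    return Path(env.get("DUCKDB_PATH", root / "duckdb" / "trust.duckdb"))
```

Calling `require_data_root()` first in the ingester, the scoring worker, and the API means an unmounted drive stops the process before any file is created.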

---

## 5. Build order (build in this exact order; each milestone must run before the next)

- **M0 - Set up the local stack.** Mount the external drive, source `.envrc`, create the venv and the DuckDB file under `DATA_ROOT`, install dependencies. Verify `DATA_ROOT` is writable. No services to provision.
- **M1 - Ingest.** Jetstream to DuckDB with a persisted cursor and historical backfill; a step derives typed rows from the raw event log. Confirm the exact Tangled collection names (6.1). Goal: events landing in DuckDB, resumable after a crash.
- **M2 - Dataset.** Reconstruct PR lifecycles, mine the clean-merge label, build per-DID features as DuckDB SQL views or a batch job (6.2, 6.3).
- **M3 - Structural baseline + end-to-end demo.** EigenTrust over the `vouches` table, a `/score/{did}` endpoint, and the triage queue plus leaderboard screens (section 7). After M3 you have a working, sybil-resistant, demoable system with zero ML training.
- **M3.5 - Observability.** The dashboard screen (section 7) reading `/metrics`: trust-score distribution, fast-lane rate, false-approval budget, vouch-graph stats, and ingest lag. Operational telemetry (events/sec, API latency, Claude cost) goes to Prometheus + Grafana, not this screen. Low effort, directly on-theme, and your demo backdrop.
- **M4 - Content layer + decisions.** Claude review component and the fusion gate (6.6, 6.7), optionally enriched with the code-security and supply-chain findings (6.12) as structured input to the reviewer. Now you have the full hybrid: EigenTrust + Claude.
- **M5 - Learned score.** LightGBM on the features (with the EigenTrust score as a feature), calibrated (6.5, 6.8).
- **M6 - GNN upgrade (stretch).** GraphSAGE trained offline; serve inference in-process; compare against M5. Ship only if it beats the baseline and is stable.
- **M7 - Surfaces (stretch).** Write assessments back as AT Proto records (6.11), add the Tangled-native browser-extension overlay (section 7), the attestation-gated sensitive-repo tier (6.13), and/or an ElevenLabs voice briefing on the API.

The GNN is M6 on purpose: on a new, sparsely vouched network it will likely not beat M5 and is the most likely thing to break mid-demo. Always have M4 working first.

---

## 6. Component specs

### 6.1 Ingestion (Jetstream to DuckDB)

Connect a websocket to a public Jetstream instance, filtered server-side to only the collections you need:

```
wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=sh.tangled.*
```

- Also subscribe to `app.bsky.graph.*` if you want the cross-ATmosphere social signal (follower graph, account age).
- **Confirm the exact NSIDs. Do NOT hardcode guesses.** The Tangled lexicons live in the `tangled.org/tangled.org/core` repo; read them, and log a sample of live Jetstream events to see the real `collection` values for pull requests, vouches, CI/pipeline ("spindle") runs, issues, comments, and stars. Known facts: Tangled records live under `sh.tangled.*`; vouch/denounce records are public records on the issuer's PDS and each carries a reason; CI emits pull_request / push / manual pipeline events. Verify everything else against source.
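- **Writer:** DuckDB is single-writer and OLAP, so do NOT insert row-by-row from the socket handler. Buffer events in memory and append in batches to the `events` table (or stage them as Parquet under `STAGING_DIR` and load). A single ingester process owns the write path; everything else reads. Note that the single-writer rule applies to the whole file: while one process holds it open read-write, other processes cannot open it, even read-only. So either run the readers in the same process as the ingester, or have the ingester open the file only for the duration of each batch. A derive step turns the raw log into the typed tables (contributors, vouches, pull_requests).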
- **Cursor:** persist the `time_us` of the last processed event in an `ingest_state` row (or a small cursor file under `DATA_ROOT`). On reconnect, resume from that cursor minus a few seconds for gapless playback. An absent cursor means live-tail; a past cursor backfills, which is how you build the training history. This cursor plus the durable `events` log is the resumability that a broker would otherwise provide.
- **Account and Identity events** arrive regardless of the collection filter; use Identity events to refresh a DID's handle and document.
- Each event gives `did`, `time_us`, and a `commit` with `operation` (create/update/delete), `collection`, `rkey`, and the JSON `record`.
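
The writer and cursor rules above can be sketched without any network or database code. This is a minimal sketch under stated assumptions: `REWIND_US`, `resume_cursor`, `to_row`, and `BatchBuffer` are hypothetical names, the five-second rewind and the 500-row batch size are placeholder values, and `flush_fn` stands in for whatever performs the batched DuckDB append and the `ingest_state` update.

```python
import json
from typing import Callable, Optional

REWIND_US = 5_000_000  # "a few seconds" of overlap, in microseconds (assumed value)


def resume_cursor(last_time_us: Optional[int]) -> Optional[int]:
    """Cursor to send on reconnect; None means live-tail (no stored cursor)."""
    if last_time_us is None:
        return None
    return max(0, last_time_us - REWIND_US)


def to_row(event: dict) -> Optional[tuple]:
    """Flatten one commit event into an `events` row; return None for other kinds."""
    commit = event.get("commit")
    if commit is None:
        return None  # Account / Identity events are handled by a separate path
    return (
        event["did"], event["time_us"], commit["operation"],
        commit["collection"], commit["rkey"],
        json.dumps(commit.get("record")),  # None is stored as JSON null
    )


class BatchBuffer:
    """Collect rows in memory and hand them to `flush_fn` in batches."""

    def __init__(self, flush_fn: Callable[[list], None], max_rows: int = 500):
        self.flush_fn = flush_fn
        self.max_rows = max_rows
        self.rows: list = []
        self.last_time_us: Optional[int] = None

    def add(self, event: dict) -> None:
        row = to_row(event)
        if row is not None:
            self.rows.append(row)
        self.last_time_us = event["time_us"]  # candidate cursor for ingest_state
        if len(self.rows) >= self.max_rows:
            self.flush()

    def flush(self) -> None:
        if self.rows:
            self.flush_fn(self.rows)  # e.g. one batched INSERT plus the cursor update
            self.rows = []
```

Because the rewind deliberately re-delivers a few seconds of events, the load into `events` should tolerate duplicates, for example by de-duplicating on `did`, `collection`, `rkey`, and `time_us` in the derive step.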

### 6.2 Data model (all in the DuckDB file)

Every table lives in the single DuckDB file at `$DUCKDB_PATH` on the external drive.

- `events(did, time_us, operation, collection, rkey, record JSON, ...)` -- the raw append log, written in batches by the ingester
- `contributors(did PK, handle, did_created_at, pds_host, first_seen)`
- `vouches(voucher_did, subject_did, polarity int{+1,-1}, reason text, evidence_uri, created_at, weight)` -- this is the entire graph; no graph DB
- `pull_requests(pr_id PK, author_did, repo, target, opened_at, ci_status, merged bool, merged_at, closed_unmerged bool, additions int, deletions int, files_touched int, diff_text, discussion_len int)`
- `pr_followups(pr_id, reverted bool, patched_same_lines_within_n_days bool)`
- `features` -- per-DID aggregates as a SQL view or a materialized table refreshed by a batch step (merged counts, revert rate, CI pass rate, diff-size stats, discussion length)
- `scores(did, as_of, structural_trust, content_risk, calibrated_prob, decision, explanation_json)`
- `ingest_state(stream, last_time_us)`

### 6.3 Label mining (the supervised target)

For each historical PR, derive a binary `clean_merge` label automatically:

- **1 (clean):** merged AND CI passed AND not reverted AND the same lines not patched within N days (default N = 14).
- **0 (not clean):** reverted, or closed unmerged, or repeated CI failure, or a quick follow-up fix to the same lines.
- Drop PRs too recent for the N-day window to have elapsed.

Aggregate to a per-DID signal. **Split by time, not randomly**, so you never train on the future.
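
The labeling rules and the time-based split can be sketched as two small functions. This is a sketch under stated assumptions: the `PR` dataclass is a hypothetical in-memory shape (the real rows come from `pull_requests` and `pr_followups`), and "repeated CI failure" is collapsed into a single `ci_passed` boolean for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PR:
    opened_at: datetime
    merged: bool
    ci_passed: bool                   # simplification of "repeated CI failure"
    closed_unmerged: bool = False
    reverted: bool = False
    patched_same_lines: bool = False  # follow-up fix within the N-day window
    merged_at: Optional[datetime] = None


def clean_merge(pr: PR, now: datetime, n_days: int = 14) -> Optional[int]:
    """1 = clean, 0 = not clean, None = no verdict yet (drop from the dataset)."""
    if pr.closed_unmerged or pr.reverted or pr.patched_same_lines:
        return 0
    if not pr.merged or pr.merged_at is None:
        return None  # still open: no outcome to learn from
    if not pr.ci_passed:
        return 0
    if now - pr.merged_at < timedelta(days=n_days):
        return None  # too recent: the follow-up window has not elapsed
    return 1


def temporal_split(prs, cutoff: datetime):
    """Train on PRs opened before `cutoff`, evaluate on the rest."""
    train = [p for p in prs if p.opened_at < cutoff]
    held_out = [p for p in prs if p.opened_at >= cutoff]
    return train, held_out
```

Negative evidence is checked before the window test, so a PR reverted two days after merge is labeled 0 immediately instead of being dropped as too recent.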

### 6.4 Structural signal: EigenTrust (required baseline)

Read the edge list from the DuckDB `vouches` table, build a row-normalized sparse matrix, seed on the trusted maintainer DID(s), and run personalized power iteration:
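
```python
import numpy as np
from scipy.sparse import csr_matrix, diags

def eigentrust(rows, cols, weights, n, seed_idx, alpha=0.15, iters=50):
    # rows, cols, weights: SELECT voucher_did, subject_did, weight FROM vouches,
    #   with DIDs mapped to indices 0..n-1.
    # Edge weight before normalization: base 1.0, scaled up if the vouch carries
    #   PR evidence, scaled down by age (time decay).
    W = csr_matrix((weights, (rows, cols)), shape=(n, n), dtype=float)
    out = np.asarray(W.sum(axis=1)).ravel()
    inv = np.divide(1.0, out, out=np.zeros_like(out), where=out > 0)
    C = diags(inv) @ W  # C[i, j] = normalized trust i places in j; non-empty rows sum to 1
    p = np.zeros(n)     # seed vector: mass on the maintainer DID(s), normalized
    p[seed_idx] = 1.0
    p /= p.sum()
    t = p.copy()
    for _ in range(iters):  # alpha: restart probability ~0.15
        t = (1 - alpha) * (C.T @ t) + alpha * p
        t = t / t.sum()
    return t  # t[i] is the structural trust; expose it as a signal and as a model feature
```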

- **Denounces:** classic EigenTrust assumes a non-negative stochastic matrix. Keep it simple: a denounce zeroes trust into that node and is recorded as a negative node feature for the learned models. Do NOT make distrust flow transitively.
- Seeding on the maintainer makes scores viewer-relative, matching Tangled's circle philosophy but propagated across the whole graph with decay.
- **Path explanation:** during the run, keep the edge list in memory and do a short BFS from the seed to reconstruct the trust path for the explanation object. No graph DB.
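
The path explanation can be sketched as a plain breadth-first search over the in-memory edge list. This is a minimal sketch: `trust_path` and the `max_hops` cap are hypothetical, and BFS returns the shortest vouch chain, which is an approximation of the "dominant" path rather than the highest-weight one.

```python
from collections import deque


def trust_path(edges, seeds, target, max_hops=4):
    """Shortest vouch chain from any seed to `target`, or None if unreachable.

    edges: iterable of (voucher_did, subject_did) pairs, positive vouches only.
    """
    out = {}
    for voucher, subject in edges:
        out.setdefault(voucher, []).append(subject)

    parent = {s: None for s in seeds}  # doubles as the visited set
    queue = deque((s, 0) for s in seeds)
    while queue:
        node, hops = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parents back to the seed
                path.append(node)
                node = parent[node]
            return path[::-1]
        if hops == max_hops:
            continue
        for nxt in out.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append((nxt, hops + 1))
    return None
```

A `None` result is itself informative: a cluster of DIDs that only vouch for each other has no path from the seed, which is the sybil case the structural signal is meant to starve.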

### 6.5 Learned signal: LightGBM, then GraphSAGE

**LightGBM (M5, reliable):** predict `clean_merge` from per-DID features (read from the DuckDB `features` view). Include `eigentrust_score` as a feature so the model builds on the graph signal. Suggested features:

```
eigentrust_score, did_age_days, merged_pr_count, revert_rate, ci_pass_rate,
close_without_merge_ratio, mean_diff_size, mean_files_touched, churn,
mean_discussion_len, bsky_graph_degree, bsky_account_age, denounce_count
```

Trains in seconds, resists overfitting at small N far better than a net, and gives SHAP explanations. Save the model and calibrator under `MODEL_DIR`. Calibrate the output (6.8).

**GraphSAGE GNN (M6, stretch upgrade):** an inductive node-classification model.

```python
# nodes: contributors, with the feature vector above as node features x
# edges: built from the vouches table into a PyG edge_index tensor (positive,
#        weighted), plus co-contribution edges; no graph DB involved
# task: node-level binary classification against clean_merge
# model: GraphSAGE, 2 layers, hidden 64, out 1; neighbor sampling (inductive)
# train OFFLINE: BCEWithLogitsLoss on labeled nodes, temporal split;
#   checkpoints + PyG cache under MODEL_DIR / PYG_ROOT on the drive
# serve inference in-process: sigmoid(logit) -> structural_trust_gnn
```

- Use the inductive variant so it generalizes to unseen contributors (cold start).
- **Signed edges:** either use a signed GNN (SignedGCN) or, simpler, keep the GNN on positive vouch edges and pass denounce-count as a node feature.
- GNN explanations are weak; the human-facing explanation stays the SHAP factors and/or the EigenTrust path plus Claude's rationale.

### 6.6 Content signal: Claude review

Assesses ONE PR's actual content. **Cost gate:** do not call the expensive model on every PR.

- `structural_trust >= T_HIGH`: skip the Sonnet review unless the diff touches security-sensitive paths.
- `T_LOW <= structural_trust < T_HIGH` (ambiguous band): run the review. This is where Claude earns its keep.
- `structural_trust < T_LOW`: run the review to attach a concrete reason for the human.
- Optionally run a 1-call Haiku pre-pass everywhere to decide whether a Sonnet review is warranted.
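
The three trust bands of the cost gate can be sketched as one pure function. This is a sketch only: `GateConfig`, `review_plan`, and the threshold values 0.3 and 0.8 are hypothetical placeholders (real thresholds come from calibration), and how a diff is classified as touching security-sensitive paths is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    T_LOW: float = 0.3   # placeholder; set from calibration
    T_HIGH: float = 0.8  # placeholder; set from calibration


def review_plan(structural_trust, touches_sensitive_paths, cfg=GateConfig()):
    """Return (run_review, why) for one PR, following the three trust bands."""
    if structural_trust >= cfg.T_HIGH:
        if touches_sensitive_paths:
            return True, "high trust, but the diff touches security-sensitive paths"
        return False, "high trust; review skipped"
    if structural_trust >= cfg.T_LOW:
        return True, "ambiguous band"
    return True, "low trust; review attaches a concrete reason for the human"
```

Returning the reason alongside the boolean keeps the skip decision itself auditable, which fits the explainability constraint in section 1.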

**Input:** the diff, PR title and description, and discussion text, truncated to a token budget. **No author identity, handle, or history.**

**Model:** `claude-sonnet-4-6`, temperature 0, output forced to the JSON schema via tool use.

**System prompt for this component (use verbatim):**

```
You are a code-contribution reviewer for an open-source trust system. You assess ONE
pull request's actual content for quality and safety. You do not decide whether to
merge; you produce a structured risk assessment that a separate policy layer combines
with an identity-trust signal.

Hard rules:
- Judge only the artifact in front of you: the diff, the PR title and description, and
  the discussion. You are given NO information about the author's identity, reputation,
  or history, and you must not speculate about it. Identity trust is handled elsewhere.
- Your job is to catch problems a reputation signal cannot see: code that looks correct
  but is subtly wrong, plausible-looking machine-generated filler ("slop"),
  security-sensitive changes, leaked secrets or credentials, license violations, and
  changes whose stated intent does not match what the code does.
- Prefer flagging uncertainty over approving. If the diff is large, unclear, or you
  cannot verify correctness, say so and set review_recommended. Never rubber-stamp.
- Be specific. Every flag must point to concrete lines or patterns, not vibes.
- Output ONLY the structured object specified by the tool. No prose outside it.
```

**Output schema (tool use):**

```json
{
  "content_risk": "float 0.0 (clearly safe/trivial) to 1.0 (clearly broken or dangerous)",
  "flags": [
    {
      "type": "subtle_bug | slop | security | secret_leak | license | intent_mismatch | untested | oversized | other",
      "severity": "low | med | high",
      "location": "file and/or line reference",
      "explanation": "concrete reason tied to the code"
    }
  ],
  "summary": "1-3 sentence plain-language rationale, suitable to read aloud to a maintainer",
  "review_recommended": "boolean"
}
```
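
The descriptive schema above still has to be turned into a real JSON Schema for a tool definition. A minimal sketch follows, assuming the Anthropic Messages API shape of a tool (`name`, `description`, `input_schema`) and forcing it with `tool_choice={"type": "tool", "name": ...}`. The tool name `submit_review` and the helper `check_review` are hypothetical, and whether every field should be required is an assumption.

```python
FLAG_TYPES = ["subtle_bug", "slop", "security", "secret_leak", "license",
              "intent_mismatch", "untested", "oversized", "other"]
SEVERITIES = ["low", "med", "high"]

REVIEW_TOOL = {
    "name": "submit_review",  # hypothetical tool name
    "description": "Record the structured risk assessment for one pull request.",
    "input_schema": {
        "type": "object",
        "properties": {
            "content_risk": {"type": "number", "minimum": 0.0, "maximum": 1.0},
            "flags": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "type": {"type": "string", "enum": FLAG_TYPES},
                        "severity": {"type": "string", "enum": SEVERITIES},
                        "location": {"type": "string"},
                        "explanation": {"type": "string"},
                    },
                    "required": ["type", "severity", "location", "explanation"],
                },
            },
            "summary": {"type": "string"},
            "review_recommended": {"type": "boolean"},
        },
        "required": ["content_risk", "flags", "summary", "review_recommended"],
    },
}


def check_review(obj: dict) -> dict:
    """Minimal local validation of a tool result; raises ValueError on violation."""
    missing = [k for k in REVIEW_TOOL["input_schema"]["required"] if k not in obj]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not 0.0 <= float(obj["content_risk"]) <= 1.0:
        raise ValueError("content_risk out of range")
    for flag in obj["flags"]:
        if flag.get("type") not in FLAG_TYPES or flag.get("severity") not in SEVERITIES:
            raise ValueError(f"bad flag: {flag}")
    return obj
```

The local check exists because schema constraints such as numeric bounds should be treated as advisory; validating before the result reaches the gate keeps a malformed review from silently reading as low risk.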

### 6.7 Fusion and decision policy (a gate, not an average)

```python
def decide(structural_trust, content, cfg):
    # structural_trust: calibrated P(clean) in [0,1]
    # content: dict from 6.6, or None if no Claude call was made
    risk      = 0.0   if content is None else content["content_risk"]
    review    = False if content is None else content["review_recommended"]
    high_flag = bool(content) and any(f["severity"] == "high" for f in content["flags"])

    if structural_trust < cfg.T_LOW or risk >= cfg.R_HIGH or high_flag:
        return "needs_human", build_reason(structural_trust, content)
    if structural_trust >= cfg.T_HIGH and risk <= cfg.R_LOW and not review:
        return "fast_lane", build_reason(structural_trust, content)
    return "normal_queue", build_reason(structural_trust, content)
```

- **Displayed score:** start from the calibrated structural P(clean), then penalize for content flags. A low structural score can never be lifted into fast-lane by clean-looking content (constraint 2).
- Thresholds are config. Set `T_HIGH` from calibration so the historical false-approval rate above it stays under your chosen budget. Write every decision to the DuckDB `scores` table.
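
Choosing `T_HIGH` from a false-approval budget can be sketched as a scan over held-out predictions. This is one possible reading, not the definitive rule: `choose_t_high` is a hypothetical helper, and it returns the lowest observed score at which the fraction of not-clean PRs at or above that score stays within the budget.

```python
def choose_t_high(probs, labels, budget):
    """Lowest threshold whose fast-laned set has false-approval rate <= budget.

    probs:  calibrated P(clean) on a held-out, later-in-time set.
    labels: matching clean_merge labels (1 = clean, 0 = not clean).
    Returns None if no threshold meets the budget.
    """
    pairs = sorted(zip(probs, labels), reverse=True)  # highest score first
    best, bad = None, 0
    for i, (p, y) in enumerate(pairs, start=1):
        bad += (y == 0)
        # Evaluate only at the end of a run of tied scores, since a threshold
        # cannot separate PRs with the same score.
        if i < len(pairs) and pairs[i][0] == p:
            continue
        if bad / i <= budget:
            best = p
    return best
```

For example, with scores `[0.95, 0.9, 0.8, 0.7, 0.6]`, labels `[1, 1, 1, 0, 1]`, and a 10% budget, the scan stops accepting at 0.8, because including the 0.7 PR would put one bad merge among four fast-laned PRs.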

### 6.8 Calibration

Hold out a time-based split. Fit isotonic regression (or Platt scaling) mapping the raw model score to an empirical P(clean). Report a reliability curve. The fast-lane threshold then corresponds to a concrete false-approval budget.
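
To make the isotonic step concrete, here is a dependency-free sketch of the pool-adjacent-violators algorithm. It is for illustration only: `fit_isotonic` and `calibrate` are hypothetical names, and in practice a library implementation such as scikit-learn's `IsotonicRegression` would likely replace it.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit; returns (thresholds, values) of a step map.

    thresholds[i] is the largest raw score in block i and values[i] is the
    empirical P(clean) of that block; values are non-decreasing.
    """
    blocks = []  # each block: [sum_of_labels, count, max_score]
    for s, y in sorted(zip(scores, labels)):
        if blocks and blocks[-1][2] == s:
            blocks[-1][0] += y  # pool tied scores into one block
            blocks[-1][1] += 1
        else:
            blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean exceeds this one's.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [b[2] for b in blocks], [b[0] / b[1] for b in blocks]


def calibrate(score, thresholds, values):
    """Map a raw score to the empirical P(clean) of the block it falls in."""
    i = bisect_left(thresholds, score)
    return values[min(i, len(values) - 1)]
```

Fit on the earlier part of the held-out data and draw the reliability curve on the later part, so the calibrator is itself evaluated on data it has not seen.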

### 6.9 Explainability

Emit a structured `explanation` per score: the top SHAP feature contributions (LightGBM) or the dominant EigenTrust path from the in-memory BFS ("vouched by trusted maintainers X, Y; 34 merged PRs; 0 reverts"), plus Claude's `summary` and any flags when a review ran. This is also what a voice layer would read aloud.

### 6.10 API and runtime

Run as plain Python processes.

- `GET /score/{did}` -> `{ calibrated_prob, structural_trust, content_risk?, decision, explanation, top_factors }`
- `POST /review/pr` -> body `{ diff, title, description, discussion }`, runs 6.6, returns the schema object.
- `GET /leaderboard` -> contributors ranked by calibrated_prob.
- `GET /metrics` -> aggregate JSON for the dashboard: score distribution, fast-lane rate, false-approval rate, vouch-graph stats, ingest lag. The UI (section 7) renders it; the API serves JSON only.
- A scoring worker (a separate process or a loop) picks up new PR records (poll the `events` table for unprocessed PRs, or have the ingester hand them off in-process), runs `decide(...)`, and writes results to `scores`. No message broker.
- Optionally cache hot scores in-process; no separate cache service.
- For a hosted demo, package the processes into a single container or run them on one small VM.

### 6.11 AT Proto-native output (stretch, but what the judges reward)

Give the service its own DID. Write each assessment back as a public record on its PDS (its own lexicon, referencing the PR's `at://` URI), so verdicts are auditable provenance on the network, not rows in a private file. Consume state from the firehose; emit state as records. This is the difference between a native ATProto integration and a tool that happens to read Tangled.

### 6.12 External data sources (additional signals)

All of these are public and either contribution-based or track-record-based, fetched on demand and cached. None requires probing a contributor or correlating identity. Each is a weak, advisory feature, never a determination.

Code-security and supply-chain (feed the content-risk signal in 6.6 and the gate in 6.7). This targets the malware half of the brief that the trust graph alone does not cover:

- Vulnerability databases: cross-reference every dependency a PR adds or bumps against OSV.dev, the GitHub Advisory Database, and NVD/CVE, through an index like deps.dev.
- Malicious-package and typosquat signals: flag dependencies that are newly published, low-download, or near-misses of popular names (the classic supply-chain shape), using registry publish age and download stats.
- Secret scanning on the diff (gitleaks or betterleaks) for leaked keys and credentials.
- SAST on the diff (Semgrep rules or CodeQL) for dangerous constructs.
- License data (SPDX) on added files and dependencies, for license violations.

Hand these machine findings to the Claude reviewer (6.6) as structured input, so it reasons over concrete evidence instead of judging code in a vacuum.
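
The typosquat signal and the "structured input" handoff can be sketched together. This is a sketch under stated assumptions: the dependency dict shape (`name`, `age_days`, `weekly_downloads`), the cutoffs of 30 days and 1000 downloads, and the function names are all hypothetical; registry lookups are assumed to have happened already.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def typosquat_findings(added_deps, popular, max_age_days=30, min_downloads=1000):
    """Return one finding per suspicious dependency, with the reasons listed."""
    findings = []
    for dep in added_deps:
        name, reasons = dep["name"], []
        if name not in popular:
            near = [p for p in popular if edit_distance(name, p) <= 1]
            if near:
                reasons.append(f"near-miss of {near[0]}")
        if dep["age_days"] < max_age_days:
            reasons.append("newly published")
        if dep["weekly_downloads"] < min_downloads:
            reasons.append("low download count")
        if reasons:
            findings.append({"dependency": name, "reasons": reasons})
    return findings
```

The returned list is the kind of structured evidence that can be serialized into the reviewer's input alongside the diff, so a flag such as `security` can cite a concrete finding.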

Verifiable track record (feed the structural features in 6.5). Strong and hard to fake, but use only links the contributor publicly declares; inferring an undeclared one is the deanonymization line in 6.13:

- Package-registry maintainer history: npm, PyPI, and crates.io tenure, publish history, and download scale for packages they maintain.
- OpenSSF Scorecard and repo-health metrics for repos they own.
- Commit signing: verified SSH/GPG or Sigstore signatures, for cryptographic attribution provenance.

ATmosphere identity depth (feed the structural and DID-provenance features in 6.5). Your best native sybil signal, because the DID is shared across apps:

- Participation across other AT Protocol apps under the same DID (blogs, Frontpage, Smoke Signal, and others), with the age and breadth of that footprint. A DID woven through the ATmosphere for years is expensive to fake; a fresh one tied to a single app is the attacker's profile.
- Verified links in the DID document: a domain-verified did:web, a DNS-verified handle, self-declared verified accounts.

Timezone consistency (a feature, not a location). Derive a coarse activity-timezone band from commit UTC offsets and posting times, which are already in the data, and use it only as a coherence check: a contributor whose declared context, vouch neighborhood, and commit timezone disagree is worth a second look. Never emit it as a location claim.

### 6.13 Provenance, jurisdiction, and repo tiering

The regulatory question is not "where is this contributor" but "is this contribution safe to trust," and jurisdiction, where it genuinely matters, comes from verification, not inference.

- Verified jurisdiction by assertion: a contributor-issued jurisdiction attestation (a signed record), a verified organizational DID with a known jurisdiction, or a domain-verified did:web on an organization domain. This is the only jurisdiction source a compliance reviewer accepts, and a VPN cannot defeat it. Inference clears neither bar (accuracy against a VPN, lawful use against non-consenting third parties), so the system does not attempt it.
- Repo tiering is the actual control, mirroring how export control works by controlling the artifact and the access rather than surveilling the person:
  - Public or civilian tier: open; the trust-graph triage in 6.7 is sufficient.
  - Sensitive or dual-use tier: a valid jurisdiction attestation is required before a contribution can be fast-laned or merged. A missing attestation forces `needs_human` regardless of structural trust or content risk.
- The weak hints in 6.12 (PDS host, DID method, handle TLD, locale, timezone) are fed to the model as features; none is treated as a jurisdiction determination.
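
The tier rule above could be layered on top of the 6.7 gate roughly as follows. This is a sketch under stated assumptions: `Tier`, `Attestation`, `apply_tier_gate`, and the `"fast_lane"` decision string are hypothetical names, and signature verification is assumed to have happened upstream. Only `needs_human` comes from this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Tier(Enum):
    PUBLIC = "public"        # public or civilian tier
    SENSITIVE = "sensitive"  # sensitive or dual-use tier


@dataclass(frozen=True)
class Attestation:
    jurisdiction: str
    signature_valid: bool  # assumption: verified before reaching this function


def apply_tier_gate(
    tier: Tier,
    gate_decision: str,
    attestation: Optional[Attestation],
) -> Tuple[str, Optional[str]]:
    """Override the gate decision for sensitive repos that lack a valid attestation.

    Returns (decision, reason). The reason is None when the tier rule did not fire.
    """
    missing = attestation is None or not attestation.signature_valid
    if tier is Tier.SENSITIVE and missing:
        # Structural trust and content risk are deliberately not consulted here.
        return "needs_human", "sensitive-tier repo: no valid jurisdiction attestation"
    return gate_decision, None
```

Note that the function takes no weak hints from 6.12 as input, so none of them can be mistaken for a jurisdiction determination at this step.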

This whole layer uses only what a contributor publicly declares or cryptographically asserts. The system never infers or correlates real-world identity or location (see the non-goal in section 8): no IP geolocation, no OSINT location-finding, no cross-platform profile matching, no fingerprinting, no stylometric deanonymization. That is both a legal constraint for an EU operator handling third-party personal data and a fit with the DID and pseudonymity model the platform rests on.

---

## 7. User-facing surfaces (UI)

The scoring service is the brain. Every UI is a thin client that reads the API (`/score`, `/leaderboard`, `/metrics`) and never touches the DuckDB file directly. Three surfaces ship as routes in your own SvelteKit app; one is a native overlay.

Frontend stack: SvelteKit + Svelte 5 with runes, shipped via `@sveltejs/adapter-node`. UI kit bits-ui; styling Lightning CSS with the six-layer cascade (`@layer reset, tokens, base, components, utilities, overrides`); icons unplugin-icons + iconify; charts layerchart; tables tanstack/table-core; toasts svelte-sonner; validation zod. Server state via tanstack query is optional at this scale.

### 7.1 Triage queue (the product, route `/`)

The maintainer's open PRs across their repos, grouped by decision into fast-lane, needs review, and flagged. Each row shows the contributor avatar and handle, the PR title with repo and number, the calibrated score as a pill colored by decision (success / warning / danger), and a one-line reason. Rows expand to the breakdown from the explanation object (6.9): the structural side (the EigenTrust path and top factors) and the content side (Claude's flags and summary). Render the list with tanstack/table-core, sortable and filterable by repo and bucket, with a metric-card strip on top (open, fast-lane, needs review, flagged). Per-row actions: approve a fast-lane row, or pull one into your review anyway. Approving can call Tangled's API to merge, or simply record the action. The decision and the reason come straight from the gate (6.7); the UI renders them, it does not decide.

### 7.2 Observability dashboard (route `/dashboard`, milestone M3.5)

The trust view and your demo backdrop, reading `/metrics`: a score-distribution histogram (layerchart), the fast-lane rate, the false-approval rate from the backtest, vouch-graph stats (contributors, edges, seed), and ingest lag. Keep operational telemetry off this screen. Events per second, API latency, and Claude call cost and latency go to Prometheus + Grafana from your self-hosted stack, and Langfuse can trace the Claude review calls for per-call eval.

### 7.3 Leaderboard (route `/leaderboard`)

Contributors ranked by calibrated trust, the playful nod to the Tangled push-leaderboard tradition. tanstack/table-core, sortable. Cheap to build and good demo candy.

### 7.4 Tangled-native overlay (stretch, the native surface)

A thin browser extension whose content script injects the trust hat and Claude's note onto tangled.org PR and contributor pages, reading the same `/score` API client-side. It is UI only; the brain stays in the service. Build it as a minimal content script with your TS toolchain (Bun build, oxlint and oxfmt, zod to parse the response). This lands inline placement without waiting on Tangled to merge anything. The upstream version, Tangled's own appview rendering third-party trust records natively, is the vision, not the build; ask Lewis whether the appview can render trust records authored by other DIDs.

Build placement: the triage queue and leaderboard land with M3 once `/score` exists, the dashboard with M3.5, and the extension overlay with M7.

---

## 8. Guardrails and non-goals

Do:
- Keep the structural signal sybil-resistant and load-bearing.
- Keep Claude blind to author identity; combine via the gate, not an average.
- Calibrate the score and tie the threshold to a false-approval budget.
- Confirm Tangled NSIDs from source and live stream; never hardcode guesses.
- Keep the stack serviceless and embedded: one DuckDB file on the external drive, plain processes; resumability comes from the Jetstream cursor plus the raw event log, not a message broker.
- Run Claude at temperature 0 with forced schema, and gate calls by cost.
- Keep the brain in the scoring service; the SvelteKit UI and the extension are thin clients that read the API, never the DuckDB file.
- Write every large artifact (the DuckDB file, venv, caches, staging, checkpoints, diffs) under `DATA_ROOT` on the external drive; never on the home or system disk, and fail fast if the drive is not mounted.
- For jurisdiction where it genuinely matters, require a contributor-issued attestation or verified DID, never an inference, and gate sensitive-tier repos on it (6.13).
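
One way to enforce the `DATA_ROOT` rule at process start is sketched below. It assumes `DATA_ROOT` is an environment variable, and it treats "same filesystem device as the home directory" as a proxy for "drive not mounted"; that device comparison is a heuristic of this sketch, not something the PRD prescribes. `resolve_data_root` and its parameters are hypothetical names.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_root(
    env: Mapping[str, str] = os.environ,
    system_path: Optional[Path] = None,
    require_separate_device: bool = True,
) -> Path:
    """Return DATA_ROOT as a Path, raising early if it looks unusable."""
    raw = env.get("DATA_ROOT")
    if not raw:
        raise RuntimeError("DATA_ROOT is not set")
    root = Path(raw)
    if not root.is_dir():
        raise RuntimeError(f"DATA_ROOT {root} does not exist; is the drive mounted?")
    if require_separate_device:
        # Heuristic: an unmounted drive leaves an empty mount-point directory
        # that lives on the same device as the home or system disk.
        reference = system_path if system_path is not None else Path.home()
        if root.stat().st_dev == reference.stat().st_dev:
            raise RuntimeError(f"DATA_ROOT {root} is on the same device as {reference}")
    return root
```

Calling this once before opening the DuckDB file, and deriving every artifact path from the returned value, keeps a missing drive from silently filling the system disk.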

Do not:
- Add a graph database. Edges are rows; graph compute is in-memory (SciPy / PyG).
- Add a message broker, a separate database server, or a managed cloud data service; the embedded store is enough at this scale.
- Train the GNN online; train it offline and serve inference in-process.
- Block, ban, or punish users; this system informs and routes only.
- Infer or correlate real-world identity or location: no IP geolocation, no OSINT location-finding, no cross-platform profile matching (LinkedIn and similar), no browser or network fingerprinting, no stylometric deanonymization. Use only what a contributor publicly declares or cryptographically asserts.
- Let clean content fast-lane an untrusted DID.
- Make denounces propagate transitively.
- Ship the GNN unless it beats the calibrated LightGBM baseline and is stable.

---

## 9. Deliverable

A running FastAPI scoring service backed by an embedded DuckDB store on the external drive, built from real Tangled data via Jetstream, fronted by a SvelteKit app with a triage queue, an observability dashboard, and a leaderboard, exposing calibrated and explained trust scores and fast-lane / human-review decisions, with EigenTrust + Claude working end to end (M4) before any GNN work. The browser-extension overlay onto Tangled PR pages is the stretch native surface. Include a short demo script that scores a few real contributors and shows one PR routed to a human with Claude's reason and one fast-laned on structural trust.