Monorepo for Tangled tangled.org

docs: init bobbin

Lewis: May this revision serve well! <lewis@tangled.org>

author
Lewis
committer
Tangled
date (Jun 18, 2026, 2:20 PM +0300) commit 1f3d6fdc parent e4346367 change-id rqmsotwt
+236
+236
docs/DOCS.md
···
 };
 ```

+# Bobbin
+
+Bobbin is an API appview for Tangled records. It serves XRPC
+endpoints for `sh.tangled.*`; with it you can get repos,
+issues, pulls, comments, follows, stars, labels, pipelines,
+and profiles. It is read-only and there is no auth, since that
+should all be handled direct-to-PDS and direct-to-knot respectively.
+
+**Bobbin has no permanent storage**.
+
+It is only a glorified edge index, in the graph theory
+sense. Additionally, it has a record cache, re-filled on
+demand. All other data that Bobbin serves comes live from
+PDSes & knots.
+
+## What Bobbin needs
+
+Bobbin pulls off being so stateless
+by moving state upstream.
+Primarily it depends on an instance of
+[Hydrant](https://tangled.org/did:plc:6v3ul2ptnqctyxwkz5ti4amn),
+which is the service that gives an event stream
+for Bobbin to quickly backfill from on every restart.
+Backfilling from an already-populated Hydrant ought to take
+a couple of minutes at most. If the upstream instance of Hydrant fails
+while Bobbin is live, Bobbin's list/count endpoints stop
+advancing and report a stale cursor. Single lookups
+will continue working, thanks to the second dependency:
+[Slingshot](https://tangled.org/did:plc:c7mc2fn47ihdihul4vjwsuy3/tree/main/slingshot).
+Slingshot fetches individual records & resolves identities.
+If the upstream instance of Slingshot fails, single lookups
+will fail with a `502` error. Some aggregation
+endpoints use Slingshot for hydrating, and those will also
+fail.
+
+A soft dependency that ought to exist for Bobbin to operate
+correctly is simply the plethora of knots that are out
+there, which Bobbin talks to directly for git data and, for
+knots at v1.15+, members & collaborators.
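The failure modes described for Hydrant and Slingshot can be summarised as a small decision table. The sketch below is only an illustration of that prose, not Bobbin's code: the names `Upstreams`, `Endpoint`, `Outcome`, and `expected_outcome` are hypothetical, and the split between plain and Slingshot-hydrated aggregation endpoints is an assumption drawn from the wording of the section.

```rust
/// Which upstream services are reachable right now (hypothetical model).
#[derive(Clone, Copy)]
pub struct Upstreams {
    pub hydrant_up: bool,
    pub slingshot_up: bool,
}

/// The endpoint families the section talks about.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Endpoint {
    /// Single-record lookups, served through Slingshot.
    SingleLookup,
    /// list*/count* endpoints answered from the edge index alone.
    Aggregation,
    /// Aggregation endpoints that hydrate results through Slingshot.
    HydratedAggregation,
}

/// What a caller should expect to see.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Answered normally.
    Fresh,
    /// Answered, but the index has stopped advancing (stale cursor).
    Stale,
    /// The request fails; single lookups surface this as HTTP 502.
    UpstreamFailed,
}

/// Map upstream health to the behaviour described in the text.
pub fn expected_outcome(up: Upstreams, endpoint: Endpoint) -> Outcome {
    match endpoint {
        // Single lookups only need Slingshot; Hydrant being down is irrelevant.
        Endpoint::SingleLookup => {
            if up.slingshot_up { Outcome::Fresh } else { Outcome::UpstreamFailed }
        }
        // Index-only aggregation keeps answering, but goes stale without Hydrant.
        Endpoint::Aggregation => {
            if up.hydrant_up { Outcome::Fresh } else { Outcome::Stale }
        }
        // Hydrated aggregation needs Slingshot first, then Hydrant for freshness.
        Endpoint::HydratedAggregation => {
            if !up.slingshot_up {
                Outcome::UpstreamFailed
            } else if up.hydrant_up {
                Outcome::Fresh
            } else {
                Outcome::Stale
            }
        }
    }
}
```

Written this way, a monitoring check can treat `Stale` as degraded rather than down, which matches the claim that a Hydrant outage costs freshness but not availability.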
+
+## Building Bobbin
+
+Bobbin lives in [Tangled's core monorepo, under bobbin/](https://tangled.org/did:plc:j5hmlfdrwkvtxm7cjmu7j2is/tree/master/bobbin).
+Here's an easy local debug build:
+
+```sh
+cargo build -p bobbin
+```
+
+Bobbin loves being in a container. When using
+`bobbin/containerfiles/bobbin.Containerfile`, it runs `cargo
+build --release --bin bobbin --package bobbin` within a
+little Debian runtime, exposing port 8090.
+
+## Configuration
+
+The best way to configure Bobbin is via a TOML config file.
+There's an `example.toml` in [Bobbin's subdir](https://tangled.org/did:plc:j5hmlfdrwkvtxm7cjmu7j2is/blob/master/bobbin/example.toml).
+Every value is overridable by a `BOBBIN_*` env var.
+Precedence, highest first, is env, then `--config <path>`, then
+`/etc/bobbin/config.toml`, then built-in defaults.
+
+Load and check a config without starting the server:
+
+```sh
+bobbin --config config.toml validate
+```
+
+A minimal config is the two upstream URLs. The Hydrant URL
+takes `ws://` or `wss://`. An `http://` or `https://`
+URL is rewritten to the matching websocket scheme at
+connection time.
+
+```toml
+[server]
+binds = ["127.0.0.1:8090"]
+
+# Loopback-only. Leave empty to disable debug introspection.
+debug_bind = "127.0.0.1:8091"
+
+[hydrant]
+url = "https://hydrant.example.com"
+
+[slingshot]
+url = "https://slingshot.example.com"
+```
+
+> 🦪 Lewis
+>
+> At time of writing, we (Tangled) don't host public
+> instances of Hydrant or Slingshot. You will have to
+> find public instances or spin these up yourself! :P
+
+Take a gander at the project's `example.toml` for an
+exhaustive list of things to configure.
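Two rules in the configuration section are easy to state as code: the layered lookup (env, `--config` file, `/etc/bobbin/config.toml`, built-in default) and the rewrite of `http(s)://` Hydrant URLs to `ws(s)://`. The following is a minimal sketch of both under stated assumptions. The function names `resolve_setting` and `to_websocket_url` are hypothetical, the real `BOBBIN_*` variable names are not given in the text so the layers are passed in as plain values, and Bobbin's actual implementation may differ.

```rust
/// Pick the value for one setting from the layers described in the text:
/// env var first, then the `--config` file, then `/etc/bobbin/config.toml`,
/// then the built-in default. Each layer is `None` when it does not set the value.
pub fn resolve_setting<'a>(
    env: Option<&'a str>,
    cli_file: Option<&'a str>,
    etc_file: Option<&'a str>,
    default: &'a str,
) -> &'a str {
    env.or(cli_file).or(etc_file).unwrap_or(default)
}

/// Rewrite an `http://` or `https://` URL to the matching websocket scheme.
/// URLs that already use `ws://` or `wss://` (or anything else) pass through unchanged.
pub fn to_websocket_url(url: &str) -> String {
    if let Some(rest) = url.strip_prefix("https://") {
        format!("wss://{}", rest)
    } else if let Some(rest) = url.strip_prefix("http://") {
        format!("ws://{}", rest)
    } else {
        url.to_string()
    }
}
```

For example, `to_websocket_url("https://hydrant.example.com")` yields `wss://hydrant.example.com`, which is why the sample config can carry a plain `https://` URL in its `[hydrant]` block.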
+
+You will discover fun things such as a configurable adaptive
+loop that watches the cgroup memory limit & throttles heavy
+requests under pressure. It only kicks in if a cgroup
+memory limit is detected. The config for that is in the
+`[backpressure]` block of the config template.
+
+## Running Bobbin
+
+Start the server using a config TOML:
+
+```bash
+bobbin --config config.toml
+```
+Bobbin wakes up in a cold sweat and immediately gets to
+work:
+1. It binds its listeners and connects to the Hydrant stream
+   in the background.
+2. It serves requests from the first
+   moment it's alive, even before the Hydrant stream connects
+   or finishes catching up. A cold Hydrant
+   costs only latency and approximate counts.
+
+## The API
+
+**Single lookups** take a record's AT-URI.
+
+- `getRepo` takes the repo URI:
+
+  ```sh
+  curl "$BOBBIN/xrpc/sh.tangled.repo.getRepo?repo=at://did:plc:boltless/sh.tangled.repo/squid"
+  ```
+  ```json
+  {
+    "uri": "at://did:plc:boltless/sh.tangled.repo/squid",
+    "cid": "bafyrei...",
+    "value": { "$type": "sh.tangled.repo", "knot": "knot1.tangled.sh", "description": "...", "createdAt": "..." }
+  }
+  ```
+
+- `getProfile` takes the full profile record URI, so a bare
+  handle or DID will not resolve:
+
+  ```sh
+  curl "$BOBBIN/xrpc/sh.tangled.actor.getProfile?actor=at://did:plc:boltless/sh.tangled.actor.profile/self"
+  ```
+
+- If Slingshot cannot serve the record, the response is `502`:
+
+  ```json
+  { "error": "UpstreamFailed", "message": "upstream unavailable: ..." }
+  ```
+
+**Aggregation** endpoints come in `list*` and `count*` pairs,
+each with a `*By` sibling, and require a `subject` query param.
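Single lookups take a full AT-URI, while an aggregation `subject` may need to be a bare DID depending on the collection, so a client can save itself a round trip by checking the shape of the identifier first. The sketch below shows one way to do that. It is a simplified illustration: `Subject` and `parse_subject` are hypothetical names, only DID authorities are accepted (handles are rejected), and it is not the validation Bobbin itself performs.

```rust
/// The two identifier shapes the API section distinguishes.
#[derive(Debug, PartialEq)]
pub enum Subject<'a> {
    /// A bare DID such as `did:plc:boltless`.
    BareDid(&'a str),
    /// A record AT-URI: `at://<did>/<collection>/<rkey>`.
    AtUri {
        did: &'a str,
        collection: &'a str,
        rkey: &'a str,
    },
}

/// Classify an identifier, returning `None` when it is neither a bare DID
/// nor a complete record AT-URI.
pub fn parse_subject<'a>(input: &'a str) -> Option<Subject<'a>> {
    if let Some(rest) = input.strip_prefix("at://") {
        // Split into authority, collection NSID, and record key.
        let mut parts = rest.splitn(3, '/');
        let did = parts.next()?;
        let collection = parts.next()?;
        let rkey = parts.next()?;
        if !did.starts_with("did:") || collection.is_empty() || rkey.is_empty() {
            return None;
        }
        Some(Subject::AtUri { did, collection, rkey })
    } else if input.starts_with("did:") {
        Some(Subject::BareDid(input))
    } else {
        None
    }
}
```

A caller can match on the result and refuse to send, say, an `AtUri` to an endpoint that keys on an owner DID, instead of waiting for the server to reject it.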
+
+- `listRepos` and `countRepos` key on the owner DID:
+
+  ```sh
+  curl "$BOBBIN/xrpc/sh.tangled.repo.countRepos?subject=did:plc:boltless"
+  ```
+  ```json
+  { "count": 7, "distinctAuthors": 1 }
+  ```
+
+  ```sh
+  curl "$BOBBIN/xrpc/sh.tangled.repo.listRepos?subject=did:plc:boltless&limit=3"
+  ```
+  ```json
+  { "items": [ { "uri": "at://did:plc:boltless/sh.tangled.repo/squid", "cid": "bafyrei...", "value": { } } ], "cursor": null }
+  ```
+
+- Bobbin validates the subject per collection. Here a repo URI
+  is passed where a bare DID is required, so the call returns a
+  `400`:
+
+  ```sh
+  curl "$BOBBIN/xrpc/sh.tangled.graph.listFollows?subject=at://did:plc:boltless/sh.tangled.repo/squid"
+  ```
+  ```json
+  { "error": "InvalidRequest", "message": "invalid request: subject must be a bare did, got at-uri with collection sh.tangled.repo" }
+  ```
+
+**Search** is a single endpoint over an in-memory full-text
+index:
+
+```sh
+curl "$BOBBIN/xrpc/sh.tangled.search.query?q=tangled&limit=2"
+```
+```json
+{ "hits": [ { "uri": "at://...", "cid": "...", "nsid": "sh.tangled.repo", "score": 27.1, "value": { } } ], "cursor": null }
+```
+
+**Git data** requests such as blob, tree, diff, log, and archive are proxied
+straight to the repo's knot and streamed back without caching.
+
+## Coverage and warm-up
+
+- While the edge index is catching up from Hydrant,
+  aggregation counts are a lower bound & may still climb.
+- One endpoint reports how far along the backfill is:
+
+  ```sh
+  curl "$BOBBIN/xrpc/sh.tangled.bobbin.getCoverage"
+  ```
+
+  While warming up:
+
+  ```json
+  { "ready": false, "eventsProcessed": 45588, "lastCursor": 51658 }
+  ```
+
+  Once caught up, Bobbin flips to ready:
+
+  ```json
+  { "ready": true, "eventsProcessed": 106085, "lastCursor": 116527 }
+  ```
+
+If you are starting up Hydrant for the first time, Hydrant itself
+will take a decent while (a couple of hours) to backfill
+from PDSes. Hydrant stores its backfill on disk. A Bobbin
+restart reaches `ready` in minutes by replaying events from
+an already-populated Hydrant. If your Hydrant is new, expect
+Bobbin to backfill in that same couple of hours that Hydrant
+takes.
+
+## Loose ends and not-gonna-impl
+
+- **No coverage signal for per-knot rosters yet.**
+  Coverage tracks the Hydrant stream only. For a v1.15 knot
+  that is unreachable, Bobbin serves a stale or empty member set
+  with nothing to flag it.
+- **Knot eventstream fan-out isn't pooled.**
+  Bobbin opens one websocket per v1.15
+  knot on top of the Hydrant subscription. A network with
+  thousands of knots wants pooling or a shared subscription.
+- **No sequential issue or PR numbers.** Bobbin returns rkeys,
+  not `#42` style ids like the web appview. A client
+  that wants a display number can derive it from creation order. But
+  why bother? rkeys are the IDs.
+
 # Hacking on Tangled

 We highly recommend [installing