tangled.org/core
Monorepo for Tangled
core/knotmirror/xrpc/gitea at 1a6e30dd0e4e77ad3f68cc1c824c59a544bbfffe
3 files
Seongmin Lee
knotmirror: performant language indexer
5w ago
817fd58d
batch.go
knotmirror: performant language indexer

`git.listLanguages` has been one of the methods that fails most often. Opening multiple git repos simultaneously can easily cause OOM, and language indexing itself usually takes a very long time. So several changes:

- use `gitea.CatFileBatch` instead of go-git to avoid OOM
- skip files larger than 16KB
- sync HEAD ref language stats in knotmirror db
- cache language stats info by commit (30d TTL)

When syncing HEAD ref language stats, we do the indexing in the background. KnotMirror maintains an internal "repo_stats_update" queue, and right after `doResync` is done it enqueues the language stat indexing job so we can pre-index the language stats of the HEAD ref. It's OK to spam this queue because all later events will eventually be ignored, as we resolve HEAD lazily.

Signed-off-by: Seongmin Lee <git@boltless.me>
1 month ago
blob.go
knotmirror: prefer `cat-file --buffer` for frequent endpoints

Ideally we should use this everywhere and completely remove the go-git dependency. go-git consumes a lot of memory for large repos because it loads the pack index into heap memory. This is the preferred approach of other Go-based git forges like Gitea. Much of the code is copied from the Gitea implementation and slightly modified to match tangled's data model.

Signed-off-by: Seongmin Lee <git@boltless.me>
1 month ago
gitea.go
knotmirror: prefer `cat-file --buffer` for frequent endpoints

Ideally we should use this everywhere and completely remove the go-git dependency. go-git consumes a lot of memory for large repos because it loads the pack index into heap memory. This is the preferred approach of other Go-based git forges like Gitea. Much of the code is copied from the Gitea implementation and slightly modified to match tangled's data model.

Signed-off-by: Seongmin Lee <git@boltless.me>
1 month ago