Now I have a clear picture. Here's what I found and where I see this going.
Current state of forges: The module fetches metadata for individual repos (FetchRepository, FetchTags) across GitHub, GitLab, Gitea/Forgejo, and Bitbucket. It has no org-level listing -- it doesn't know how to enumerate repos for an owner/org. But the underlying SDKs (go-github, gitea/sdk, gitlab client-go) all support listing org repos.
The gap: There's no way to go from "give me an org name" to "here are all the repos and their aggregated dependency data." The orgs/ directory is empty, and git-pkgs is entirely per-repo today -- one SQLite database per repository.
What forges would need to add: A ListRepositories(ctx, owner) method on the Forge interface (or a separate OrgForge interface to avoid breaking existing implementations). Each backend would page through the forge API and return []Repository.
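To make the shape of that addition concrete, here is a minimal sketch of the separate-interface option. It assumes a stand-in Repository struct (the real type in forges has more fields) and a hypothetical pageFetcher hook that each backend would wrap around its SDK's paginated list call. The "next page is 0 when done" convention mirrors what go-github reports in its response, but the other SDKs may need a small adapter.

```go
package main

import (
	"context"
	"fmt"
)

// Repository is a stand-in for the forges module's repository type;
// the real struct's fields are not shown in this discussion.
type Repository struct {
	FullName string
}

// OrgForge is the optional interface floated above. Backends that can
// enumerate an owner's repositories implement it; others stay untouched.
type OrgForge interface {
	ListRepositories(ctx context.Context, owner string) ([]Repository, error)
}

// pageFetcher is a hypothetical per-backend hook: it returns one page of
// results plus the next page number (0 when there are no more pages).
type pageFetcher func(ctx context.Context, owner string, page int) ([]Repository, int, error)

// listAll pages through a forge API until the backend reports no next page.
func listAll(ctx context.Context, owner string, fetch pageFetcher) ([]Repository, error) {
	var all []Repository
	for page := 1; page != 0; {
		if err := ctx.Err(); err != nil {
			return nil, err
		}
		repos, next, err := fetch(ctx, owner, page)
		if err != nil {
			return nil, fmt.Errorf("list repositories for %s (page %d): %w", owner, page, err)
		}
		all = append(all, repos...)
		page = next
	}
	return all, nil
}

// fakeForge is an in-memory backend used only to exercise the sketch.
type fakeForge struct {
	pages map[int][]Repository
}

func (f fakeForge) ListRepositories(ctx context.Context, owner string) ([]Repository, error) {
	return listAll(ctx, owner, func(_ context.Context, _ string, page int) ([]Repository, int, error) {
		next := 0
		if _, ok := f.pages[page+1]; ok {
			next = page + 1
		}
		return f.pages[page], next, nil
	})
}

// repoNames is a small convenience wrapper that flattens the result to names.
func repoNames(forge OrgForge, owner string) ([]string, error) {
	repos, err := forge.ListRepositories(context.Background(), owner)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(repos))
	for _, r := range repos {
		names = append(names, r.FullName)
	}
	return names, nil
}

func main() {
	forge := fakeForge{pages: map[int][]Repository{
		1: {{FullName: "acme/api"}, {FullName: "acme/web"}},
		2: {{FullName: "acme/cli"}},
	}}
	names, err := repoNames(forge, "acme")
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Keeping OrgForge separate means callers can type-assert for it and fall back gracefully on backends that have not implemented listing yet.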
What the orgs module could then do:
- Call forges.ListRepositories to get all repos for an org
- Clone (shallow) or open each repo locally
- Run git-pkgs indexing per repo (or read existing databases if already indexed)
- Aggregate into org-level views
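The steps above could be wired together roughly like this. Everything here is a hypothetical seam rather than existing API: Lister stands in for forges.ListRepositories, Indexer stands in for "clone or open the repo and run git-pkgs indexing", and Dependency is an assumed flattened row. The one design choice the sketch commits to is that a failure on one repo is recorded and skipped instead of aborting the whole org run.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Dependency is an assumed, flattened view of one row in a repo's database.
type Dependency struct {
	Ecosystem, Name, Version string
}

// Lister and Indexer are hypothetical seams for the listing and indexing steps.
type Lister func(ctx context.Context, owner string) ([]string, error)
type Indexer func(ctx context.Context, repo string) ([]Dependency, error)

// OrgIndex maps a repo's full name to the dependencies found in it.
type OrgIndex map[string][]Dependency

// IndexOrg lists an owner's repos and indexes each one. Per-repo failures
// are collected and returned alongside the partial result.
func IndexOrg(ctx context.Context, owner string, list Lister, index Indexer) (OrgIndex, error) {
	repos, err := list(ctx, owner)
	if err != nil {
		return nil, fmt.Errorf("list repos for %s: %w", owner, err)
	}
	out := make(OrgIndex, len(repos))
	var errs []error
	for _, repo := range repos {
		deps, err := index(ctx, repo)
		if err != nil {
			errs = append(errs, fmt.Errorf("index %s: %w", repo, err))
			continue
		}
		out[repo] = deps
	}
	// errors.Join returns nil when no errors were collected.
	return out, errors.Join(errs...)
}

// stubList and stubIndex are in-memory stand-ins used only for the demo.
func stubList(_ context.Context, owner string) ([]string, error) {
	return []string{owner + "/api", owner + "/web", owner + "/broken"}, nil
}

func stubIndex(_ context.Context, repo string) ([]Dependency, error) {
	if repo == "acme/broken" {
		return nil, errors.New("clone failed")
	}
	return []Dependency{{Ecosystem: "npm", Name: "lodash", Version: "4.17.21"}}, nil
}

// runDemo runs the pipeline against the stubs for a made-up "acme" org.
func runDemo() (OrgIndex, error) {
	return IndexOrg(context.Background(), "acme", stubList, stubIndex)
}

func main() {
	idx, err := runDemo()
	fmt.Println("indexed repos:", len(idx))
	if err != nil {
		fmt.Println("partial failures:", err)
	}
}
```

A real version would probably index repos concurrently with a bounded worker pool, but the sequential loop keeps the control flow easy to see.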
The interesting org-level insights, pulling from the roadmap and the existing stats structure:
- Shared dependencies -- which packages appear across multiple repos, and at what versions
- "Which repos use package X" -- the inverse query, already called out in the roadmap
- Vulnerability exposure -- how many repos are affected by a given CVE
- License compliance -- org-wide license inventory, flag repos with problematic licenses
- Ecosystem breakdown -- how many npm vs gem vs pypi dependencies org-wide
- Staleness -- which repos have the most outdated deps
- Dependency churn -- which repos have the most dependency activity
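Several of these views fall out of one in-memory structure once the per-repo data is loaded. The sketch below covers the first two bullets (shared dependencies with their versions, and the inverse "which repos use package X" query). The Dependency, PackageKey, and Usage types are assumptions made for illustration; in practice this would more likely be a SQL query across the per-repo SQLite databases.

```go
package main

import (
	"fmt"
	"sort"
)

// Dependency is an assumed, flattened view of one row in a repo's database.
type Dependency struct {
	Ecosystem, Name, Version string
}

// PackageKey identifies a package independent of its version.
type PackageKey struct {
	Ecosystem, Name string
}

// Usage maps package -> version -> repos that depend on that version.
type Usage map[PackageKey]map[string][]string

// BuildUsage inverts a repo -> dependencies map into a Usage index.
func BuildUsage(org map[string][]Dependency) Usage {
	u := Usage{}
	for repo, deps := range org {
		for _, d := range deps {
			k := PackageKey{Ecosystem: d.Ecosystem, Name: d.Name}
			if u[k] == nil {
				u[k] = map[string][]string{}
			}
			u[k][d.Version] = append(u[k][d.Version], repo)
		}
	}
	return u
}

// ReposUsing answers "which repos use package X": a sorted, de-duplicated
// list of repos that depend on the package at any version.
func (u Usage) ReposUsing(k PackageKey) []string {
	seen := map[string]bool{}
	var out []string
	for _, repos := range u[k] {
		for _, r := range repos {
			if !seen[r] {
				seen[r] = true
				out = append(out, r)
			}
		}
	}
	sort.Strings(out)
	return out
}

// Shared returns packages used by at least minRepos repos, sorted by
// ecosystem and then name.
func (u Usage) Shared(minRepos int) []PackageKey {
	var out []PackageKey
	for k := range u {
		if len(u.ReposUsing(k)) >= minRepos {
			out = append(out, k)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Ecosystem != out[j].Ecosystem {
			return out[i].Ecosystem < out[j].Ecosystem
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	u := BuildUsage(map[string][]Dependency{
		"acme/api": {
			{Ecosystem: "npm", Name: "lodash", Version: "4.17.21"},
			{Ecosystem: "gem", Name: "rails", Version: "7.1.0"},
		},
		"acme/web": {
			{Ecosystem: "npm", Name: "lodash", Version: "4.17.20"},
		},
	})
	for _, k := range u.Shared(2) {
		fmt.Println(k.Ecosystem, k.Name, "used by", u.ReposUsing(k))
	}
}
```

Keeping the version level in the index is what lets the shared-dependency view also surface version drift across repos.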
A few design questions before going further: