take a look at plan.md, it was a conversation, i'd like to turn it into more of a concrete plan, first read it and tell me what you think
This is a really well-developed concept that emerged organically from the conversation. Here's what I see as the core architecture:
The system: A fleet of specialized AI bots, each with their own Forgejo account and working directory, maintaining forks of critical neglected open source packages. They coordinate entirely through the forge (issues, PRs, labels, comments) the same way human developers would.
The bots:
Key design decisions already made:
What's strong: The "forge as orchestrator" idea eliminates custom infrastructure. The one-way wall is politically smart. The audit-first approach generates useful data even before fixes land. Application canaries as the real compatibility test is the thing that would actually build trust.
What's underspecified: How each bot actually runs (the Claude Code loop mechanics, context management, cost control), the persona prompts themselves, what "critical and neglected" means concretely (thresholds from ecosyste.ms), the Forgejo instance setup, and the order you'd build this in.
What I'd push back on slightly: The conversation got enthusiastic about scale (500 packages, cross-ecosystem) before nailing the single-package prototype. The concrete plan should probably start with one package and two or three bots, prove the interaction patterns work, then expand.
Want me to turn this into an actual buildable plan? I'd structure it around phases - starting with infrastructure setup and a minimal prototype, then expanding bot roles and package count.
yes, need a list of todos including expanding on the bot tasks and personas, will start by running things locally on my laptop
Let me check what else is in the repo to understand what's already been started.
Just the plan file. Let me write this up.
Done. The plan keeps the spirit of the conversation but restructures it into something you can actually work from. The bot personas each have a personality, pronouns, and a concrete task list. The infrastructure and prototype sections are ordered by what you'd do first.
A few things I made calls on that you might want to change:
now make a bots folder and make a file for each one
Six bot persona files in bots/. Each one has role, personality, first-pass audit tasks, ongoing tasks, labels, and review stance — enough to use as a system prompt when running the bot.
Now let me think through the skills. These are the capabilities the bots need to do their work — some shared, some role-specific.
Here's the structure:
skills/
  shared/
    forgejo.md - API/CLI interaction with the forge (all bots)
    git.md - branching, committing, pushing (all bots)
    devlog.md - maintaining the running work log (all bots)
    standup.md - async standup participation (all bots)
    ecosystems.md - querying ecosyste.ms for package data (all bots)
    code-editing.md - reading and modifying source code (all except forkbot)
  forkbot/
    repo-management.md - forking, scaffolding, labels, registry publishing
    package-evaluation.md - deciding what to fork based on criticality × neglect
  securibot/
    vulnerability-scanning.md - CVE scanning, OSV queries, audit tools
    signing-and-provenance.md - sigstore, trusted publishing, SBOMs
  depbot/
    dependency-analysis.md - tree mapping, maintenance signals, upgrade impact
  testbot/
    test-runner.md - running suites, downstream consumers, canary apps
    test-generation.md - generating tests from usage patterns and coverage gaps
  docbot/
    docs-generation.md - API docs, changelogs, migration guides from code
    metadata-maintenance.md - CITATION.cff, FUNDING.yml, REUSE compliance
  perfbot/
    performance-analysis.md - anti-patterns, benchmarking, bloat detection
Each skill file describes what the capability is, how the bot uses it, and what tools or approaches it relies on. The shared skills are things every bot needs. The bot-specific skills are the domain expertise that makes each bot's perspective distinct.
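To make "criticality × neglect" from package-evaluation.md a bit more concrete, here is a rough sketch of one way forkbot could score a candidate. The plan only says the two factors are multiplied; the function name, the log scaling of dependents, and measuring neglect as years since the last release are all my assumptions, not something the plan specifies.

```python
import math
from datetime import date


def neglect_score(dependents: int, last_release: date, today: date) -> float:
    """Hypothetical score: log-scaled dependents times years since last release.

    Both the log10 scaling and the use of release age as the neglect signal
    are illustrative assumptions, not thresholds taken from the plan.
    """
    # Clamp at zero so a release dated in the future never yields a negative score.
    years_stale = max((today - last_release).days, 0) / 365.25
    # +1 keeps log10 defined for packages with no dependents.
    criticality = math.log10(dependents + 1)
    return criticality * years_stale
```

Used like `neglect_score(4231, date(2019, 8, 1), date.today())`, it rewards packages that are both widely depended on and long unreleased. A package with zero dependents or a release today scores zero, which seems like the right floor.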
is there a license bot?
Not currently — docbot handles the metadata side (checking license files exist, REUSE compliance, license headers) and forkbot checks license compatibility before forking. But those are both surface-level checks.
A dedicated license bot could go deeper: verifying every file has correct headers, checking that dependency licenses are compatible with each other, flagging license changes in upstream updates, generating proper SPDX expressions, catching the subtle stuff like "this file was copied from a GPL project into an MIT package."
Want me to add one? It's the kind of thing where the boring-but-useful version (headers, compliance, REUSE) is easy to get right, and the harder version (compatibility analysis across the dependency tree) is where it gets genuinely valuable.
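For a sense of what the compatibility side might look like at its very simplest, here is a sketch that flags copyleft dependencies of a permissively licensed package for review. The two license sets and the function name are hypothetical, and this is a deliberately crude policy check, not legal analysis; a real licensebot would need to parse full SPDX expressions and handle unknown licenses.

```python
from typing import Dict, List

# Illustrative, incomplete sets of SPDX identifiers (assumptions for this sketch).
COPYLEFT = {
    "GPL-2.0-only", "GPL-2.0-or-later",
    "GPL-3.0-only", "GPL-3.0-or-later",
    "AGPL-3.0-only", "AGPL-3.0-or-later",
}
PERMISSIVE = {"MIT", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0", "ISC"}


def flag_for_license_review(package_license: str,
                            dependency_licenses: Dict[str, str]) -> List[str]:
    """Return dependency names whose license deserves a closer look.

    Only handles one case: a permissively licensed package that depends on
    something under a license in the COPYLEFT set above.
    """
    if package_license not in PERMISSIVE:
        return []
    return sorted(
        name for name, spdx in dependency_licenses.items() if spdx in COPYLEFT
    )
```

The output is a list of names to open issues about, not a verdict. Whether a flagged dependency is actually a problem depends on how it is used and distributed.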
yeah
Now let me pull the license/REUSE stuff out of docbot since licensebot owns that now.
And update plan.md to include licensebot.
Done. Licensebot gets his own persona file and a license-compliance skill. Pulled the REUSE/license stuff out of docbot's tasks and metadata skill so there's no overlap. Docbot still handles CITATION.cff, FUNDING.yml, and general metadata — licensebot owns everything about actual license correctness and compliance.
any other bots we might need?
Looking at the current lineup, the main gap I see is CI/build infrastructure. Forkbot sets up initial CI scaffolding, but nobody owns it after that. Who makes sure the package builds across multiple runtime versions? Who updates the CI config when a GitHub Action is deprecated? Who notices that the build matrix should include Node 22 now?
A buildbot could own:
Beyond that, I think the current seven cover it well. You could argue for an i18n bot or an accessibility bot but those are niche enough that they'd be idle most of the time on backend/library packages.
The other direction is consolidation — perfbot is the weakest of the bunch. Its tasks overlap with testbot (benchmarking) and depbot (unnecessary dependencies). If you wanted to stay lean for the prototype you could fold perfbot's concerns into those two and add it later if the gap shows up in practice.
So: maybe add buildbot, maybe drop perfbot for now. Or keep the roster as-is and see what the prototype tells you. Want me to create buildbot?
good idea add build bot, keep perfbot
Done. Eight bots now:
devlogs will be done by opening issues on a shared repo, and also team meeting notes on an issue with comments
Let me update the devlog skill and the bot personas to reflect that.
Now update the standup skill to match.
Now update the plan.md infrastructure section to include the shared repos.
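As a sketch of what "devlog as an issue" could mean in practice, the snippet below builds (but does not send) the HTTP request a bot would make to open a devlog entry on the shared repo. It uses the Forgejo REST endpoint for creating issues; the title format and every argument value you pass in are hypothetical choices, not something the plan has fixed yet.

```python
import json
from urllib.request import Request


def devlog_issue_request(base_url: str, owner: str, repo: str, token: str,
                         bot: str, day: str, summary: str) -> Request:
    """Build a POST request that opens one devlog entry as a Forgejo issue.

    The "[bot] devlog YYYY-MM-DD" title convention is an assumption made
    for this sketch.
    """
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/issues"
    payload = {"title": f"[{bot}] devlog {day}", "body": summary}
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Forgejo accepts personal access tokens in this header form.
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would actually create the issue. Keeping construction separate from sending makes the request easy to inspect or log before anything touches the forge.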
Good. Now let me address your feedback. Several good points — let me handle them.
Good feedback. Let me add licensebot and buildbot to the bot summaries in plan.md, add the cold-start cycle instructions, and add priority heuristics per bot.
Now add the cold-start cycle to each bot's persona file. Let me do them all in parallel.
Let me add the cycle instructions to each bot persona. I'll add it as a section after the role description in each file.
Done. Three things updated:
plan.md now has licensebot and buildbot persona summaries, plus a "Bot cycle" section with the cold-start sequence and per-bot priority heuristics.
Every bot persona file now has a "Cycle start" section with the exact steps to follow at the beginning of each run — read devlog, read tracking issues, check standup, check labelled issues, prioritise, work, write devlog, post standup.
Devlogs and standups are now issues on shared repos rather than files, which was your earlier ask. The devlog repo and standup repo are both in the infrastructure checklist.
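The cycle-start sequence is just a fixed ordering of steps, so it can be sketched as a small runner. The step names mirror the list in the persona files; the handler interface and the shared `state` dict are assumptions for illustration, and the real bots follow these steps from their prompts rather than from a script like this.

```python
from typing import Callable, Dict, List

# Step order taken from the "Cycle start" section of each persona file.
CYCLE_STEPS: List[str] = [
    "read_devlog",
    "read_tracking_issues",
    "check_standup",
    "check_labelled_issues",
    "prioritise",
    "work",
    "write_devlog",
    "post_standup",
]


def run_cycle(handlers: Dict[str, Callable[[dict], None]]) -> List[str]:
    """Run every step in order, passing a shared state dict between them.

    Raises KeyError if a step has no handler, so a half-configured bot
    fails before doing any work rather than skipping a step silently.
    """
    missing = [step for step in CYCLE_STEPS if step not in handlers]
    if missing:
        raise KeyError(f"no handler registered for steps: {missing}")
    state: dict = {}
    completed: List[str] = []
    for step in CYCLE_STEPS:
        handlers[step](state)
        completed.append(step)
    return completed
```

Checking for missing handlers up front means the bot never writes a devlog for a cycle in which it skipped reading the previous one.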
The feedback about starting with Ruby — I'd resolve that open question now and just pick Ruby. You'll spot bad output immediately and you know which gems are critical-but-neglected.
we'll use packages.ecosyste.ms api to find critical but neglected ruby projects
Let me resolve that open question in the plan.
Now let me update the ecosystems skill to be specific about how we'll use the API.
Now let me check what the API actually looks like so the skill doc is accurate.
There's a /critical endpoint and a critical boolean filter. Let me update the skill doc.
Updated. The ecosystems skill now documents the actual API including the critical boolean filter and the /critical and /critical/sole_maintainers endpoints. The candidate-finding flow starts with critical=true on rubygems, sorted by dependents, then filters for staleness — which should surface exactly the kind of packages botberg is designed for.
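Here is a minimal sketch of that candidate-finding flow: build the query URL, then filter the returned records for staleness. The `critical` filter comes from the API itself; the `sort`, `order`, and `per_page` parameter names and the `latest_release_published_at` field are my assumptions about the response shape and should be checked against the live API docs. The two-year cutoff is also just a placeholder.

```python
from datetime import datetime, timedelta
from typing import Dict, List
from urllib.parse import urlencode

BASE = "https://packages.ecosyste.ms/api/v1"


def critical_packages_url(registry: str = "rubygems.org", per_page: int = 100) -> str:
    """URL for critical packages on one registry, most-depended-on first."""
    query = urlencode({
        "critical": "true",
        "sort": "dependent_packages_count",  # assumed sort key
        "order": "desc",
        "per_page": per_page,
    })
    return f"{BASE}/registries/{registry}/packages?{query}"


def stale_packages(packages: List[Dict], now: datetime,
                   min_years: float = 2.0) -> List[Dict]:
    """Keep packages whose latest release is at least min_years old.

    `now` must be timezone-aware. Records without a release date are skipped.
    """
    cutoff = now - timedelta(days=365.25 * min_years)
    stale = []
    for pkg in packages:
        published = pkg.get("latest_release_published_at")  # assumed field name
        if not published:
            continue
        # Normalise a trailing "Z" so older Python versions can parse it.
        released = datetime.fromisoformat(published.replace("Z", "+00:00"))
        if released <= cutoff:
            stale.append(pkg)
    return stale
```

Fetching the URL and decoding the JSON is left out on purpose, so the filtering logic can be tested against saved responses without hitting the network.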
let's try querying that api to find some candidates
Some of these are actively maintained. Let me filter for the stale ones — packages where the last release is old.
Now those are interesting. Let me get more detail on the top candidates.
Here's what stands out:
| Package | Deps | Last Release | Open Issues | Notes |
|---|---|---|---|---|
| simplecov | 13,481 | Dec 2022 | 142 | Some commit activity but no release in 3+ years. 7 maintainers, MIT. Good size. |
| guard-rspec | 4,640 | Jul 2016 | 49 | Nearly 10 years without a release. Last commit 2021. Effectively dead. |
| rest-client | 4,231 | Aug 2019 | 139 | No release in 6+ years. 453M downloads. Classic neglected critical package. |
| eventmachine | 1,267 | May 2018 | 203 | 8 years stale, 487K dependent repos, 203 open issues. The poster child for neglected infrastructure. |
| awesome_print | 1,840 | Mar 2021 | 108 | Stale but more of a dev tool than critical infrastructure. |
| colorize | 2,234 | Jun 2023 | 5 | GPL-2.0 — license would complicate things for many consumers. Sole maintainer though. |
For a first botberg target, simplecov or rest-client seem best. Simplecov has the most dependents by far and enough open issues to give every bot real work. Rest-client is more stale but the codebase is probably messier after 6 years of drift.
Simplecov might be the ideal first package — MIT license, huge dependent count, active enough that someone might actually notice the fork, but clearly stalled on releases despite having maintainers.
What do you think? Or want me to dig deeper into any of these?
let's go with simplecov
Good pick. 13K dependents, 142 open issues, no release in over three years despite having 7 listed maintainers. Plenty for every bot to sink their teeth into.
Next step is getting Forgejo running locally and setting up the accounts. Want to start on that?
yeah let's set up forgejo via docker
Docker's running. Let me set up a docker-compose file for Forgejo and start a config log.
different port please, i use 3000 and 4000
Mapped to port 5000 now. Ready to start it?
5000 is taken too. Let me check what's free.
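One way to check what's free is to try binding each candidate port and keep the first that succeeds. This is a sketch only: the function name and candidate range are hypothetical, and a successful bind just means nothing was listening on that address at that instant.

```python
import socket
from typing import Iterable, Optional


def first_free_port(candidates: Iterable[int],
                    host: str = "0.0.0.0") -> Optional[int]:
    """Return the first candidate port that can be bound on host, else None."""
    for port in candidates:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is in use (or otherwise unavailable); try the next one.
                continue
            return port
    return None
```

Something like `first_free_port(range(5001, 5100))` would skip past 5000 and give a port to put in the compose file's host-side mapping. The socket is closed again before returning, so there is a small window in which another process could still grab the port.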