Implement the following plan:
Store git objects and refs in Postgres tables. Two implementations: pure SQL (PL/pgSQL functions, works on managed Postgres) and a C extension (native types, faster parsing). A libgit2-based bridge program registers Postgres as the storage backend so standard git push/fetch/clone work against the database.
gitgres/
    Makefile                      # top-level, delegates to sub-makes
    sql/
        schema.sql                # tables, indexes
        functions/
            object_hash.sql
            object_read_write.sql
            tree_parse.sql
            commit_parse.sql
            ref_manage.sql
        views/
            queryable.sql         # materialized views
    ext/                          # Postgres C extension
        Makefile                  # PGXS
        gitgres.c                 # PG_MODULE_MAGIC, init
        git_oid_type.c            # 20-byte OID type with operators
        sha1_hash.c               # C SHA1 via OpenSSL
        tree_parse.c              # C tree entry parser
        gitgres.control
        sql/gitgres--0.1.sql      # CREATE TYPE, CREATE FUNCTION
    backend/                      # libgit2 pluggable backend
        Makefile
        main.c                    # CLI entry point (serves repos)
        odb_postgres.c            # git_odb_backend implementation
        odb_postgres.h
        refdb_postgres.c          # git_refdb_backend implementation
        refdb_postgres.h
        writepack_postgres.c      # packfile receive handling
    import/
        gitgres-import.sh         # shell script: git plumbing + psql
    test/
        test_helper.rb
        schema_test.rb
        object_hash_test.rb
        object_store_test.rb
        tree_parse_test.rb
        commit_parse_test.rb
        ref_test.rb
        roundtrip_test.rb         # push then clone, diff working trees
Depends on pgcrypto (available on all managed Postgres). OIDs stored as bytea (20 bytes raw).
git_object_hash(type, content) -- SHA1 of <type> <size>\0<content> using pgcrypto digest()
git_object_write(repo_id, type, content) -- compute hash, INSERT ON CONFLICT DO NOTHING, return oid
git_object_read(repo_id, oid) -- return (type, size, content)
git_object_read_prefix(repo_id, prefix, prefix_len) -- abbreviated OID lookup
git_tree_entries(content) -- parse binary tree into (mode, name, entry_oid) rows
git_commit_parse(content) -- parse commit into tree_oid, parent_oids[], author fields, committer fields, message
git_ref_update(repo_id, name, new_oid, old_oid, force) -- compare-and-swap with SELECT FOR UPDATE
git_ls_tree_r(repo_id, tree_oid, prefix) -- recursive tree walk returning (mode, path, oid, type)
This is the bridge. Implement git_odb_backend and git_refdb_backend structs that store/retrieve from Postgres via libpq. libgit2 handles all protocol work (pack negotiation, delta resolution, ref advertisement).
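As a reference point for the git_object_hash function listed above, here is a minimal Ruby sketch of the same computation outside Postgres: SHA1 over the header "<type> <size>\0" followed by the content. The method name git_object_hash_hex and the GIT_TYPE_NAMES table are illustrative, and the integer type codes are assumed to follow git's own object type numbering (3 = blob), which the plan implies but does not spell out.

```ruby
require "digest"

# Assumed mapping from integer type codes to header names, following
# git's object type numbering (1 = commit, 2 = tree, 3 = blob, 4 = tag).
GIT_TYPE_NAMES = { 1 => "commit", 2 => "tree", 3 => "blob", 4 => "tag" }.freeze

# Return the 40-character hex object ID for a type code and raw content.
def git_object_hash_hex(type, content)
  name = GIT_TYPE_NAMES.fetch(type)          # raises KeyError on unknown type
  body = content.b                           # treat the content as raw bytes
  header = "#{name} #{body.bytesize}\0".b    # "<type> <size>" plus a NUL byte
  Digest::SHA1.hexdigest(header + body)
end
```

With this sketch, git_object_hash_hex(3, "") gives e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the ID git reports for an empty blob, so it can serve as an independent cross-check for the SQL function in tests.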
odb_postgres.c
Implements these callbacks from git_odb_backend:
read(data, len, type, backend, oid) -- SELECT type, size, content FROM objects WHERE repo_id=$1 AND oid=$2
read_header(len, type, backend, oid) -- SELECT type, size FROM objects WHERE ...
read_prefix(full_oid, data, len, type, backend, short_oid, prefix_len) -- prefix match on oid column
write(backend, oid, data, len, type) -- INSERT INTO objects ... ON CONFLICT DO NOTHING
exists(backend, oid) -- SELECT 1 FROM objects WHERE ...
exists_prefix(full_oid, backend, short_oid, prefix_len) -- prefix existence check
foreach(backend, cb, payload) -- SELECT oid FROM objects WHERE repo_id=$1, call cb for each
writepack(writepack_out, backend, odb, progress_cb, payload) -- return a git_odb_writepack that accumulates pack bytes, then on commit delegates to libgit2's indexer to extract objects and calls write for each
free(backend) -- close PG connection
The backend struct holds a PGconn* and the repo_id.
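The prefix callbacks above need to match an abbreviated OID against 20-byte values, and a prefix with an odd number of hex digits does not fall on a byte boundary. One possible approach, sketched here in Ruby over an in-memory list, is to turn the prefix into a lower and upper bound and do a range comparison (which in SQL could become oid >= $low AND oid <= $high). This is an assumption about how the match might be done, not what the plan specifies; oid_prefix_bounds and resolve_prefix are hypothetical names.

```ruby
OID_HEX_LEN = 40

# Turn a hex prefix (1..40 digits, odd lengths allowed) into the smallest
# and largest 20-byte OIDs that start with it.
def oid_prefix_bounds(hex_prefix)
  unless hex_prefix.match?(/\A[0-9a-f]{1,40}\z/)
    raise ArgumentError, "invalid hex prefix: #{hex_prefix.inspect}"
  end
  low  = [hex_prefix.ljust(OID_HEX_LEN, "0")].pack("H*")
  high = [hex_prefix.ljust(OID_HEX_LEN, "f")].pack("H*")
  [low, high]
end

# Resolve a prefix against a list of raw 20-byte OIDs.
# Returns the matching OID, :not_found, or :ambiguous.
def resolve_prefix(oids, hex_prefix)
  low, high = oid_prefix_bounds(hex_prefix)
  matches = oids.select { |oid| oid >= low && oid <= high }  # bytewise compare
  case matches.size
  when 0 then :not_found
  when 1 then matches.first
  else :ambiguous
  end
end
```

The three outcomes matter because libgit2 distinguishes a missing object from an ambiguous abbreviation (GIT_ENOTFOUND versus GIT_EAMBIGUOUS), so a backend would likely need to detect more than one match rather than return the first.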
refdb_postgres.c
Implements these callbacks from git_refdb_backend:
exists(exists_out, backend, ref_name) -- SELECT 1 FROM refs WHERE ...
lookup(ref_out, backend, ref_name) -- SELECT oid, symbolic FROM refs WHERE ..., construct git_reference
iterator(iter_out, backend, glob) -- SELECT name, oid, symbolic FROM refs WHERE name LIKE ..., return custom iterator
write(backend, ref, force, who, message, old_id, old_target) -- CAS update using a transaction with SELECT FOR UPDATE
rename(ref_out, backend, old_name, new_name, force, who, message) -- UPDATE refs SET name=$1
del(backend, ref_name, old_id, old_target) -- DELETE with CAS check
has_log / ensure_log / reflog_read / reflog_write / reflog_rename / reflog_delete -- operate on the reflog table
lock / unlock -- use Postgres advisory locks (pg_advisory_xact_lock)
free(backend) -- close connection
main.c
A program that serves one or more repos. For testing, it can run as a one-shot helper that git invokes. Longer term it could serve HTTP smart protocol or the git:// protocol.
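The compare-and-swap rule behind the refdb write callback listed above can be illustrated with a small in-memory Ruby model. A Mutex stands in for the row lock that SELECT FOR UPDATE would take. The exact semantics are an assumption: here a given old_oid must equal the current value, a missing old_oid means "create only", and force skips both checks. RefStore and its methods are hypothetical names.

```ruby
# In-memory model of a ref table with compare-and-swap updates.
class RefStore
  class Conflict < StandardError; end

  def initialize
    @refs = {}
    @mutex = Mutex.new   # plays the role of the SELECT ... FOR UPDATE row lock
  end

  def lookup(name)
    @mutex.synchronize { @refs[name] }
  end

  # Set name to new_oid. Unless force is true:
  #   - with old_oid given, the current value must equal old_oid
  #   - with old_oid nil, the ref must not exist yet
  def update(name, new_oid, old_oid: nil, force: false)
    @mutex.synchronize do
      current = @refs[name]
      unless force
        if old_oid.nil?
          raise Conflict, "#{name} already exists" if current
        elsif current != old_oid
          raise Conflict, "#{name} is at #{current.inspect}, expected #{old_oid}"
        end
      end
      @refs[name] = new_oid
    end
  end
end
```

The check and the write happen under the same lock, which is the property the transaction with SELECT FOR UPDATE is meant to provide: two concurrent pushes that both read the old value cannot both succeed.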
Initial approach: implement a git-remote-gitgres helper. When you run git clone gitgres::postgres://localhost/mydb/myrepo, git invokes git-remote-gitgres with the remote name and the URL postgres://localhost/mydb/myrepo as its arguments. The helper:
libgit2 has git_transport_register for custom transports, or we can use the simpler git_remote_callbacks approach.
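One small piece of the helper is turning a URL like postgres://localhost/mydb/myrepo into a database connection string and a repo name. The plan does not say how the URL is split; the Ruby sketch below assumes the last path segment is the repo name and the segment before it is the database. split_gitgres_url is a hypothetical name.

```ruby
require "uri"

# Split a gitgres URL into a Postgres connection URI and a repo name.
# Assumes the path has exactly two segments: /<database>/<repo>.
def split_gitgres_url(url)
  uri = URI.parse(url)
  segments = uri.path.to_s.split("/").reject(&:empty?)
  unless segments.size == 2
    raise ArgumentError, "expected /<database>/<repo> in #{url}"
  end
  database, repo_name = segments
  conn = uri.dup
  conn.path = "/#{database}"   # drop the repo segment, keep host/port/user
  { conninfo: conn.to_s, repo_name: repo_name }
end
```

Under that assumption, the example URL from the plan splits into the connection string postgres://localhost/mydb and the repo name myrepo.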
ext/
The Postgres extension, separate from the libgit2 backend. Adds performance for SQL queries.
git_oid type: 20-byte fixed binary, hex I/O, btree + hash operator classes. Replaces bytea columns with a proper type so you can write WHERE oid = 'abc123...'.
git_object_hash_c(type, content) RETURNS git_oid: C SHA1 via OpenSSL, ~10x faster than pgcrypto
git_tree_entries_c(content) RETURNS SETOF (mode, name, oid): C pointer arithmetic instead of PL/pgSQL byte walking
Built with PGXS, links against OpenSSL.
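Both tree parsers (PL/pgSQL and C) walk the same binary layout: each entry is an ASCII octal mode, a space, the entry name, a NUL byte, then 20 raw OID bytes. A minimal Ruby sketch of that walk is below; it can double as a reference implementation when comparing parser output in tests. It returns the OID as hex for readability, whereas the SQL functions would presumably return raw bytes.

```ruby
# Parse the raw content of a git tree object into an array of entries.
# Layout per entry: "<mode> <name>\0" followed by a 20-byte binary OID.
def git_tree_entries(content)
  data = content.b
  entries = []
  pos = 0
  while pos < data.bytesize
    space = data.index(" ", pos) or raise ArgumentError, "missing space after mode"
    nul = data.index("\0", space + 1) or raise ArgumentError, "missing NUL after name"
    raise ArgumentError, "truncated OID" if nul + 1 + 20 > data.bytesize

    entries << {
      mode: data.byteslice(pos, space - pos),
      name: data.byteslice(space + 1, nul - space - 1),
      oid_hex: data.byteslice(nul + 1, 20).unpack1("H*")
    }
    pos = nul + 21   # skip the NUL and the 20 OID bytes
  end
  entries
end
```

The entry name has no length prefix, so the parser has to scan for the NUL before it knows where the OID starts; that scan is the byte walking the C version replaces with pointer arithmetic.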
import/gitgres-import.sh
Shell script that imports an existing git repo into the database using git plumbing commands and psql.
Usage: gitgres-import <repo-path> <conninfo> <repo-name>
git -C <repo> rev-list --objects --all to get all object OIDs
git -C <repo> cat-file --batch to get type + content
git -C <repo> show-ref to get all refs, INSERT into refs table
Schema + object hash + read/write functions -- the foundation. Verify that git_object_hash(3, 'hello') matches git hash-object --stdin.
Tree + commit parsing functions -- test against real objects from a git repo using git cat-file.
Import script -- import a small test repo (~50 commits). Verify object count matches git rev-list --objects --all | wc -l.
Materialized views -- REFRESH after import. Write example queries against commits and trees.
libgit2 ODB backend -- implement read/write/exists. Test by creating a libgit2 repo with the Postgres backend, writing objects through it, verifying they appear in the database.
libgit2 refdb backend -- implement CRUD + iterator. Test ref creation and lookup.
git-remote-gitgres helper -- wire up libgit2 transport so git push/clone work.
Roundtrip test -- push a repo, clone it elsewhere, diff the working trees.
C extension -- git_oid type, C hash, C tree parse. Benchmark against PL/pgSQL.
Scale test -- import homebrew-core.
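For the tree + commit parsing step in the build order above, a small reference parser can help when comparing the SQL output against real objects from git cat-file. The Ruby sketch below splits a raw commit into headers and message and extracts the fields the plan lists for git_commit_parse. Field names in the returned hash and the parse_ident helper are illustrative; OIDs are kept as hex strings, and headers other than tree, parent, author and committer (for example gpgsig) are ignored.

```ruby
# "Name <email> <unix-timestamp> <+/-hhmm>" as used in author/committer lines.
IDENT_RE = /\A(.*) <([^>]*)> (\d+) ([+-]\d{4})\z/

def parse_ident(value)
  m = IDENT_RE.match(value) or raise ArgumentError, "bad ident: #{value.inspect}"
  { name: m[1], email: m[2], timestamp: m[3].to_i, tz: m[4] }
end

# Parse raw commit content into tree, parents, author, committer, message.
def git_commit_parse(content)
  headers, message = content.b.split("\n\n", 2)   # first blank line ends headers
  result = { tree_oid: nil, parent_oids: [], author: nil, committer: nil,
             message: message.to_s }
  headers.to_s.each_line(chomp: true) do |line|
    next if line.start_with?(" ")   # continuation of a multi-line header
    key, value = line.split(" ", 2)
    case key
    when "tree" then result[:tree_oid] = value
    when "parent" then result[:parent_oids] << value
    when "author", "committer" then result[key.to_sym] = parse_ident(value)
    end
  end
  result
end
```

Root commits have no parent lines and merges have several, which is why the parents are collected into an array rather than a single field.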
Minitest. Each test connects to a gitgres_test database, runs schema setup in a BEGIN, rolls back in teardown. SQL functions tested by comparing output against git CLI on the same objects.
Roundtrip integration test: create temp repo with commits, push via the backend, clone into new directory, diff -r.
git_oid type.
If you need specific details from before exiting plan mode (like exact code snippets, error messages, or content you generated), read the full transcript at: /Users/andrew/.claude/projects/-Users-andrew-code-gitgres/0d86fef0-4a2d-473c-bb7e-29e502603055.jsonl
Let me start by checking the current state of the project and then work through the build order.
The server is PG 16 (Postgres.app) but the pg_config client is PG 14 (Homebrew). For the C extension we need matching headers, but for everything else it doesn't matter much. Let me check what's available and get started.
Good, PG 18 is available via Postgres.app and libgit2 is installed. Let me set up for PG 16 (the running server) and create the project structure.
Let me create the project structure and start with the SQL foundation.
Now let me create all the files. I'll work on multiple in parallel where possible.
Now let me create the test database and verify the SQL functions work.
Let me verify the hash function matches git's output.
Hashes match. Now let me load the views and verify the write/read cycle works.
Now let me clean up the test data and finish task 1, then start on the import script, tests, and libgit2 backend in parallel.
Now let me look at the libgit2 1.9 API to make sure I write the backend correctly.
Let me check a few more key headers and then start writing the C code and tests in parallel.
Let me check a few more headers I need for the writepack and reflog implementation.
Good, I have a clear picture of all the APIs now. Let me also check the git_odb_new non-experimental API and the repository wrapping API.
Now let me check a couple more things and then write all the C files.
Now I have everything I need. Let me write all the C backend files, the import script, and the tests. I'll use parallel agents for the larger files.
Good, the C backend files are written. Now let me write the import script, C extension, and tests in parallel.
Now let me write all the test files and the C extension in parallel.
Good, all the files are written. Now let me verify everything compiles and the tests pass. Let me first check the pg gem is available, then run the tests.
Let me first drop and recreate the test database cleanly, then run the tests.
All 30 tests pass. Now let me try building the C backend.
Backend compiles. Let me now try building the C extension.
Missing includes in sha1_hash.c. Let me fix it and also check tree_parse.c.
The sha1_hash.c is missing #include "utils/builtins.h" which provides VARSIZE_ANY_EXHDR, VARDATA_ANY, etc. Let me also use the EVP API since the legacy SHA1 functions are deprecated in OpenSSL 3.
Tons of OpenSSL warnings (from the -Wcast-function-type flag in PGXS) but I need to check if it actually linked. Let me look at the end more carefully.
The issue is that PG 16 moved VARDATA macros to varatt.h. Let me fix the includes.
Extension builds successfully (it's .dylib on macOS). Zero errors, just OpenSSL warnings from their headers. Let me now do a quick integration test - install the extension and verify it works.
Owned by andrew but "Operation not permitted" - likely SIP. Let me try with install.
Sandbox restriction on Postgres.app. The C extension builds but can't be installed without disabling SIP protections on the app bundle. That's fine - the extension is step 9 in the build order, an optimization. The core functionality (PL/pgSQL functions + libgit2 backend) all works without it.
Let me now do a proper end-to-end test with the backend binary. Let me first test the import script with a small repo.
Let me do the roundtrip test - create a repo, push it, clone it, diff the working trees.
Now clone it back and diff.
Full roundtrip works. Push to Postgres, clone back, working trees are identical, git history is preserved.
Let me update the task status and also verify the materialized views work.
Everything works. Let me now run the full test suite one more time and then mark tests complete.
Let me also initialize a git repo for the project, now that you mentioned wanting a Makefile (the top-level one already exists).