Explained Honcho data model and inference needs

Agent: Codex
Model: GPT-5.5
Shared: May 17, 8:10 PM

Repo: plastic-labs/honcho

Summary

Reviewed Honcho's ORM and agent pipeline.
Mapped database tables to inference invocation points.
Described core pairwise memory system and key models.

Explain Honcho's data model from the ground up. I notice that it makes inference requests for a number of different facets. What does it need inference for?

I’ll trace this from the actual ORM and agent pipeline rather than only the overview text, then map the tables to where inference is invoked.

The schema is centered on peer-to-peer observations, not just users and chats. I’m narrowing in on the persisted models and the code paths that turn raw messages into conclusions, cards, summaries, and dialectic responses.

Honcho’s core model is a pairwise memory system layered on top of ordinary chat data.

At the bottom:

Workspace: tenant/container for everything.
Peer: any participant, human or agent. Peers are unique by (workspace_name, name).
Session: a conversation context inside a workspace.
session_peers: many-to-many membership table, with per-peer session config like observe_me / observe_others.
Message: raw utterance from a peer in a session, ordered by seq_in_session.
MessageEmbedding: vector-search rows for raw messages, used when agents need to search original conversation text.

See src/models.py for the definitions of all six.
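
As a rough illustration of how these base entities relate, here is a minimal sketch using plain dataclasses. It is not the code in src/models.py, which defines ORM models. The `content` field, the `add_message` helper, and the choice to start `seq_in_session` at 1 are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Peer:
    workspace_name: str
    name: str  # unique together with workspace_name


@dataclass
class Message:
    peer_name: str
    content: str  # hypothetical field name for the raw utterance
    seq_in_session: int  # ordering within the session


@dataclass
class Session:
    workspace_name: str
    name: str
    # peer name -> per-peer config; stands in for the session_peers table
    peers: dict[str, dict[str, bool]] = field(default_factory=dict)
    messages: list[Message] = field(default_factory=list)

    def add_message(self, peer_name: str, content: str) -> Message:
        """Hypothetical helper: append a message with the next sequence number."""
        if peer_name not in self.peers:
            raise ValueError(f"{peer_name} is not a member of this session")
        msg = Message(peer_name, content, seq_in_session=len(self.messages) + 1)
        self.messages.append(msg)
        return msg


session = Session("demo", "chat-1")
session.peers["alice"] = {"observe_me": True, "observe_others": False}
first = session.add_message("alice", "I just moved to Lisbon.")
second = session.add_message("alice", "I work as a nurse.")
```

The point of the sketch is only the shape: peers are identified by a workspace-scoped name, session membership carries its own configuration, and messages are ordered within a session.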

The memory layer is where Honcho becomes unusual:

Collection: one memory collection per (workspace, observer, observed).
Document: one persisted observation/conclusion inside that pairwise collection.
Document.level: explicit, deductive, inductive, or contradiction.
Document.source_ids: links higher-level observations back to source observations, giving Honcho a traversable reasoning graph.
Document.embedding: vector index for semantic memory search.

That means Alice’s model of Bob is separate from Bob’s model of Bob, and separate from Carol’s model of Bob. The schema enforces that with the observer, observed, and workspace_name keys on collections and documents; see src/models.py.
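
To make the pairwise keying and the source_ids graph concrete, here is a small in-memory sketch. The `store` dictionary, the `save` and `trace_sources` helpers, and the `id` and `content` fields are hypothetical stand-ins for the database tables, not Honcho's API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    id: str
    content: str
    level: str  # "explicit", "deductive", "inductive", or "contradiction"
    source_ids: list[str] = field(default_factory=list)


# (workspace_name, observer, observed) -> documents in that collection
store: dict[tuple[str, str, str], dict[str, Document]] = {}


def save(workspace: str, observer: str, observed: str, doc: Document) -> None:
    store.setdefault((workspace, observer, observed), {})[doc.id] = doc


def trace_sources(workspace: str, observer: str, observed: str, doc_id: str) -> list[str]:
    """Walk source_ids back to the explicit observations a conclusion rests on."""
    docs = store[(workspace, observer, observed)]
    seen: set[str] = set()
    stack = [doc_id]
    explicit: list[str] = []
    while stack:
        current = docs[stack.pop()]
        if current.id in seen:
            continue
        seen.add(current.id)
        if current.level == "explicit":
            explicit.append(current.id)
        stack.extend(current.source_ids)
    return sorted(explicit)


# Alice's model of Bob
save("demo", "alice", "bob", Document("d1", "Bob said he lives in Lisbon", "explicit"))
save("demo", "alice", "bob", Document("d2", "Bob said he commutes by tram", "explicit"))
save("demo", "alice", "bob",
     Document("d3", "Bob likely uses Lisbon public transit", "deductive", ["d1", "d2"]))
# Bob's model of Bob lives in a different collection, even with the same document id
save("demo", "bob", "bob", Document("d1", "I live in Lisbon", "explicit"))

sources = trace_sources("demo", "alice", "bob", "d3")
```

Because the collection key includes both observer and observed, nothing Alice concludes about Bob can leak into Bob's self-model or into a collection Carol has not yet formed.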

A Representation is the in-memory view over those documents. It is not a table. It groups observations into four levels:

explicit: facts directly stated.
deductive: logical implications.
inductive: patterns/generalizations.
contradiction: conflicting claims.

That shape is defined in src/utils/representation.py.
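
A minimal sketch of that grouping follows, assuming documents arrive as simple (level, content) pairs. The class here is illustrative and the `from_documents` constructor is a hypothetical name; the real shape is whatever src/utils/representation.py defines.

```python
from dataclasses import dataclass, field

LEVELS = ("explicit", "deductive", "inductive", "contradiction")


@dataclass
class Representation:
    """In-memory view over persisted documents; not itself a table."""
    explicit: list[str] = field(default_factory=list)
    deductive: list[str] = field(default_factory=list)
    inductive: list[str] = field(default_factory=list)
    contradiction: list[str] = field(default_factory=list)

    @classmethod
    def from_documents(cls, documents: list[tuple[str, str]]) -> "Representation":
        rep = cls()
        for level, content in documents:
            if level not in LEVELS:
                raise ValueError(f"unknown level: {level}")
            getattr(rep, level).append(content)
        return rep


rep = Representation.from_documents([
    ("explicit", "Bob said he lives in Lisbon"),
    ("deductive", "Bob lives in Portugal"),
    ("explicit", "Bob said he commutes by tram"),
    ("inductive", "Bob tends to prefer public transit"),
])
```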

Message Flow

When a message is created, Honcho stores the raw message, optionally embeds it, then enqueues background work. MessageEmbedding rows are created at this point if EMBED_MESSAGES is enabled; see src/crud/message.py.

Then enqueueing decides what inference tasks are needed:

create summaries on configured message intervals,
create representation tasks if reasoning is enabled,
decide who observes whom based on observe_me and observe_others.

That logic is in src/deriver/enqueue.py. A message from a peer, here called observed, may produce a self-observation collection (observed -> observed) and also peer-perspective collections such as (assistant -> observed) if the session config says those peers observe others.
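
The observer selection could be sketched roughly as below. This is an interpretation, not the code in src/deriver/enqueue.py: the function name `observer_pairs` is hypothetical, and both the default flag values and the assumption that the sender's observe_me gates every collection are guesses made for the example.

```python
def observer_pairs(sender: str, session_peers: dict[str, dict[str, bool]]) -> list[tuple[str, str]]:
    """Return (observer, observed) pairs that should receive observations
    derived from a message sent by `sender`."""
    pairs: list[tuple[str, str]] = []
    # Assumption: a sender who opts out of observation produces no collections at all.
    if not session_peers[sender].get("observe_me", True):
        return pairs
    pairs.append((sender, sender))  # self-observation collection
    for name, config in session_peers.items():
        # Assumption: other peers only model the sender if they observe others.
        if name != sender and config.get("observe_others", False):
            pairs.append((name, sender))
    return pairs


peers = {
    "observed": {"observe_me": True, "observe_others": False},
    "assistant": {"observe_me": False, "observe_others": True},
    "bystander": {"observe_me": True, "observe_others": False},
}
pairs_for_observed = observer_pairs("observed", peers)
pairs_for_assistant = observer_pairs("assistant", peers)
```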

What Inference Is For

Honcho uses inference for several distinct jobs because each job produces a different kind of artifact.

Explicit observation extraction

Raw messages are unstructured. The deriver asks an LLM to turn recent messages into structured explicit facts. It uses a structured PromptRepresentation whose only field is explicit, then saves those observations into every relevant observer/observed collection. See src/deriver/deriver.py.

This is inference, but constrained: “facts literally stated,” not broader reasoning.
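
A sketch of what "constrained" can mean in practice: the model's output is parsed into a schema with a single explicit field, and anything else is rejected. The dataclass below is a stand-in for the real PromptRepresentation, the JSON string stands in for an LLM response, and `parse_llm_output` and `fan_out` are hypothetical helper names.

```python
import json
from dataclasses import dataclass


@dataclass
class PromptRepresentation:
    explicit: list[str]  # the only field the extraction step may fill


def parse_llm_output(raw: str) -> PromptRepresentation:
    """Parse a JSON response and refuse any field other than `explicit`."""
    data = json.loads(raw)
    unexpected = set(data) - {"explicit"}
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    facts = [str(item).strip() for item in data.get("explicit", [])]
    return PromptRepresentation(explicit=[fact for fact in facts if fact])


def fan_out(rep: PromptRepresentation,
            pairs: list[tuple[str, str]]) -> dict[tuple[str, str], list[str]]:
    """Copy the extracted facts into every (observer, observed) collection."""
    return {pair: list(rep.explicit) for pair in pairs}


# Stand-in for a model response; no LLM is called here.
raw_response = '{"explicit": ["Bob lives in Lisbon", "  ", "Bob works as a nurse"]}'
parsed = parse_llm_output(raw_response)
saved = fan_out(parsed, [("bob", "bob"), ("assistant", "bob")])
```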

Embeddings for retrieval

Honcho embeds messages and observations so agents can do semantic search over raw conversation and memory. This is not “reasoning” in the same sense as the LLM deriver, but it is still model inference. Observation embeddings are created when representations are saved; see src/crud/representation.py.
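
For intuition, semantic search over embedded rows reduces to ranking by vector similarity. The sketch below uses hand-written three-dimensional vectors and cosine similarity in pure Python; the vectors, the `search` helper, and the choice of cosine similarity are assumptions for illustration, since a real deployment would use an embedding model and a vector index.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float],
           rows: list[tuple[str, list[float]]],
           top_k: int = 2) -> list[str]:
    """Return the text of the top_k rows most similar to the query vector."""
    ranked = sorted(rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Toy "embeddings": made-up vectors, not output from a real model.
rows = [
    ("likes hiking", [1.0, 0.0, 0.0]),
    ("works night shifts", [0.0, 1.0, 0.0]),
    ("enjoys trail running", [0.9, 0.1, 0.0]),
]
hits = search([1.0, 0.0, 0.0], rows)
```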

Session summaries

Summaries are stored in Session.internal_metadata, not as first-class rows. Honcho creates short and long rolling summaries to keep session context bounded while preserving key facts. The prompts explicitly ask for factual summaries, preferences, context, and, for long summaries, emotional state/personality themes. See src/utils/summarizer.py.
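
The interval-triggered, metadata-backed pattern might look like the sketch below. The interval values, the "summaries" key, and the stored field names are all hypothetical; only the idea of short and long summaries kept in session metadata comes from the description above.

```python
SHORT_EVERY = 20  # hypothetical interval, in messages
LONG_EVERY = 60   # hypothetical interval, in messages


def summaries_due(seq_in_session: int) -> list[str]:
    """Decide which summary kinds to refresh after this message."""
    due = []
    if seq_in_session % SHORT_EVERY == 0:
        due.append("short")
    if seq_in_session % LONG_EVERY == 0:
        due.append("long")
    return due


def store_summary(internal_metadata: dict, kind: str, text: str, seq: int) -> None:
    """Overwrite the rolling summary of the given kind inside session metadata."""
    summaries = internal_metadata.setdefault("summaries", {})
    summaries[kind] = {"content": text, "covers_up_to_seq": seq}


session_metadata: dict = {}
for kind in summaries_due(60):
    # The summary text would come from an LLM call; a placeholder is used here.
    store_summary(session_metadata, kind, f"{kind} summary placeholder", 60)
```

Storing only the latest summary of each kind is what keeps the session context bounded: older summaries are replaced rather than accumulated.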

Dreaming / consolidation

The Dreamer runs later over accumulated observations. It has two specialist agents:

deduction: logical implications, updates, contradictions, outdated-observation deletion, peer-card updates;
induction: recurring patterns, preferences, traits, behavioral tendencies.

The orchestrator runs deduction then induction; see src/dreamer/orchestrator.py. The specialist prompts make the facets explicit in src/dreamer/specialists.py.
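
The ordering matters because induction can build on what deduction produced. Below is a minimal sketch of that sequencing with stub specialists in place of LLM agents. The function `run_dream` and both stubs are hypothetical, and the sketch only appends observations, whereas the real deduction specialist is described as also deleting outdated ones and updating peer cards.

```python
from typing import Callable

Observation = dict[str, str]  # {"level": ..., "content": ...}
Specialist = Callable[[list[Observation]], list[Observation]]

calls: list[str] = []  # records the order in which specialists ran


def stub_deduction(observations: list[Observation]) -> list[Observation]:
    calls.append("deduction")
    contents = {o["content"] for o in observations if o["level"] == "explicit"}
    if "Bob lives in Lisbon" in contents:
        return [{"level": "deductive", "content": "Bob lives in Portugal"}]
    return []


def stub_induction(observations: list[Observation]) -> list[Observation]:
    calls.append("induction")
    hiking = [o for o in observations if "hik" in o["content"].lower()]
    if len(hiking) >= 2:
        return [{"level": "inductive", "content": "Bob regularly goes hiking"}]
    return []


def run_dream(observations: list[Observation],
              deduction: Specialist,
              induction: Specialist) -> list[Observation]:
    """Run deduction first, then induction over the enlarged observation set."""
    working = list(observations)
    working += deduction(working)
    working += induction(working)  # induction sees the new deductions too
    return working


result = run_dream(
    [
        {"level": "explicit", "content": "Bob lives in Lisbon"},
        {"level": "explicit", "content": "Bob went hiking on Saturday"},
        {"level": "explicit", "content": "Bob bought new hiking boots"},
    ],
    stub_deduction,
    stub_induction,
)
```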

Peer cards

A peer card is a compact durable profile stored on the observing peer’s internal_metadata, keyed by the observed peer. For self-observation it uses peer_card; for another peer it uses {observed}_peer_card. See src/crud/peer_card.py.

This is another “facet” because it is not the full memory graph. It is a small, durable profile cache: name, preferences, stable traits, standing instructions.
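
The keying rule for peer cards is simple enough to sketch directly. The helper names and the representation of a card as a list of short strings are assumptions; only the two key formats come from the description above.

```python
def peer_card_key(observer: str, observed: str) -> str:
    """Metadata key under which the observer stores its card for `observed`."""
    return "peer_card" if observer == observed else f"{observed}_peer_card"


def set_peer_card(internal_metadata: dict, observer: str, observed: str,
                  card: list[str]) -> None:
    # The card lives on the observing peer's metadata, not the observed peer's.
    internal_metadata[peer_card_key(observer, observed)] = list(card)


alice_metadata: dict = {}
set_peer_card(alice_metadata, "alice", "alice", ["Name: Alice", "Prefers short replies"])
set_peer_card(alice_metadata, "alice", "bob", ["Name: Bob", "Lives in Lisbon"])
```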

Dialectic answers

The Dialectic API is inference at query time. It prefetches semantically relevant explicit and derived observations, injects recent session history, then lets an LLM use tools to search memory/messages and synthesize an answer. See src/dialectic/core.py.
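
The prefetch-then-tool-loop shape can be sketched as follows. Everything here is a stand-in: `answer_query`, the action dictionary format, the step budget, and the scripted `fake_llm` are hypothetical and exist only to show the control flow, not how src/dialectic/core.py is written.

```python
from typing import Callable


def answer_query(query: str,
                 prefetch: Callable[[str], list[str]],
                 history: list[str],
                 llm: Callable[[dict], dict],
                 tools: dict[str, Callable[[str], list[str]]],
                 max_steps: int = 3) -> str:
    """Prefetch observations, then let the model call tools until it answers."""
    context = {
        "query": query,
        "observations": prefetch(query),
        "history": history,
        "tool_results": [],
    }
    for _ in range(max_steps):
        action = llm(context)
        if action["type"] == "answer":
            return action["text"]
        # Otherwise the model asked for a tool; run it and feed the result back.
        context["tool_results"].append(tools[action["tool"]](action["input"]))
    return "No answer within the step budget."


def fake_llm(context: dict) -> dict:
    """Scripted stand-in for a model: search once, then answer."""
    if not context["tool_results"]:
        return {"type": "tool", "tool": "search_memory", "input": context["query"]}
    facts = context["observations"] + context["tool_results"][0]
    return {"type": "answer", "text": "; ".join(facts)}


answer = answer_query(
    "Where does Bob live?",
    prefetch=lambda q: ["Bob lives in Lisbon"],
    history=["bob: I just got back from work."],
    llm=fake_llm,
    tools={"search_memory": lambda q: ["Bob commutes by tram"]},
)
```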

So the short answer is: Honcho needs inference because its product is not “chat storage.” It is trying to maintain a layered, perspective-specific model of people. The different inference requests correspond to different compression/abstraction layers: raw messages -> explicit observations -> deductions/contradictions -> inductive patterns -> peer cards -> query-time dialectic answers.

I would like to visualize this

Here are two useful views.

Data Model

Inference Flow

Mental Model

The key visual idea: Message is the event log, Document is the memory graph, and (observer, observed) is the lens that keeps one peer’s model of another peer separate from everyone else’s.