Understand architecture and environment.
Plan: Write script to trigger compaction with large context.
Build: Run script and adjust compaction thresholds.
Verify: Analyze compaction report and validate behavior.
Verify: Explain user access to uncompacted context data.
can you generate a script to call ask with a big context to trigger compaction? I want to inspect what is sent to the LLM and validate if the uncompacted session data is still available to a user. run this script and analyse to verify
Now I understand the architecture. Let me check package.json and the env:
Now I have everything I need. Let me write the script:
Now let's run it:
Compaction didn't fire — the repo/responses were too small. Let me tighten the threshold further:
Compaction fired. Let me analyze the report in more detail to verify the two key claims:
The run validates the behavior. Here's the write-up:
Script: scripts/inspect-compaction.ts — constructs a Session directly (the public Client.connect doesn't expose stream injection), wraps pi-ai's stream / streamSimple so every Context.messages snapshot sent to the LLM is captured in memory, and uses aggressive compaction settings (contextWindow: 2500, reserveTokens: 500, keepRecentTokens: 600) against jonschlinkert/is-odd with Claude Haiku 4.5.
Full dump: tmp/compaction-report.json. Log: tmp/inspect-compaction.log.
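The capture technique the script relies on can be sketched as a thin wrapper around a stream function. This is only an illustration under assumptions: `Message`, `Context`, `StreamFn`, and `withCapture` are hypothetical, simplified stand-ins, not the actual pi-ai types or the code in scripts/inspect-compaction.ts.

```typescript
// Hypothetical minimal shapes; the real pi-ai types are richer.
interface Message { role: string; content: string }
interface Context { messages: Message[] }
type StreamFn<R> = (context: Context) => R;

interface CapturedCall { index: number; messages: Message[] }

// Wrap a stream function so every messages array passed to it is
// snapshotted before the call is forwarded unchanged.
function withCapture<R>(inner: StreamFn<R>, log: CapturedCall[]): StreamFn<R> {
  return (context) => {
    // Copy each message so later mutation (e.g. compaction) cannot
    // alter what was recorded for this call.
    const snapshot = context.messages.map((m) => ({ ...m }));
    log.push({ index: log.length, messages: snapshot });
    return inner(context);
  };
}

// Usage with a fake stream function standing in for the LLM call.
const calls: CapturedCall[] = [];
const fakeStream: StreamFn<string> = (ctx) => `saw ${ctx.messages.length} messages`;
const wrapped = withCapture(fakeStream, calls);

const ctx: Context = { messages: [{ role: "user", content: "hi" }] };
wrapped(ctx);
ctx.messages.push({ role: "assistant", content: "hello" });
wrapped(ctx);

console.log(calls.map((c) => c.messages.length)); // [ 1, 2 ]
```

Snapshotting by copy matters here because the session later replaces or mutates its message list; a stored reference would show the post-compaction state for every call.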
Four asks ran; compaction fired at the start of turn 3 (before the first inference of that turn), triggered by maybeCompact seeing 2367 tokens > 2000 (=contextWindow − reserve).
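The trigger condition reported here amounts to a threshold comparison. A minimal sketch, assuming the rule is "estimated tokens > contextWindow − reserveTokens"; `shouldCompact` and `CompactionSettings` are hypothetical names that mirror the settings used in this run, not the actual maybeCompact implementation.

```typescript
// Hypothetical settings shape, mirroring the values used in this run.
interface CompactionSettings {
  contextWindow: number;
  reserveTokens: number;
  keepRecentTokens: number;
}

// Assumed trigger rule: compact once the estimated context size exceeds
// the window minus the reserved headroom.
function shouldCompact(estimatedTokens: number, s: CompactionSettings): boolean {
  return estimatedTokens > s.contextWindow - s.reserveTokens;
}

const settings: CompactionSettings = {
  contextWindow: 2500,
  reserveTokens: 500,
  keepRecentTokens: 600,
};

console.log(shouldCompact(2367, settings)); // true: 2367 > 2000
console.log(shouldCompact(1900, settings)); // false: 1900 <= 2000
```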
LLM-facing context shrank abruptly across the compaction boundary:
| call | msgs sent to LLM | first message |
|---|---|---|
| #0 | 1 | user prompt of turn 1 |
| #1 | 4 | user prompt of turn 1 |
| #2 | 6 | user prompt of turn 1 |
| #3 | 8 | user prompt of turn 1 |
| #4 | 11 | user prompt of turn 1 (peak pre-compaction) |
| #5 | 3 | [CONTEXT SUMMARY - Previous conversation was compacted]… |
| #6 | 6 | summary wrapper |
| #7 | 8 | summary wrapper |
Post-compaction call #5 contains exactly:
[0] user: synthetic summary-wrapper message
[1] assistant: the last assistant text from turn 2 (kept because of keepRecentTokens)
[2] user: the new prompt that started turn 3
The raw contents of README.md, package.json, fd results, index.js, and test.js are gone from the LLM's view. Only the LLM-generated summary survives.
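The shape of that post-compaction array can be illustrated with a small sketch. Everything here is an assumption for illustration: the 4-characters-per-token estimate, the `compact` function, and the simplified `Message` type are hypothetical and are not taken from src/session.ts.

```typescript
interface Message { role: string; content: string }

// Crude token estimate (assumption: roughly 4 characters per token).
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4);

// Replace the history with a summary wrapper plus the most recent
// messages that fit within keepRecentTokens.
function compact(messages: Message[], summary: string, keepRecentTokens: number): Message[] {
  const kept: Message[] = [];
  let budget = keepRecentTokens;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const m of messages.slice().reverse()) {
    const cost = estimateTokens(m);
    if (cost > budget) break;
    kept.unshift(m);
    budget -= cost;
  }
  const summaryWrapper: Message = {
    role: "user",
    content: "[CONTEXT SUMMARY - Previous conversation was compacted]\n" + summary,
  };
  return [summaryWrapper, ...kept];
}

const history: Message[] = [
  { role: "user", content: "What does this repo do?" },
  // 4000 characters standing in for a large tool result.
  { role: "assistant", content: new Array(4001).join("x") },
  { role: "assistant", content: "It checks whether a number is odd." },
];

const compacted = compact(history, "User asked about the repo.", 600);
console.log(compacted.length); // 2: summary wrapper + last assistant text
```

Appending the next user prompt to this result gives the three-message payload seen in call #5.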
Is the uncompacted session data still available to a user? Yes, fully. session.getTurns() continues to expose every compacted turn with complete step fidelity:
turn 0 (compacted): tool_call read(package.json) -> 4171 bytes output ✔
                    tool_call read(README.md) -> 1871 bytes output ✔
turn 1 (compacted): tool_call fd(...) -> 17 bytes output ✔
                    tool_call read(index.js) -> 1319 bytes output ✔
                    tool_call read(test.js) -> 736 bytes output ✔
Invariant check at the end of the script confirmed: every tool_call step in every compacted turn still carries its full output string.
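An invariant check of this kind could look roughly like the following. The `steps`, `tool_call.output`, and `text.text` fields follow the names used in this write-up; the `type` discriminator, the `name` field, and `findMissingOutputs` are hypothetical and may not match the real TurnResult shape.

```typescript
// Assumed step shapes; `type` and `name` are illustrative guesses.
type Step =
  | { type: "text"; text: string }
  | { type: "tool_call"; name: string; output: string };

interface TurnResult { steps: Step[] }

// Return a description of every tool_call step whose output is empty.
function findMissingOutputs(turns: TurnResult[]): string[] {
  const problems: string[] = [];
  turns.forEach((turn, t) => {
    turn.steps.forEach((step, s) => {
      if (step.type === "tool_call" && step.output.length === 0) {
        problems.push(`turn ${t}, step ${s}: ${step.name} has no output`);
      }
    });
  });
  return problems;
}

const intactTurns: TurnResult[] = [
  { steps: [{ type: "tool_call", name: "read", output: "{ \"name\": \"is-odd\" }" }] },
  { steps: [{ type: "text", text: "Done." }] },
];
const brokenTurns: TurnResult[] = [
  { steps: [{ type: "tool_call", name: "read", output: "" }] },
];

console.log(findMissingOutputs(intactTurns).length); // 0
console.log(findMissingOutputs(brokenTurns)); // [ 'turn 0, step 0: read has no output' ]
```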
The session keeps two separate stores (src/session.ts):
- #context.messages: the Context passed to pi-ai. maybeCompact replaces this with [summaryWrapper, ...keptMessages].
- #turns (TurnResult[]): built by AskStreamImpl at turn_end and pushed in the ask() callback. Compaction never touches it.
Compaction therefore only edits the LLM-facing store. The user-facing audit log (getTurns()) plus the summary (getCompactionSummary()) together give callers both the compact representation used by the model and the full pre-compaction history, and the two are guaranteed independent by construction.
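The two-store design can be pictured with a toy class. This is a sketch of the idea only: `TwoStoreSession`, `TurnRecord`, and `llmMessageCount` are hypothetical, the real Session uses `#`-private fields and richer types, and `private` is used here purely to keep the example portable.

```typescript
interface Message { role: string; content: string }
interface TurnRecord { question: string; answer: string }

class TwoStoreSession {
  // LLM-facing store: rewritten by compaction.
  private messages: Message[] = [];
  // User-facing audit log: append-only, never touched by compaction.
  private turns: TurnRecord[] = [];

  recordTurn(question: string, answer: string): void {
    this.messages.push(
      { role: "user", content: question },
      { role: "assistant", content: answer },
    );
    this.turns.push({ question, answer });
  }

  compact(summary: string): void {
    // Only the LLM-facing store is replaced.
    this.messages = [{ role: "user", content: summary }];
  }

  getTurns(): TurnRecord[] {
    // Return copies so callers cannot mutate the log.
    return this.turns.map((t) => ({ ...t }));
  }

  llmMessageCount(): number {
    return this.messages.length;
  }
}

const session = new TwoStoreSession();
session.recordTurn("q1", "a1");
session.recordTurn("q2", "a2");
session.compact("summary of q1 and q2");

console.log(session.llmMessageCount()); // 1
console.log(session.getTurns().length); // 2
```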
- scripts/inspect-compaction.ts: new reproducible harness (you can rerun any time with a different repo as first arg)
- tmp/compaction-report.json: full captured llmCalls[] + turns[] + summary
- tmp/inspect-compaction.log: narrated run output
can the user see the uncompacted context at any point
Short answer: yes — the uncompacted conversation is always retrievable by the user, just not as the exact #context.messages array that pi-ai sees.
Here's the precise picture, grounded in the code:
| Store | What it is | Mutated by compaction? | Public access? |
|---|---|---|---|
| Session.#context.messages | The live pi-ai Context.messages, the literal array sent to the LLM | Yes: replaced with [summaryWrapper, ...kept] by maybeCompact | ❌ private, no getter |
| Session.#turns: TurnResult[] | Append-only audit log, one entry per completed turn | No: never touched by compaction | ✅ session.getTurns() |
| Session.#compactionSummary | The latest LLM-generated summary text | Set by compaction | ✅ session.getCompactionSummary() |
1. During the turn that produced it — always, via stream events.
Compaction only runs in #runCompaction, which is called at the start of each new turn, on [...this.#context.messages, newQuestionMessage] (src/session.ts #doAsk). A turn's own content is therefore never compacted before its turn_end fires — every text, tool_use_end, tool_result event reaches the caller raw.
2. After the turn ends — always, via session.getTurns().
AskStreamImpl's onTurn callback pushes the completed TurnResult onto #turns before any future compaction can run. TurnResult.steps preserves full fidelity: tool_call.output is the entire tool result string, text.text is the full assistant message. The previous run's report confirmed this — compacted turn 0 still carried the full 4171-byte package.json read output and the 1871-byte README.md read output inside its steps.
3. After compaction — still yes, via the same getTurns().
Compaction is a view-only optimization for what goes to the LLM. It never deletes history from the turn log.
What the user cannot get directly is the exact pi-ai Message[] array that was previously going to the LLM. #context.messages is private and there's no getter. The closest thing is src/turns-to-messages.ts::reconstructContext(turns), which rebuilds an equivalent Message[] from turns, but it's not re-exported from src/index.ts / the JSR . export, so it isn't part of the public surface. Users wanting the literal message array either:
- reimplement reconstructContext themselves from getTurns() output (all the information is there), or
- wrap stream / streamSimple like our script does (requires bypassing Client.connect and constructing Session directly, since Client.connect doesn't expose stream injection).
If exposing uncompacted-context-as-messages is an intended use case (e.g. for "fork this session into another tool" or "inspect what the LLM would see without compaction"), consider one of:
1. exporting reconstructContext from src/index.ts, or
2. adding a session.getFullMessages() method that returns reconstructContext(this.#turns).messages, or
3. adding a session.getLLMContextMessages() getter that returns a defensive copy of #context.messages (the compacted view, useful for debugging what the next call will actually send).
Right now callers get (1) and (2) by proxy through getTurns() + a bit of client-side glue, but (3), the actual post-compaction payload, is only observable by intercepting stream, which the public Client API doesn't support.
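The "client-side glue" mentioned above might look something like this. It is a rough sketch, not the real reconstructContext: the `question` field, the `type` discriminator, the `name` field, the role strings, and `turnsToMessages` are all assumptions, and the actual TurnResult and Message shapes may differ.

```typescript
interface Message { role: string; content: string }

// Assumed step shapes; only `steps`, `output`, and `text` come from the write-up.
type Step =
  | { type: "text"; text: string }
  | { type: "tool_call"; name: string; output: string };

// `question` is a hypothetical field holding the prompt that started the turn.
interface TurnResult { question: string; steps: Step[] }

// Rebuild a flat, uncompacted message list from the turn log.
function turnsToMessages(turns: TurnResult[]): Message[] {
  const messages: Message[] = [];
  for (const turn of turns) {
    messages.push({ role: "user", content: turn.question });
    for (const step of turn.steps) {
      if (step.type === "text") {
        messages.push({ role: "assistant", content: step.text });
      } else {
        // Role name is illustrative; the real tool-result role may differ.
        messages.push({ role: "tool", content: `${step.name}: ${step.output}` });
      }
    }
  }
  return messages;
}

const turns: TurnResult[] = [
  {
    question: "What does index.js export?",
    steps: [
      { type: "tool_call", name: "read", output: "module.exports = function isOdd(value) {}" },
      { type: "text", text: "It exports a single isOdd function." },
    ],
  },
];

const rebuilt = turnsToMessages(turns);
console.log(rebuilt.map((m) => m.role)); // [ 'user', 'tool', 'assistant' ]
```

Because this operates only on getTurns() output, it yields the full pre-compaction history regardless of how many times the LLM-facing context has been compacted.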