Awesome — now test it in two layers:
- command execution inside agent session
- natural-language trigger via skill wrapper
Here’s the quickest path.
1) Command-level test inside a real agent session
- Open a real agent (Codex/Claude/Cursor/etc.) in a project with active trace files.
- Ask the agent to run:
```bash
bun run share -- --cwd "$PWD" --agent auto --json
```
- Validate (a scripted version of these checks follows the list):
  - exit code `0`
  - JSON has `ok: true` and `data.sharedUrl` exists
  - opening the URL shows the trace you expect
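If you want those checks scripted, here's a minimal sketch. It assumes `jq` is installed and that the success JSON is shaped `{ ok, data: { sharedUrl } }` exactly as described above:

```bash
set -euo pipefail

# Run the share command; with set -e, a non-zero exit code aborts right here.
out="$(bun run share -- --cwd "$PWD" --agent auto --json)"

# Assert ok: true (jq -e derives its exit code from the result's truthiness).
echo "$out" | jq -e '.ok == true' > /dev/null

# Assert data.sharedUrl exists and is non-null, then print it.
url="$(echo "$out" | jq -er '.data.sharedUrl')"
echo "shared: $url"
```

The last check stays manual: open the printed URL and confirm it shows the trace you expect.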
2) Selector determinism checks
- In that same agent, run:
```bash
bun run share -- --trace-id "<known-id>" --json
bun run share -- --source-path "<known-source-path>" --json
bun run share -- --cwd "$PWD" --agent codex --json
```
- Confirm each selector picks the expected trace (a comparison sketch follows below).
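To make "picks the expected trace" checkable, a sketch like this compares what each selector resolves to. It assumes re-sharing the same trace returns a stable `data.sharedUrl` (an assumption, not something stated above), and the `<known-...>` placeholders are yours to fill in:

```bash
# Resolve the same trace via all three selectors and compare the results.
a="$(bun run share -- --trace-id "<known-id>" --json | jq -r '.data.sharedUrl')"
b="$(bun run share -- --source-path "<known-source-path>" --json | jq -r '.data.sharedUrl')"
c="$(bun run share -- --cwd "$PWD" --agent codex --json | jq -r '.data.sharedUrl')"

if [ "$a" = "$b" ] && [ "$b" = "$c" ]; then
  echo "selectors agree: $a"
else
  echo "selector mismatch: $a / $b / $c"
fi
```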
3) Failure-path checks (important)
- Not logged in: clear creds or use a fresh config dir, then run share; expect `AUTH_REQUIRED`
- Bad arguments: pass multiple selectors at once; expect `INVALID_ARGUMENTS`
- Missing trace: pass a fake `--trace-id`; expect `TRACE_NOT_FOUND` (an assertion helper for these is sketched below)
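Here's a minimal helper for asserting the last two codes from a shell. Two loud assumptions, neither confirmed above: failures exit non-zero, and the code lands at `.error.code` in the JSON output; adjust to the real shape if it differs. (The logged-out case needs the fresh-config-dir setup done first.)

```bash
# expect_error WANTED_CODE <command...>
# Passes only when the command fails AND reports the wanted error code.
expect_error() {
  local want="$1"; shift
  local out code
  if out="$("$@" 2>/dev/null)"; then
    echo "FAIL: expected $want, but the command succeeded"; return 1
  fi
  code="$(echo "$out" | jq -r '.error.code // empty')"  # ASSUMED field path
  if [ "$code" = "$want" ]; then
    echo "ok: $want"
  else
    echo "FAIL: wanted $want, got ${code:-<none>}"; return 1
  fi
}

# Multiple selectors at once, then a fake trace id.
expect_error INVALID_ARGUMENTS bun run share -- --trace-id "<known-id>" --source-path "<known-source-path>" --json
expect_error TRACE_NOT_FOUND   bun run share -- --trace-id "not-a-real-id" --json
```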
4) Skill-trigger test (what you really want)
Right now this only works if the agent has the share skill wrapper installed.
- In the agent chat, type: `share to traces`
- Expected wrapper behavior:
  - runs the canonical command internally
  - replies with the URL (or a mapped error plus remediation)
If it does nothing, the wrapper isn’t installed yet for that agent runtime.
5) Per-agent acceptance
For each adapter (codex, opencode, claude-code, cursor, gemini-cli):
- create/update a trace
- run the share command from the agent context (a loop sketch follows this list)
- run the skill phrase
- verify the returned URL loads and corresponds to that session
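For the command half of that matrix, one loop covers all five adapters (assuming each adapter has a fresh trace in the current project, and `jq` as before); the skill-phrase half still has to be typed into each agent's chat by hand:

```bash
for agent in codex opencode claude-code cursor gemini-cli; do
  echo "== $agent =="
  # Print only the fields the checklist cares about.
  bun run share -- --cwd "$PWD" --agent "$agent" --json | jq '{ok, sharedUrl: .data.sharedUrl}'
done
```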
If you want, I can now draft a copy-paste QA checklist markdown file under docs/projects/ for agent-by-agent validation.