Excellent, comprehensive feedback. Here's my analysis with specific code locations and actionable fixes:
Critical Issues
1. Performance: Full Table Scans (api/convex/internal/sync.ts:157-173)
typescript
This collects ALL scheduled functions on every batch upsert. At scale with thousands of scheduled jobs, this is O(n) and unbounded.
Fix options:
- A. Track
lastSummaryScheduledAt on trace record, skip if within cooldown window
- B. Use a separate
trace_summary_jobs table with proper indexes
- C. The job can self-dedupe: store
lastProcessedMessageCount and exit early if unchanged
Recommended (C) - Simplest, no system table dependency:
typescript
2. Single Upserts Never Summarize (api/convex/internal/sync.ts:279)
The single upsertMessage mutation doesn't call scheduleTriggeredJobs, only batch upserts do. This excludes all streaming/real-time traces.
Fix: Add scheduleTriggeredJobs call at line 279 in single upsert path.
3. Message Counting Full Collect (api/convex/internal/sync.ts:126-132)
typescript
This collects all messages to get a count. Use aggregate or store counter.
Fix: Use Convex's aggregation or maintain counter incrementally.
Product/API Issues
4. Title Override Without Metadata (api/convex/namespaces.ts:61,105, api/convex/services/feed.ts:29)
typescript
Downstream can't tell if title is AI-generated or user-provided.
Fix: Return both fields:
typescript
Testing Issues
5. Tests Hit Live OpenAI (api/tests/v1/post-process.test.ts:1-189)
No mocking - slow, requires API key, flaky, costs money.
Fix: Mock requestOpenAiText in tests. Consider:
- Using
msw or similar to intercept
- Dependency injection for AI client
- Environment flag for stub mode
Not Covered (Strategic)
| Concern | Recommendation |
|---|
| Backfill | Add manual runAllSummaries() admin endpoint + CLI command |
| Cost at scale | Track tokens per summary, add daily budget cap, expose metrics |
| User control | Add aiSummaryEnabled boolean on trace/namespace |
| Privacy | Add excludeFromAiProcessing flag, document data flow |
| Observability | Log failures to trace_summary_logs table with error context |
| Stale summaries | Store summaryGeneratedAt, show age in UI, allow manual refresh |
Priority Action Items
- P0 - Fix single upsert exclusion (add
scheduleTriggeredJobs call)
- P0 - Add job self-dedupe using
lastSummaryMessageCount field
- P1 - Mock OpenAI in tests
- P1 - Add
ai_title alongside title in API responses
- P2 - Add observability table for summary failures
- P2 - Add user opt-out flag
Want me to implement any of these fixes?