how deterministic is the way we're deciding to not show the @frontend/components/HomeOnboardingWizard.tsx ? where can it break? how can we make it bulletproof?
It’s mostly deterministic but not bulletproof. The wizard shows unless all steps are “complete,” and completion is a mix of sync session data + async network state, so it can flicker or get stuck.
Here’s the current logic in frontend/components/HomeOnboardingWizard.tsx:
The wizard renders only when loaded && session is true; otherwise it returns null.
It hides once completedCount === resolvedSteps.length.
hasProfile: true (hardcoded, so this step never blocks).
hasDeviceIdentity: async poll to /v1/identities, gated on NEXT_PUBLIC_CONVEX_HTTP_URL and sessionUserId. Defaults to null (loading), or false on errors or a missing env var.
hasSharedTrace: derived from getAllTraces or getPublicTraces once the queries resolve; null while both are undefined.
Where it can break or be nondeterministic:
If NEXT_PUBLIC_CONVEX_HTTP_URL is missing, hasDeviceIdentity becomes false immediately. That makes the wizard never complete even if the user has device identities.
/v1/identities failures are treated as “no device identity,” so the wizard can show forever under transient failures.
hasSharedTrace stays null until at least one query resolves. If neither query ever resolves (auth issues, skip conditions), it stays loading and the wizard never completes.
identitySnapshot is cached per userId but only refreshes every 15s; if the user logs in and out quickly, there’s a short window where it can show stale state.
How to make it bulletproof (directional changes):
Move completion to a single backend source of truth.
Add a Convex query like getOnboardingStatus that returns { hasProfile, hasDeviceIdentity, hasSharedTrace } from authoritative data. One query, one result, no polling, no env dependency.
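As a rough sketch of the single-source-of-truth idea, the status could be computed by one pure function that a Convex query handler calls after loading the user's data. Everything named here (`OnboardingInputs`, `deviceIdentityCount`, `sharedTraceCount`, `computeOnboardingStatus`, and the extra `isComplete` field) is hypothetical; how the counts are read from the database depends on the schema and is left out.

```typescript
// Hypothetical inputs a backend query would load for the current user.
interface OnboardingInputs {
  hasProfile: boolean;
  deviceIdentityCount: number;
  sharedTraceCount: number;
}

// The three flags named above, plus an assumed convenience field.
interface OnboardingStatus {
  hasProfile: boolean;
  hasDeviceIdentity: boolean;
  hasSharedTrace: boolean;
  isComplete: boolean;
}

// Pure derivation: the same inputs always produce the same status.
function computeOnboardingStatus(inputs: OnboardingInputs): OnboardingStatus {
  const hasDeviceIdentity = inputs.deviceIdentityCount > 0;
  const hasSharedTrace = inputs.sharedTraceCount > 0;
  return {
    hasProfile: inputs.hasProfile,
    hasDeviceIdentity,
    hasSharedTrace,
    isComplete: inputs.hasProfile && hasDeviceIdentity && hasSharedTrace,
  };
}
```

Because the component would read one result instead of combining session data, a polled fetch, and two queries, there is only one loading state to reason about.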
Treat errors as “unknown” not “false.”
Right now errors turn into “incomplete,” which can show the wizard forever. Return null on error and decide on a fallback (e.g., show a retry state or skip the step after a timeout).
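One way to express “unknown, not false” is a tri-state mapping from the fetch outcome to the step state. This is only a sketch: the `IdentityFetchResult` shape and the `toStepState` helper are made up for illustration and are not taken from the component.

```typescript
// null means "unknown": still loading, or the request failed.
type StepState = boolean | null;

// Hypothetical outcome of the device identity lookup.
type IdentityFetchResult =
  | { kind: "ok"; identityCount: number }
  | { kind: "error"; message: string }
  | { kind: "pending" };

// Only a successful response may produce true or false.
function toStepState(result: IdentityFetchResult): StepState {
  switch (result.kind) {
    case "ok":
      return result.identityCount > 0;
    case "error":
    case "pending":
      return null;
  }
}
```

With this mapping, a transient failure leaves the step unresolved instead of marking it incomplete, and the UI can decide separately whether to retry or time out.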
Persist completion locally.
Once completed, set a durable flag (Convex or local storage) so the wizard won’t reappear due to transient backend unavailability.
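For the local storage variant, a minimal sketch might look like the following. The key format and the helper names (`KeyValueStore`, `markOnboardingComplete`, `wasOnboardingCompleted`) are assumptions; the store is passed in so the same code works with `window.localStorage` in the browser or an in-memory stub elsewhere.

```typescript
// Minimal subset of the Web Storage interface that the helpers rely on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key format, scoped per user so accounts don't share the flag.
const onboardingKey = (userId: string): string => `onboarding-complete:${userId}`;

function markOnboardingComplete(store: KeyValueStore, userId: string): void {
  try {
    store.setItem(onboardingKey(userId), "1");
  } catch {
    // Storage can be unavailable or full; failing to persist is not fatal.
  }
}

function wasOnboardingCompleted(store: KeyValueStore, userId: string): boolean {
  try {
    return store.getItem(onboardingKey(userId)) === "1";
  } catch {
    return false;
  }
}
```

The component could check `wasOnboardingCompleted` before evaluating any step and return null early when it is true, so a later backend outage cannot bring the wizard back.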
Remove the polling fetch to /v1/identities.
Use Convex for device identity, or subscribe to a real-time query so it’s updated deterministically without a timer.
Define a consistent loading policy.
E.g. “do not show the wizard until all steps are resolved (not null)” vs. “show with loading states.” Right now it can show with loading steps, which can look flaky.
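Either policy can be written down as one small decision function, which makes the behavior explicit and testable. The sketch below assumes each step is true, false, or null (unresolved); `StepStatus`, `LoadingPolicy`, and `shouldShowWizard` are illustrative names, not existing code.

```typescript
// true = done, false = not done, null = not yet resolved.
type StepStatus = boolean | null;

type LoadingPolicy = "hide-until-resolved" | "show-with-loading";

function shouldShowWizard(steps: StepStatus[], policy: LoadingPolicy): boolean {
  const anyUnresolved = steps.some((step) => step === null);
  if (anyUnresolved) {
    // The policy only matters while something is still unknown.
    return policy === "show-with-loading";
  }
  // Everything is resolved: show only if at least one step is incomplete.
  return steps.some((step) => step === false);
}
```

Under "hide-until-resolved" the wizard never appears and then disappears on load, at the cost of showing up slightly later for users who still need it.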
If you want, I can propose a minimal change path (e.g., add a getOnboardingStatus query and update this component to depend on that single result).