🤖 feat: Add PostHog experiments integration #1179
Add backend-first PostHog feature flag evaluation for remote-controlled experiments, starting with Post-Compaction Context.
Backend (ExperimentsService):
- Evaluate PostHog feature flags via posthog-node
- Disk cache (~/.mux/feature_flags.json) with TTL-based refresh
- Fail-closed behavior (unknown = disabled)
- Disable calls when telemetry is off
Telemetry enrichment (TelemetryService):
- setFeatureFlagVariant() adds $feature/<flagKey> to all events
- Enables variant breakdown in PostHog analytics
oRPC layer:
- experiments.getAll: Get all experiment values
- experiments.reload: Force refresh from PostHog
Frontend (ExperimentsContext):
- Fetch remote experiments on mount
- Priority: remote PostHog > local toggle > default
- Read-only UI when experiment is remote-controlled
Backend authoritative gating (WorkspaceService):
- sendMessage() resolves experiment from PostHog when enabled
- list() decides includePostCompaction based on experiment
Type consolidation:
- ExperimentValueSchema (Zod) is single source of truth
- ExperimentValue type derived via z.infer in types.ts
Bug fixes (unrelated):
- Fixed backgroundProcessManager exit race condition
- Fixed telemetry client Node.js compatibility
- Relaxed timing test threshold in authMiddleware
Change-Id: I346c924324a5f59cb3349614382dc8a5276e5e1e
Signed-off-by: Thomas Kosiewski <[email protected]>
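For readers who want a concrete picture of the telemetry enrichment described in this commit, here is a minimal sketch of how a `$feature/<flagKey>` property could be merged into every event. The class name `TelemetrySketch`, the `buildProperties` method, and the `setFeatureFlagVariant` signature are assumptions made for illustration; only the method name and the property naming come from the commit message.

```typescript
// Hypothetical types: not taken from the Mux codebase.
type FlagVariant = string | boolean;
type EventProperties = Record<string, string | number | boolean>;

class TelemetrySketch {
  // Keyed by the full "$feature/<flagKey>" property name.
  private featureFlagProperties: Record<string, FlagVariant> = {};

  // Record (or clear, when variant is null) the variant for one flag.
  setFeatureFlagVariant(flagKey: string, variant: FlagVariant | null): void {
    const propertyName = `$feature/${flagKey}`;
    if (variant === null) {
      delete this.featureFlagProperties[propertyName];
    } else {
      this.featureFlagProperties[propertyName] = variant;
    }
  }

  // Merge the recorded variants into an event's own properties.
  // Flag properties are spread last so an event cannot overwrite them.
  buildProperties(eventProperties: EventProperties): EventProperties {
    return { ...eventProperties, ...this.featureFlagProperties };
  }
}

const telemetry = new TelemetrySketch();
telemetry.setFeatureFlagVariant("post-compaction-context", "test");
const enriched = telemetry.buildProperties({ had_file_diffs: true });
```

Spreading the flag properties last is a design choice of this sketch, not something the commit states.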
Allow per-experiment control over whether users can override remote PostHog assignments and whether experiments appear in Settings.
ExperimentDefinition changes:
- userOverridable?: boolean - when true, local toggle takes precedence
- showInSettings?: boolean - when false, hide from Settings UI
Resolution priority (when userOverridable=true):
1. Local localStorage toggle (user's explicit choice)
2. Remote PostHog assignment
3. Default (enabledByDefault)
Implementation:
- experiments.ts: Add new optional fields, set POST_COMPACTION_CONTEXT as userOverridable=true
- ExperimentsContext: hasLocalOverride() helper, updated useExperimentValue() and useAllExperiments() to respect userOverridable
- ExperimentsSection: Filter by showInSettings, enable toggle when canOverride
- WorkspaceService: Respect userOverridable in both list() and sendMessage()
Change-Id: I3afc8514c74151b8b72991aa13ab98296cfd19bb
Signed-off-by: Thomas Kosiewski <[email protected]>
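The resolution priority in this commit can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in the PR: `resolveExperiment` and `remoteToEnabled` are hypothetical names, the non-overridable ordering (remote > local > default) is taken from the earlier commit, and treating any string variant other than "test" as disabled is an assumption.

```typescript
// Hypothetical shapes; only the field names enabledByDefault and
// userOverridable come from the commit message.
interface ExperimentDefinition {
  enabledByDefault: boolean;
  userOverridable?: boolean;
}
type RemoteValue = string | boolean | null;

// Map a remote assignment to enabled/disabled; null means "unknown".
// Assumption: any string variant other than "test" counts as disabled.
function remoteToEnabled(remote: RemoteValue): boolean | undefined {
  if (remote === null) return undefined;
  if (typeof remote === "boolean") return remote;
  return remote === "test";
}

function resolveExperiment(
  def: ExperimentDefinition,
  localOverride: boolean | undefined, // undefined = user never toggled
  remote: RemoteValue
): boolean {
  const remoteEnabled = remoteToEnabled(remote);
  // userOverridable: local > remote > default; otherwise remote > local > default.
  const ordered =
    def.userOverridable === true
      ? [localOverride, remoteEnabled]
      : [remoteEnabled, localOverride];
  for (const candidate of ordered) {
    if (candidate !== undefined) return candidate;
  }
  return def.enabledByDefault;
}
```

Keeping `undefined` distinct from `false` is what lets "the user never chose" fall through to the remote assignment instead of being read as an explicit opt-out.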
- Makefile: MUX_ENABLE_TELEMETRY_IN_DEV=1 now sufficient (no need to also set MUX_DISABLE_TELEMETRY=0)
- ExperimentsSection: hide non-overridable experiments, remove PostHog info line
- Add experiment_overridden telemetry event with experimentId, assignedVariant, userChoice
- Update oRPC schema, payload types, tracking functions, and useTelemetry hook
Signed-off-by: Thomas Kosiewski <[email protected]>
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking: `high`_
Change-Id: I3582117b82c1025bcfd94d1361bb11c46cb8ff9e
Previously, getSendOptionsFromStorage() (used by resume/creation flows) always passed the localStorage default to the backend, which treated any non-undefined value as an explicit user override for userOverridable experiments.
Fix: isExperimentEnabled() now returns undefined for userOverridable experiments when user hasn't explicitly set a localStorage value. The backend already handles undefined correctly by falling through to PostHog assignment.
Also addresses code review feedback:
- Move telemetryService property declaration to top of ExperimentsService class
- Add comment explaining the "undefined" string check in hasLocalOverride
- Add docstring for getRemoteExperimentEnabled and move near other helpers
- Document MUX_ENABLE_TELEMETRY_IN_DEV env var in Makefile
Change-Id: I15e5360f7461cd62ad347fbb2e32b0e8dc4b873d
Signed-off-by: Thomas Kosiewski <[email protected]>
- useSendMessageOptions now passes undefined unless user explicitly overrides
- add useExperimentOverrideValue() helper
- harden localStorage parsing + add tests
Change-Id: I33d9f1c8d3bba4f083132c4f645c76328327726a
Signed-off-by: Thomas Kosiewski <[email protected]>
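To illustrate what hardened localStorage parsing might involve, the sketch below reads a raw stored string into a tri-state value: `true`, `false`, or `undefined` for "no explicit override". It assumes the toggle is stored as JSON; `parseLocalOverride` is a hypothetical helper, and this `hasLocalOverride` only borrows the name mentioned in the commits, not the actual implementation.

```typescript
// Assumption: overrides are stored as JSON booleans ("true" / "false").
// Anything else is treated as "no explicit override".
function parseLocalOverride(raw: string | null): boolean | undefined {
  // A missing key, or the literal string "undefined" left behind by an
  // earlier write of an undefined value, both mean no override.
  if (raw === null || raw === "undefined") return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    return typeof parsed === "boolean" ? parsed : undefined;
  } catch {
    // Malformed JSON must not crash the caller or count as an override.
    return undefined;
  }
}

function hasLocalOverride(raw: string | null): boolean {
  return parseLocalOverride(raw) !== undefined;
}
```

With this shape, a caller can pass the parsed value straight through to the backend, which receives `undefined` unless the user made an explicit choice.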
ExperimentsService can return { source: 'cache', value: null } on first launch while
PostHog refresh runs in the background.
The renderer previously fetched experiments.getAll only once, so remote variants
never became visible until manual reload.
Fix:
- Poll experiments.getAll with bounded backoff while any values are pending
- Add a regression test for ExperimentsProvider
Change-Id: If9533ee2ad0430729600275aedcf9b1939ec612d
Signed-off-by: Thomas Kosiewski <[email protected]>
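A rough sketch of the polling fix described in this commit, assuming a value is "pending" when it has `source: 'cache'` and `value: null`. The function names, the 250 ms base delay, the 5 s cap, and the attempt limit are all illustrative assumptions; the fetch and sleep functions are injected so the loop stays testable.

```typescript
type ExperimentSource = "posthog" | "cache" | "disabled";
interface ExperimentState {
  value: string | boolean | null;
  source: ExperimentSource;
}
type ExperimentMap = Record<string, ExperimentState>;

// Pending = served from cache with no value yet (refresh still running).
function hasPending(values: ExperimentMap): boolean {
  return Object.keys(values).some(
    (id) => values[id].source === "cache" && values[id].value === null
  );
}

// Exponential backoff, capped at maxMs. Defaults are assumptions.
function nextDelayMs(attempt: number, baseMs = 250, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Poll until nothing is pending or the attempt budget is spent.
async function pollUntilResolved(
  fetchAll: () => Promise<ExperimentMap>,
  onUpdate: (values: ExperimentMap) => void,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 8
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const values = await fetchAll();
    onUpdate(values);
    if (!hasPending(values)) return;
    await sleep(nextDelayMs(attempt));
  }
}
```

Bounding both the delay and the number of attempts keeps the renderer from polling forever when PostHog stays unreachable.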
Add backend-first PostHog feature flag evaluation for remote-controlled experiments, starting with Post-Compaction Context.
Changes
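Backend (ExperimentsService)
- Evaluate PostHog feature flags via `posthog-node`
- Disk cache (`~/.mux/feature_flags.json`) with TTL-based refresh
- Fail-closed behavior (unknown = disabled)
- Disable calls when telemetry is off
Telemetry enrichment (TelemetryService)
- `setFeatureFlagVariant()` adds `$feature/<flagKey>` to all events
- Enables variant breakdown in PostHog analytics
oRPC layer
- `experiments.getAll`: Get all experiment values
- `experiments.reload`: Force refresh from PostHog
Frontend (ExperimentsContext)
- Fetch remote experiments on mount
- Priority: remote PostHog > local toggle > default
- Read-only UI when experiment is remote-controlled
Backend authoritative gating (WorkspaceService)
- `sendMessage()` resolves experiment from PostHog when enabled
- `list()` decides `includePostCompaction` based on experiment
Type consolidation
- `ExperimentValueSchema` (Zod) is single source of truth
- `ExperimentValue` type derived via `z.infer` in types.ts
Bug fixes (unrelated)
- Fixed backgroundProcessManager exit race condition
- Fixed telemetry client Node.js compatibility
- Relaxed timing test threshold in authMiddleware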
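The TTL-based cache and fail-closed behavior listed under Backend (ExperimentsService) could look roughly like the sketch below. It is an in-memory stand-in for the on-disk cache, and every name in it (`FlagCacheSketch`, `needsRefresh`, `isEnabledFailClosed`) is hypothetical. Serving a stale value while signalling that a refresh is due is an assumption about the design, not something the PR states.

```typescript
type FlagValue = string | boolean | null;

interface CacheEntry {
  value: FlagValue;
  fetchedAtMs: number;
}

interface CacheLookup {
  value: FlagValue;      // null when nothing is cached
  needsRefresh: boolean; // true when missing or older than the TTL
}

class FlagCacheSketch {
  private readonly ttlMs: number;
  private entries: Record<string, CacheEntry> = {};

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  set(flagKey: string, value: FlagValue, nowMs: number): void {
    this.entries[flagKey] = { value, fetchedAtMs: nowMs };
  }

  get(flagKey: string, nowMs: number): CacheLookup {
    const entry = this.entries[flagKey];
    if (entry === undefined) return { value: null, needsRefresh: true };
    return {
      value: entry.value,
      needsRefresh: nowMs - entry.fetchedAtMs >= this.ttlMs,
    };
  }
}

// Fail closed: only an explicit true or "test" enables the experiment;
// "control", false, null, and unknown variants all mean disabled.
function isEnabledFailClosed(value: FlagValue): boolean {
  return value === true || value === "test";
}
```

Passing `nowMs` in explicitly, instead of calling `Date.now()` inside the class, keeps expiry behavior easy to test.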
📋 Implementation Plan
PostHog early access, feature flags, and experiments (Mux)
Goals
`Post-Compaction Context`).
Recommendation (architecture)
✅ Approach A (recommended): Backend-owned flag/experiment evaluation + oRPC exposure
Net new product LoC (est.): ~250–450
`posthog-node` client to:
- `$feature_flag_called` exposure events
- `$feature/<flagKey>` properties to telemetry events so PostHog can break down metrics by variant
Why this fits Mux:
- `Post-Compaction Context` experiment gates backend behavior (attachment injection), so backend must know the assignment anyway.
Alternative B: Renderer uses `posthog-js` for flags/experiments (keep backend telemetry)
Net new product LoC (est.): ~350–650
Pros:
Cons:
Proposed flow (Approach A)
Implementation plan
1) Backend: add a PostHog-backed experiments/flags service
- `src/node/services/experimentsService.ts` (name TBD) that depends on:
  - `TelemetryService` (for `distinctId` + access to a `PostHog` client), or
  - `PostHogClientService` if you want to refactor TelemetryService into a reusable PostHog wrapper.
Core responsibilities:
- `getDistinctId()` (or expose from TelemetryService) – single stable identity used for both:
- `getExperimentVariant(experimentId: ExperimentId): Promise<string | boolean | null>`
  - `ExperimentId` → PostHog feature flag key. (Conveniently, current `EXPERIMENT_IDS.*` already look like flag keys.)
  - `posthog.getFeatureFlag(flagKey, distinctId)`.
  - `$feature_flag_called` (exposure) events when appropriate.
- `isExperimentEnabled(experimentId: ExperimentId): boolean`
  - `post-compaction-context`:
    - `"test"` / `true` → enabled
    - `"control"` / `false` / `null` → disabled
Offline + startup behavior:
- `~/.mux/feature_flags.json` (or inside muxHome near `telemetry_id`).
Feature-flag enablement rules:
- `MUX_DISABLE_TELEMETRY=1`, CI, test, etc.), do not call PostHog for flags.
- `null` / “unknown” from the service.
- `MUX_ENABLE_TELEMETRY_IN_DEV=1`
2) Backend: attach experiment/flag info to telemetry events
PostHog’s docs explicitly require this for server-side capture.
Implement one of these (recommend #1):
1. Manual property injection (preferred):
  - `$feature/<flagKey>` properties to captured events.
  - `TelemetryService.getBaseProperties()` merges in a stable `this.featureFlagProperties` map populated by `ExperimentsService`.
  - `$feature/post-compaction-context: 'control' | 'test'` (or boolean) depending on how you configure variants.
2. `sendFeatureFlags: true` on `posthog.capture()`
Also add:
3) oRPC: expose experiment state to the renderer
Add a new oRPC namespace, e.g. `experiments`:
- `experiments.getAll` → returns `Record<ExperimentId, { value: string | boolean | null; source: 'posthog' | 'cache' | 'disabled' }>`
- `experiments.reload` (optional) → forces a refresh, useful for debugging the Settings page.
Update:
- `src/common/orpc/schemas/api.ts` to include the new endpoints.
- `src/node/orpc/router.ts` to wire handlers.
- `src/node/orpc/context.ts` + `ServiceContainer` to register the new service.
4) Frontend: update ExperimentsContext + Settings → Experiments
Target behavior:
Concrete steps:
- `useRemoteExperiments()` → fetches `experiments.getAll` once and stores in context.
- `useExperimentValue(EXPERIMENT_IDS.POST_COMPACTION_CONTEXT)` to resolve in this order:
  - `enabledByDefault`)
UI changes (`ExperimentsSection.tsx`):
- `Switch` (or replace with a badge) and show `Variant: control/test`.
5) Wire `Post-Compaction Context` gating to PostHog
Backend gating (authoritative):
- `AgentSession` (or `WorkspaceService.sendMessage`), compute:
  - `postCompactionContextEnabled = experimentsService.isEnabled('post-compaction-context')`
- `options?.experiments?.postCompactionContext`.
Frontend gating (UI):
- `includePostCompaction` in `workspace.list`
Recommended (practical) simplification:
- `workspace.list({ includePostCompaction })` so that when `includePostCompaction` is omitted, the backend decides based on experiment state.
- `WorkspaceProvider` loads metadata before `ExperimentsProvider` mounts today.
6) Add minimal analytics events for the experiment (optional but high-value)
To get actionable insights beyond “did users click it,” add 1–2 low-cardinality events:
- `compaction_performed`
  - `had_file_diffs: boolean`, `diff_count_b2: number`
- `post_compaction_context_injected`
  - `plan_included: boolean`, `diff_count_b2: number`
All properties must remain privacy-safe (counts + booleans only).
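One way to keep such properties low-cardinality is to bucket raw counts before sending them. The sketch below assumes the `_b2` suffix denotes a base-2 bucket (the count rounded up to the next power of two), which the plan does not spell out. `bucketBase2` and the two builder functions are hypothetical helpers; only the event and property names come from the list above.

```typescript
// Assumption: "_b2" means a base-2 bucket. 0 stays 0; otherwise the count
// is rounded up to the nearest power of two (1, 2, 4, 8, ...).
function bucketBase2(count: number): number {
  if (count <= 0) return 0;
  let bucket = 1;
  while (bucket < count) bucket *= 2;
  return bucket;
}

interface CompactionPerformedProps {
  had_file_diffs: boolean;
  diff_count_b2: number;
}

interface PostCompactionContextInjectedProps {
  plan_included: boolean;
  diff_count_b2: number;
}

// Builders take raw values and emit only booleans and bucketed counts.
function compactionPerformed(diffCount: number): CompactionPerformedProps {
  return { had_file_diffs: diffCount > 0, diff_count_b2: bucketBase2(diffCount) };
}

function postCompactionContextInjected(
  planIncluded: boolean,
  diffCount: number
): PostCompactionContextInjectedProps {
  return { plan_included: planIncluded, diff_count_b2: bucketBase2(diffCount) };
}
```

Funnelling every event through a typed builder makes it harder to attach a file path or other free-form string by accident.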
7) Tests
- `ExperimentsService`:
- `TelemetryService`:
  - `$feature/post-compaction-context` when cached/available
PostHog provisioning (via MCP) ✅
Since you’ve configured the PostHog MCP server, we can create the flag + experiment as part of this integration (in Exec mode) rather than doing it manually in the PostHog UI.
Select the target PostHog project
- `posthog_organization-details-get` (confirm current org)
- `posthog_projects-get` (pick `projectId`)
- `posthog_switch-project({ projectId })` (if needed)
Create (or reuse) the feature flag
- `post-compaction-context`
- `posthog_feature-flag-get-all` (search for key)
- `posthog_feature-flag-get-definition({ flagKey: 'post-compaction-context' })`
- `posthog_experiment-create` create/update the underlying flag (since experiments want explicit variants).
- `posthog_create-feature-flag(...)` (boolean-only) and then upgrade to variants via the experiment.
Create (or reuse) the experiment “Post-Compaction Context”
- `posthog_experiment-get-all` (avoid duplicates for the same `feature_flag_key`)
- `posthog_event-definitions-list` (look for `error_occurred`, `stream_completed`, `message_sent`, etc.)
- `posthog_experiment-create({ feature_flag_key: 'post-compaction-context', variants: [{ key: 'control', rollout_percentage: 50 }, { key: 'test', rollout_percentage: 50 }], ... })`
- `error_occurred`
- `stream_completed`, mean `message_sent`
- `post_compaction_context_injected` as a secondary metric (sanity-check feature usage).
Launch / stop the experiment
- `posthog_experiment-update({ experimentId, data: { launch: true } })`
- `posthog_experiment-update({ experimentId, data: { conclude: 'won' | 'lost' | 'inconclusive' | 'stopped_early' | 'invalid', conclusion_comment } })`
Manual fallback (if MCP is unavailable)
- `post-compaction-context`.
- `control` vs `test`).
- `error_occurred`, `stream_completed`, or `post_compaction_context_injected` if implemented).
Notes on Early Access Feature Management
PostHog’s docs state Early Access management is currently only available in the JavaScript Web SDK.
If we want “users opt into betas” inside Mux Settings:
- `posthog-js` in the renderer specifically for early access APIs, OR
Given your immediate goal is AB testing `Post-Compaction Context`, I’d start with backend feature flags/experiments first.
_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking: `high`_