A/B Testing
Compare variants to measure impact
A/B testing splits users between variants to measure which performs better. Feature flags make this easy: configure the split in Flipswitch, then measure the results in your analytics tool.
The Pattern
50% of users -> variant A (control)
50% of users -> variant B (treatment)
Measure: conversion rate, engagement, revenue
Decide: ship B, ship A, or iterate
Implementation
1. Create a String Flag
For A/B tests, use a string flag to identify variants:
- Key: checkout-experiment
- Type: String
- Variants:
  - control - Current checkout
  - treatment-a - New checkout design
  - treatment-b - New checkout with upsells
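To keep the strings used in evaluation and analytics in sync, it can help to define the flag key and variant names in one place in your codebase. A minimal sketch (a convention suggestion, not something Flipswitch requires):

// Central definition of the experiment's flag key and variant names,
// so evaluation code and analytics events share the same strings.
export const CHECKOUT_EXPERIMENT = 'checkout-experiment';

export const CHECKOUT_VARIANTS = {
  CONTROL: 'control',          // Current checkout
  TREATMENT_A: 'treatment-a',  // New checkout design
  TREATMENT_B: 'treatment-b',  // New checkout with upsells
} as const;

export type CheckoutVariant =
  (typeof CHECKOUT_VARIANTS)[keyof typeof CHECKOUT_VARIANTS];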
2. Configure the Split
Set up a gradual rollout:
33% -> control
33% -> treatment-a
34% -> treatment-b

Users are assigned deterministically by targetingKey - the same user always sees the same variant.
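You don't implement this assignment yourself; Flipswitch does it when it evaluates the flag. The sketch below only illustrates the idea of hash-based bucketing, not the actual algorithm:

import { createHash } from 'node:crypto';

// Illustration of deterministic bucketing: hash the targetingKey into a
// bucket from 0-99, then map bucket ranges to variants. The same key always
// hashes to the same bucket, so assignment is stable across sessions.
function assignVariant(targetingKey: string): string {
  const hash = createHash('sha256')
    .update(`checkout-experiment:${targetingKey}`)
    .digest();
  const bucket = hash.readUInt32BE(0) % 100;

  if (bucket < 33) return 'control';      // 0-32  -> 33%
  if (bucket < 66) return 'treatment-a';  // 33-65 -> 33%
  return 'treatment-b';                   // 66-99 -> 34%
}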
3. Use in Code
JavaScript:

const variant = await client.getStringValue('checkout-experiment', 'control', {
  targetingKey: userId
});

// Track which variant for analytics
analytics.track('checkout_started', {
  experiment: 'checkout-experiment',
  variant: variant,
  userId: userId
});

switch (variant) {
  case 'treatment-a':
    return <NewCheckout />;
  case 'treatment-b':
    return <NewCheckoutWithUpsells />;
  default:
    return <CurrentCheckout />;
}

Java:

MutableContext context = new MutableContext(userId);
String variant = client.getStringValue("checkout-experiment", "control", context);

// Track which variant for analytics
analytics.track("checkout_started", Map.of(
    "experiment", "checkout-experiment",
    "variant", variant,
    "userId", userId
));

switch (variant) {
    case "treatment-a":
        return newCheckout();
    case "treatment-b":
        return newCheckoutWithUpsells();
    default:
        return currentCheckout();
}

Python:

context = EvaluationContext(
    targeting_key=user_id,
)

variant = client.get_string_value("checkout-experiment", "control", context)

# Track which variant for analytics
analytics.track("checkout_started", {
    "experiment": "checkout-experiment",
    "variant": variant,
    "user_id": user_id
})

if variant == "treatment-a":
    return new_checkout()
elif variant == "treatment-b":
    return new_checkout_with_upsells()
else:
    return current_checkout()

Go:

evalCtx := openfeature.NewEvaluationContext(userID, nil)
variant, _ := client.StringValue(ctx, "checkout-experiment", "control", evalCtx)

// Track which variant for analytics
analytics.Track("checkout_started", map[string]interface{}{
    "experiment": "checkout-experiment",
    "variant":    variant,
    "userId":     userID,
})

switch variant {
case "treatment-a":
    return newCheckout()
case "treatment-b":
    return newCheckoutWithUpsells()
default:
    return currentCheckout()
}

4. Track Conversions
Track outcomes with the variant:
JavaScript:

analytics.track('purchase_completed', {
  experiment: 'checkout-experiment',
  variant: variant,
  revenue: order.total,
  userId: userId
});

Java:

analytics.track("purchase_completed", Map.of(
"experiment", "checkout-experiment",
"variant", variant,
"revenue", order.getTotal(),
"userId", userId
));analytics.track("purchase_completed", {
"experiment": "checkout-experiment",
"variant": variant,
"revenue": order.total,
"user_id": user_id
})analytics.Track("purchase_completed", map[string]interface{}{
"experiment": "checkout-experiment",
"variant": variant,
"revenue": order.Total,
"userId": userID,
})5. Analyze Results
In your analytics tool, compare metrics by variant:
| Variant | Users | Conversions | Rate |
|---|---|---|---|
| control | 10,000 | 320 | 3.2% |
| treatment-a | 10,000 | 380 | 3.8% |
| treatment-b | 10,000 | 350 | 3.5% |
Treatment-a wins. Roll it out to 100%.
Statistical Validity
Flipswitch ensures even distribution, but you're responsible for statistical significance:
Sample size. Run the test long enough to collect meaningful data. A 0.5% difference with 100 users means nothing.
Consistent assignment. Users see the same variant across sessions because assignment is based on hashing their targetingKey.
One variable. Change one thing per test. If treatment-a has a new design AND new copy AND new flow, you won't know what worked.
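If nothing in your analytics stack computes significance for you, a two-proportion z-test is a quick sanity check. A minimal sketch, using the conversion counts from the example table above:

// Two-proportion z-test: is treatment-a's conversion rate significantly
// different from control's?
function zScore(convA: number, usersA: number, convB: number, usersB: number): number {
  const pA = convA / usersA;
  const pB = convB / usersB;
  const pooled = (convA + convB) / (usersA + usersB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / usersA + 1 / usersB));
  return (pB - pA) / se;
}

const z = zScore(320, 10_000, 380, 10_000); // control vs treatment-a
// |z| > 1.96 corresponds to p < 0.05 on a two-sided test
console.log(z.toFixed(2), Math.abs(z) > 1.96 ? 'significant' : 'not significant');
// -> "2.31 significant"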
Excluding Users
Sometimes you want to exclude certain users from experiments:
Rule 1: IF segment IS "internal" THEN control
Rule 2: IF segment IS "enterprise" THEN control
Default: 33% control, 33% treatment-a, 34% treatment-b

Internal users and enterprise customers always see the stable version.
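For rules like these to match, the evaluation context has to carry the attribute they reference. Assuming your user model has a segment field (the attribute name is up to you), the JavaScript call from step 3 would pass it alongside the targetingKey:

// Include the attributes your targeting rules reference.
// `user.segment` is an assumed field on your own user model.
const variant = await client.getStringValue('checkout-experiment', 'control', {
  targetingKey: user.id,
  segment: user.segment  // e.g. 'internal' or 'enterprise'
});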
Ending an Experiment
When you've decided on a winner:
- Set the winning variant to 100%
- Wait for any cached values to expire
- Remove the experiment code
- Delete the flag
JavaScript:

// Before cleanup
const variant = await client.getStringValue('checkout-experiment', 'control', {
  targetingKey: userId
});

// After cleanup - winner was treatment-a
return <NewCheckout />;

Java:

// Before cleanup
MutableContext context = new MutableContext(userId);
String variant = client.getStringValue("checkout-experiment", "control", context);

// After cleanup - winner was treatment-a
return newCheckout();

Python:

# Before cleanup
context = EvaluationContext(targeting_key=user_id)
variant = client.get_string_value("checkout-experiment", "control", context)

# After cleanup - winner was treatment-a
return new_checkout()

Go:

// Before cleanup
evalCtx := openfeature.NewEvaluationContext(userID, nil)
variant, _ := client.StringValue(ctx, "checkout-experiment", "control", evalCtx)

// After cleanup - winner was treatment-a
return newCheckout()

Multi-Armed Bandit
If your goal is ongoing optimization rather than a clean one-shot experiment, adjust the percentages as you learn:
Week 1: 33% control, 33% treatment-a, 34% treatment-b
Week 2: 20% control, 50% treatment-a, 30% treatment-b
Week 3: 10% control, 70% treatment-a, 20% treatment-b

Shift traffic toward the winning variant while still exploring alternatives.
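How you pick the new percentages is up to you. One simple approach (a sketch, not a Flipswitch feature) is to allocate traffic proportionally to each variant's observed conversion rate while keeping a floor so every variant still gets some exposure:

// Reallocate traffic toward better-converting variants, with a minimum
// share per variant so weaker ones keep collecting data.
function nextWeights(rates: Record<string, number>, floor = 0.1): Record<string, number> {
  const variants = Object.keys(rates);
  const total = variants.reduce((sum, v) => sum + rates[v], 0);
  const spread = 1 - floor * variants.length; // traffic left after the floors

  const weights: Record<string, number> = {};
  for (const v of variants) {
    weights[v] = floor + spread * (rates[v] / total);
  }
  return weights;
}

// Using the rates from the results table above:
nextWeights({ control: 0.032, 'treatment-a': 0.038, 'treatment-b': 0.035 });
// -> roughly { control: 0.31, 'treatment-a': 0.35, 'treatment-b': 0.33 }

Real bandit algorithms such as Thompson sampling shift traffic more aggressively as evidence accumulates; the point here is that the weights come from measured performance rather than a fixed schedule.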