Power & Significance Tooling

Power and significance tooling calculates required sample sizes, estimates expected lift, and tracks guardrail metrics so that tests are properly designed and their results are statistically valid.

Overview

Power analysis tells you, before a test starts, how many samples you need to detect a difference of a given size. Significance testing tells you, once data has been collected, whether an observed difference is larger than chance alone would plausibly produce.

Sample Size Calculation

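Calculate the required sample size from the baseline conversion rate, the minimum detectable effect (MDE), the desired power, and the significance level: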

{
  "sample_size_calculation": {
    "baseline_conversion_rate": 0.02,
    "minimum_detectable_effect": 0.1,
    "power": 0.8,
    "significance_level": 0.05,
    "required_sample_size": 10000
  }
}
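The calculation behind these fields can be sketched with the standard normal-approximation formula for comparing two proportions. This is a minimal sketch, not the tool's confirmed method: the function name `required_sample_size` is hypothetical, and it assumes that `minimum_detectable_effect` is a relative lift (0.1 means +10% over baseline), that the test is two-sided, and that the result is a per-variant count.

```python
import math
from statistics import NormalDist


def required_sample_size(baseline_rate, relative_mde, power=0.8, alpha=0.05):
    """Per-variant sample size for a two-sided two-proportion z-test.

    Assumption: relative_mde is a relative lift (0.1 means +10% over baseline).
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # conversion rate under the alternative
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)  # quantile matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)  # sum of the two Bernoulli variances
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


n_per_variant = required_sample_size(0.02, 0.1)
print(n_per_variant)
```

Under these assumptions, the inputs from the example give roughly 80,700 users per variant, so the `required_sample_size` of 10000 shown in the example is best read as a placeholder value. Halving the MDE roughly quadruples the required sample, which is why the choice of MDE dominates test duration.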

Expected Lift Estimation

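Estimate the expected lift over the baseline metric, together with a confidence interval. In this example, expected_lift_percentage is given in percent (10), while confidence_interval appears to be given as fractions (0.08 to 0.12, that is, 8% to 12%):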

{
  "lift_estimation": {
    "baseline_metric": 0.02,
    "expected_lift_percentage": 10,
    "confidence_interval": [0.08, 0.12]
  }
}
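One way such a lift estimate and its interval could be derived from raw counts is sketched below. The function name `relative_lift_ci` and the counts are hypothetical, and the interval uses the log-ratio (Katz) normal approximation, which is a common choice rather than a method the source confirms.

```python
import math
from statistics import NormalDist


def relative_lift_ci(control_conv, control_n, treat_conv, treat_n, confidence=0.95):
    """Relative lift of treatment over control, with a confidence interval."""
    p_c = control_conv / control_n
    p_t = treat_conv / treat_n
    ratio = p_t / p_c
    # Katz log method: Var(log ratio) ~= 1/x_t - 1/n_t + 1/x_c - 1/n_c
    se_log = math.sqrt(
        1 / treat_conv - 1 / treat_n + 1 / control_conv - 1 / control_n
    )
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    low = math.exp(math.log(ratio) - z * se_log) - 1
    high = math.exp(math.log(ratio) + z * se_log) - 1
    return ratio - 1, (low, high)


# Hypothetical counts: 2.0% vs 2.2% conversion with 100,000 users per variant.
lift, (low, high) = relative_lift_ci(2000, 100_000, 2200, 100_000)
print(f"lift={lift:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

The interval is built on the log scale and transformed back, so it is asymmetric around the point estimate and can never imply a negative conversion rate.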

Guardrail Metrics

Define guardrail metrics, each with a threshold and the action to take when that threshold is crossed:

{
  "guardrail_metrics": [
    {
      "metric": "zero_results_rate",
      "threshold": 0.05,
      "action": "stop_test"
    },
    {
      "metric": "revenue_per_user",
      "threshold": -0.1,
      "action": "alert"
    }
  ]
}
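A guardrail check over this configuration might look like the sketch below. The source does not specify how thresholds are interpreted, so the semantics here are an assumption: a non-negative threshold is treated as an upper bound on the observed value, and a negative threshold as a lower bound (for example, a relative drop). The function name `evaluate_guardrails` and the observed values are hypothetical.

```python
def evaluate_guardrails(guardrails, observed):
    """Return (metric, action) pairs for every breached guardrail.

    Assumed semantics (not confirmed by the source): a non-negative
    threshold is an upper bound, a negative threshold is a lower bound.
    """
    triggered = []
    for rule in guardrails:
        value = observed.get(rule["metric"])
        if value is None:
            continue  # metric not reported yet, nothing to check
        threshold = rule["threshold"]
        breached = value > threshold if threshold >= 0 else value < threshold
        if breached:
            triggered.append((rule["metric"], rule["action"]))
    return triggered


guardrails = [
    {"metric": "zero_results_rate", "threshold": 0.05, "action": "stop_test"},
    {"metric": "revenue_per_user", "threshold": -0.1, "action": "alert"},
]
# Hypothetical observations: zero-results rate of 7%, revenue per user down 4%.
observed = {"zero_results_rate": 0.07, "revenue_per_user": -0.04}
print(evaluate_guardrails(guardrails, observed))
```

With these hypothetical observations only the `zero_results_rate` rule fires, because a 4% revenue drop stays within the assumed -10% bound.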

Best Practices

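  1. Calculate upfront: Determine sample sizes before starting a test, not after looking at the data
  2. Set a realistic MDE: Choose the smallest effect that would be worth acting on; a smaller MDE requires a larger sample
  3. Monitor guardrails: Watch for negative impacts while the test is running
  4. Use proper statistics: Apply a statistical test that matches the metric type (for example, a two-proportion test for conversion rates)
  5. Account for multiple comparisons: Adjust the significance level when comparing multiple variants against the same control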
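Practices 4 and 5 can be illustrated together. The sketch below runs a pooled two-proportion z-test for each variant against a shared control, then applies a Bonferroni correction. Both function names and all counts are hypothetical, and Bonferroni is only one of several possible adjustments (it is simple but conservative).

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each comparison as significant at alpha / number of comparisons."""
    adjusted_alpha = alpha / len(p_values)
    return [p < adjusted_alpha for p in p_values]


# Hypothetical counts: one control and two treatment variants.
control = (2000, 100_000)
variants = [(2200, 100_000), (2130, 100_000)]
p_values = [two_proportion_p_value(*control, *v) for v in variants]
print(bonferroni_significant(p_values))
```

In this hypothetical data the second variant has a p-value below 0.05 but above the adjusted level of 0.025, so it would count as significant only if the correction were skipped.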