The design of a test before it runs: its hypothesis, setup, methodology, and success criteria.
An experiment plan is the designed test written down before any data exists: the hypothesis, the method, the sample, the success metric, and the rule that says when to stop. Its whole value comes from being fixed in advance. The moment you decide what counts as success after seeing the numbers, the test stops telling you anything you can trust.
The idea that a test must be planned before it runs is old, but its modern formalisation comes from the replication crisis in psychology and biomedicine. Researchers noticed that statistically significant results kept failing to reproduce, and a chief culprit was p-hacking: trying many analyses, then reporting only the one that crossed the significance threshold. A close relative, HARKing (hypothesising after the results are known), dressed up post-hoc patterns as predictions.
Pre-registration is the corrective. You write the plan, including the analysis, and lodge it with a date stamp before collecting data. Brian Nosek and colleagues at the Center for Open Science built the Open Science Framework to host these plans, and argued the case directly in "The preregistration revolution" (PNAS, 2018). Their core claim is that pre-registration sharpens the line between hypothesis generation and hypothesis testing, so a reader can tell which results were predicted and which were discovered after the fact.
A useful debate followed. Alison Ledgerwood pointed out that a pre-registration mixes two separable things: the prediction and the analysis plan. Confirming a registered prediction with a registered analysis is the strong case; everything else is exploration, which is valuable and honest as long as it is labelled as such. Product experimentation inherited this discipline wholesale. An A/B test with a metric and a stop rule chosen up front resists the same biases that pre-registration was built to defeat.
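Why a stop rule chosen up front matters can be seen in a small simulation. The sketch below runs A/A tests (both arms share the same true rate, so every "significant" result is a false positive) and compares analysing once at the planned end against checking at every interim look and stopping at the first significant one. The function names, the ten looks, the sample sizes, and the pooled two-proportion z-test are all illustrative assumptions, not part of any particular experimentation platform.

```python
import random
from statistics import NormalDist


def two_proportion_p_value(x_a, n_a, x_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (x_a + x_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0
    z = (x_b / n_b - x_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_sims=500, n_per_arm=1000, looks=10,
                         rate=0.38, alpha=0.05, seed=1):
    """Simulate A/A tests; return (fixed-horizon rate, peeking rate)."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        x_a = x_b = 0
        ever_significant = False
        significant = False
        for look in range(1, looks + 1):
            # Add one more batch of visitors to each arm.
            x_a += sum(rng.random() < rate for _ in range(step))
            x_b += sum(rng.random() < rate for _ in range(step))
            n = look * step
            significant = two_proportion_p_value(x_a, n, x_b, n) < alpha
            if significant:
                # Someone peeking would stop here and report a result.
                ever_significant = True
        # `significant` now holds the verdict at the final, planned look.
        fixed_hits += significant
        peeking_hits += ever_significant
    return fixed_hits / n_sims, peeking_hits / n_sims


fixed_rate, peeking_rate = false_positive_rates()
print(f"fixed horizon: {fixed_rate:.3f}, peeking: {peeking_rate:.3f}")
```

Because the final look is one of the looks a peeker sees, the peeking rate can never be lower than the fixed-horizon rate, and in practice it is several times higher. That gap is the bias the pre-committed stop rule removes.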
A growth team believes a shorter signup form will lift completions. The experiment plan states the hypothesis precisely: cutting the form from nine fields to four raises signup completion from a baseline of 38% by at least three percentage points. It fixes the method (a 50/50 split on new visitors), the primary metric (completion rate), and a power calculation that says roughly 14,000 visitors per arm are needed to detect that effect. It names a stop rule: run for two full weeks, no peeking-and-stopping when the line looks good. It also pre-commits a guardrail metric, downstream activation, so a cheap completion win that produces worse users gets caught. With all of that written before launch, the result is interpretable whichever way it lands.
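The power calculation in a plan like this can be sketched with the standard normal approximation for comparing two proportions. The function name and the default two-sided alpha of 0.05 and power of 0.8 are assumptions made for illustration; the example above does not state which settings it used, and the answer is sensitive to them.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_base, p_target, alpha=0.05, power=0.8):
    """Visitors needed per arm to detect p_base -> p_target.

    Uses the normal approximation for a two-sided test of two
    independent proportions with equal allocation (a 50/50 split).
    """
    if p_base == p_target:
        raise ValueError("p_base and p_target must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    delta = p_target - p_base
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Baseline 38% completion, minimum detectable lift of three points.
print(sample_size_per_arm(0.38, 0.41))
# Stricter settings demand more traffic.
print(sample_size_per_arm(0.38, 0.41, alpha=0.01, power=0.99))
```

With the defaults assumed here the approximation gives roughly 4,200 visitors per arm, lower than the figure quoted in the example; tightening alpha or raising power pushes the requirement up sharply. The plan should therefore record alpha and power next to the sample size, so the number can be re-derived later.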
In the Unified Product Graph, Experiment Plan sits in the validation region as the bridge between belief and evidence. A hypothesis reaches for it through hypothesis_requires_experiment_plan (Hypothesis requires Experiment Plan, causal), the plan points at its yardstick through experiment_plan_targets_metric (Experiment Plan targets Metric, cross-domain), and execution is recorded as a distinct node linked by experiment_plan_ran_as_experiment_run (Experiment Plan ran as Experiment Run, hierarchy). Growth work connects through growth_campaign_tests_via_experiment_plan (Growth Campaign tests via Experiment Plan, hierarchy). Keeping plan, run, and result as separate connected nodes is the structural version of pre-registration: the design is committed and queryable before any outcome is attached to it, so nobody can quietly rewrite the test to fit the answer.
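The four relationships described above can be written down as a small lookup table. This is only a sketch: the edge identifiers and endpoint names come from the text, while the dictionary layout and the two helper functions are hypothetical and do not reflect how the Unified Product Graph is actually stored or queried.

```python
# Edge id -> (source node type, target node type), as named in the text.
EDGES = {
    "hypothesis_requires_experiment_plan": ("Hypothesis", "Experiment Plan"),
    "experiment_plan_targets_metric": ("Experiment Plan", "Metric"),
    "experiment_plan_ran_as_experiment_run": ("Experiment Plan", "Experiment Run"),
    "growth_campaign_tests_via_experiment_plan": ("Growth Campaign", "Experiment Plan"),
}


def edges_from(node_type):
    """Edge ids whose source is the given node type (hypothetical helper)."""
    return sorted(e for e, (src, _) in EDGES.items() if src == node_type)


def edges_into(node_type):
    """Edge ids whose target is the given node type (hypothetical helper)."""
    return sorted(e for e, (_, dst) in EDGES.items() if dst == node_type)


print(edges_from("Experiment Plan"))
print(edges_into("Experiment Plan"))
```

Read this way, the plan has incoming edges from the things that motivate it (a hypothesis, a growth campaign) and outgoing edges to the things it commits to (a metric, a run), which is what makes it a bridge.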
Type-specific fields on BaseNode
method (string): Experimental method. Drives renderer and analysis tooling.
success_criteria (string): Plain-English description of "passing"
projected_reach (object): Projected reach: how many people the run is expected to touch (UPGAssessment)
projected_impact (object): Projected impact on the target metric (UPGAssessment)
confidence (object): Team confidence at plan-time (UPGAssessment, scale `confidence_5pt`)
cost_estimate (object): Cost estimate at plan-time (UPGAssessment)
planned_start_date (string): Planned start date
planned_end_date (string): Planned end date
id (string, required): Unique identifier (UUID)
type (NodeType, required): Discriminator for the entity type
title (string, required): Display name
description (string): Optional detailed description
status (string): Lifecycle status
tags (string[]): Freeform tags for filtering
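As a rough illustration, the field list above maps onto a record like the one below. Field names, types, and required flags follow the list; everything else is an assumption made for the sketch: the class name, the use of plain dicts for UPGAssessment objects (whose shape is not given here), ISO date strings, and the example values "experiment_plan" and "ab_test".

```python
import uuid
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional


@dataclass
class ExperimentPlanNode:
    # Required BaseNode fields.
    id: str      # UUID as a string
    type: str    # NodeType discriminator; the value used below is assumed
    title: str
    # Optional BaseNode fields.
    description: Optional[str] = None
    status: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    # Type-specific fields. UPGAssessment is modelled as a plain dict
    # because its structure is not specified in this reference.
    method: Optional[str] = None
    success_criteria: Optional[str] = None
    projected_reach: Optional[Dict[str, Any]] = None
    projected_impact: Optional[Dict[str, Any]] = None
    confidence: Optional[Dict[str, Any]] = None
    cost_estimate: Optional[Dict[str, Any]] = None
    planned_start_date: Optional[str] = None
    planned_end_date: Optional[str] = None


plan = ExperimentPlanNode(
    id=str(uuid.uuid4()),
    type="experiment_plan",  # hypothetical discriminator value
    title="Shorter signup form",
    method="ab_test",        # hypothetical method value
    success_criteria="Signup completion rises from 38% by at least 3 points",
    status="drafted",
    tags=["growth", "signup"],
)
print(plan.title, plan.status)
```

Only the three required BaseNode fields have no default, so a plan can be drafted early and filled in before launch.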
4 phases; initial phase: drafted
7 edge types connected to this entity.
hypothesis_requires_experiment_plan
experiment_has_plan
growth_campaign_tests_via_experiment_plan
pricing_strategy_tests_experiment_plan
experiment_plan_ran_as_experiment_run
experiment_plan_targets_metric
experiment_plan_targets_behavioral_segment