A single execution of an experiment plan: the actual conditions, observations, and raw results.
An experiment run is a single execution of an experiment plan: one A/B test that actually shipped to traffic, one fake-door that ran for a fortnight, one cohort you followed for thirty days. The plan is the design; the run is the instance that produced data on specific dates with specific users. Keeping the two apart is what stops a team rerunning a test until the numbers flatter them.
The discipline of running product experiments to settle questions comes from the lean and growth traditions. Eric Ries built *The Lean Startup* (2011) around the Build-Measure-Learn loop and the idea of validated learning, where a value hypothesis and a growth hypothesis are tested against real customer behaviour, not asserted from the desk. A run is the "measure" leg of that loop made concrete: the loop turns once per run.
The statistical machinery is older and arrived with a warning attached. Online controlled experiments inherited the frequentist test, and with it the peeking problem. Evan Miller's widely read "How Not To Run an A/B Test" showed that checking results repeatedly and stopping the moment significance appears inflates the false-positive rate far beyond the nominal five per cent. Ramesh Johari and colleagues formalised the fix in the 2017 KDD paper "Peeking at A/B Tests", introducing always-valid p-values and sequential tests that stay honest under continuous monitoring. The lesson reshaped how a run is defined: a run has a pre-registered sample size or a sequential stopping rule, and analysing it is part of the run, not a separate liberty the analyst takes whenever the dashboard looks good.
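The inflation from peeking is easy to reproduce in a small A/A simulation. The sketch below is illustrative only and is not taken from either cited source: both arms draw from the same distribution, so every "significant" result is a false positive. The function name, the ten equally spaced looks, the batch size, and the known-variance z-test are all assumptions chosen to keep the example short.

```python
import math
import random


def simulate_false_positive_rates(n_sims=500, looks=10, batch=100,
                                  z_crit=1.96, seed=7):
    """A/A simulation: both arms are N(0, 1), so any 'win' is a false positive."""
    rng = random.Random(seed)
    peeking_hits = 0  # some interim or final look crossed the threshold
    fixed_hits = 0    # only the single look at the planned sample size counts
    for _ in range(n_sims):
        sum_a = sum_b = 0.0
        n = 0
        z = 0.0
        crossed_any = False
        for _ in range(looks):
            for _ in range(batch):
                sum_a += rng.gauss(0.0, 1.0)
                sum_b += rng.gauss(0.0, 1.0)
            n += batch
            # z-statistic for a difference in means with known unit variance
            z = (sum_a / n - sum_b / n) / math.sqrt(2.0 / n)
            if abs(z) > z_crit:
                crossed_any = True
        peeking_hits += int(crossed_any)
        fixed_hits += int(abs(z) > z_crit)  # z now holds the final look
    return peeking_hits / n_sims, fixed_hits / n_sims


peeking_rate, fixed_rate = simulate_false_positive_rates()
print(f"stop at first significant look: {peeking_rate:.3f}")
print(f"single test at planned size:    {fixed_rate:.3f}")
```

The single-test rate should sit near the nominal five per cent, while the stop-at-first-crossing rate comes out well above it. The gap widens as the number of looks grows.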
Where the field landed is a clean separation of three things. The plan states the hypothesis, the metric, the minimum detectable effect, and the stopping rule. The run executes that plan once. The result is what the run yields, kept distinct so that a failed run still counts as evidence and a single plan can be run more than once without anyone pretending the reruns were one test.
A team believes a shorter signup form will lift completion. The experiment plan fixes the success metric at signup-completion rate, sets a minimum detectable effect of two percentage points, and a sample size of 40,000 visitors per arm to reach 80 per cent power. That is the design, and it does not move.
The run starts on the first of the month and splits live traffic 50/50. Eleven days in, the variant is up 3.1 points and someone wants to call it. The pre-committed rule says to wait for 40,000 visitors per arm; had the plan used a sequential design instead, the equivalent check would be that the always-valid boundary has not yet been crossed. They hold. By day eighteen the lift settles at 1.4 points, below the two-point threshold that justified the work. The run produced a clear result, just not the hoped-for one, and because the stopping rule was set in the plan, nobody can argue the team peeked their way to a phantom win.
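The "they hold" decision can be reduced to a guard that the run consults before anyone reads the result. This is a minimal sketch of a fixed-horizon rule only; `StoppingRule`, `may_analyse`, and the visitor counts are hypothetical, since the example above gives the lift on each day but not the traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoppingRule:
    """Fixed-horizon rule copied from the plan; frozen so a run cannot edit it."""
    sample_size_per_arm: int


def may_analyse(rule: StoppingRule, control_n: int, variant_n: int) -> bool:
    """True only once both arms have reached the pre-registered sample size."""
    return min(control_n, variant_n) >= rule.sample_size_per_arm


rule = StoppingRule(sample_size_per_arm=40_000)

# Hypothetical day-eleven counts: a tempting lift, but the arms are not full.
day_11 = may_analyse(rule, control_n=24_000, variant_n=24_100)

# Hypothetical counts once both arms have filled.
day_18 = may_analyse(rule, control_n=40_050, variant_n=40_020)

print(day_11, day_18)  # False True
```

A sequential design would swap the sample-size comparison for a boundary test on an always-valid p-value, but the shape stays the same: the rule lives in the plan, and the run can only ask it a yes-or-no question.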
In the Unified Product Graph, Experiment Run (`experiment_run`) sits in the validation region as the instance node between design and conclusion. The hierarchy edge Experiment Plan ran as Experiment Run (`experiment_plan_ran_as_experiment_run`) records that a run is one execution of a plan, which makes reruns first-class and countable. Three causal edges carry the rest: Experiment Run validates Hypothesis (`experiment_run_validates_hypothesis`) ties the run to the belief under test, Experiment Run yields Evidence (`experiment_run_yields_evidence`) captures the raw measured outcome, and Experiment Run produces Learning (`experiment_run_produces_learning`) records what the team concluded. Separating evidence from learning matters: it preserves the audit trail when a later run contradicts an earlier one, and it makes the peeking problem structurally visible, because a hypothesis "validated" by a single run with no pre-registered plan is queryably weak.
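The "queryably weak" check might look something like the sketch below. The two edge type names come from the text above; the tuple layout, the node ids, and the `weakly_validated` helper are assumptions made for illustration and do not represent the graph's actual storage or query API.

```python
from collections import defaultdict

# (edge_type, source_id, target_id). Node ids are invented for this example.
edges = [
    ("experiment_plan_ran_as_experiment_run", "plan-1", "run-1"),
    ("experiment_plan_ran_as_experiment_run", "plan-1", "run-2"),
    ("experiment_run_validates_hypothesis", "run-1", "hyp-short-form"),
    ("experiment_run_validates_hypothesis", "run-2", "hyp-short-form"),
    ("experiment_run_validates_hypothesis", "run-3", "hyp-annual-pricing"),
]


def weakly_validated(edges):
    """Hypotheses backed by exactly one run that has no parent plan."""
    runs_with_plan = set()
    runs_by_hypothesis = defaultdict(list)
    for edge_type, source, target in edges:
        if edge_type == "experiment_plan_ran_as_experiment_run":
            runs_with_plan.add(target)
        elif edge_type == "experiment_run_validates_hypothesis":
            runs_by_hypothesis[target].append(source)
    return sorted(
        hyp
        for hyp, runs in runs_by_hypothesis.items()
        if len(runs) == 1 and runs[0] not in runs_with_plan
    )


weak = weakly_validated(edges)
print(weak)  # ['hyp-annual-pricing']
```

Here `hyp-short-form` is backed by two runs of the same plan, so it passes; `hyp-annual-pricing` rests on one run with no plan edge and is flagged.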
Type-specific fields on BaseNode

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `actual_start_date` | string | | ISO actual start date (may differ from the plan's `planned_start_date`) |
| `actual_end_date` | string | | ISO actual end date |
| `actual_reach` | number | | Observed reach: how many people the run actually touched |
| `outcome_summary` | string | | Plain-English outcome |
| `severity_of_finding` | object | | Severity / strength of the finding (UPGAssessment) |
| `learning` | string | | What the team learned (rich text) |
| `disposition` | string | | Resolution against the parent plan's success criteria. `confirmed` = evidence supports the parent hypothesis_claim. `disconfirmed` = evidence refutes the parent hypothesis_claim. `inconclusive` = data insufficient or noisy. `aborted` = run terminated early. |
| `id` | string | required | Unique identifier (UUID) |
| `type` | NodeType | required | Discriminator for the entity type |
| `title` | string | required | Display name |
| `description` | string | | Optional detailed description |
| `status` | string | | Lifecycle status |
| `tags` | string[] | | Freeform tags for filtering |
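
As a rough illustration of how the fields listed above fit together, here is a sketch of a run record in Python. The field names and the four `disposition` values come from the listing; the class itself, the default `type` value of `"experiment_run"`, the default `status`, the validation behaviour, and the sample values are assumptions, not the product's schema code.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DISPOSITIONS = {"confirmed", "disconfirmed", "inconclusive", "aborted"}


@dataclass
class ExperimentRun:
    # Required base fields
    id: str
    title: str
    type: str = "experiment_run"   # assumed discriminator value
    # Optional base fields
    description: Optional[str] = None
    status: str = "in_progress"    # assumed default
    tags: List[str] = field(default_factory=list)
    # Type-specific fields
    actual_start_date: Optional[str] = None   # ISO date string
    actual_end_date: Optional[str] = None     # ISO date string
    actual_reach: Optional[float] = None
    outcome_summary: Optional[str] = None
    severity_of_finding: Optional[dict] = None
    learning: Optional[str] = None
    disposition: Optional[str] = None

    def __post_init__(self):
        # Reject anything outside the four documented dispositions.
        if self.disposition is not None and self.disposition not in DISPOSITIONS:
            raise ValueError(f"unknown disposition: {self.disposition!r}")


# Hypothetical values throughout.
run = ExperimentRun(
    id="00000000-0000-0000-0000-000000000001",
    title="Example run",
    actual_start_date="2024-03-01",
    actual_reach=1200,
    disposition="inconclusive",
)
print(run.status, run.disposition)  # in_progress inconclusive
```

Constructing a run with a disposition such as `"won"` raises `ValueError`, which keeps free-text verdicts out of a field meant to resolve against the plan's success criteria.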
Lifecycle: 3 phases; initial status `in_progress`.
20 edge types connected to this entity.
`experiment_executed_as_experiment_run`, `experiment_plan_ran_as_experiment_run`, `test_plan_ran_as_experiment_run`, `experiment_run_tested_via_experiment_run`, `dashboard_contains_experiment_run`, `experiment_run_produces_learning`, `experiment_run_yields_evidence`, `experiment_run_measures_metric`, `experiment_run_tested_via_experiment_run`, `experiment_run_validates_hypothesis`, `experiment_run_produced_insight_insight`, `experiment_run_informed_decision_decision`, `beta_program_runs_experiment_run`, `experiment_run_tests_variant`, `cohort_exposed_to_experiment_run`, `experiment_run_tests_pricing_tier`, `experiment_run_tests_feature`, `experiment_run_guards_metric`, `experiment_run_measured_by_metric`, `experiment_run_measures_outcome`
2 frameworks use this entity type.