Data handling terms and policies
A data contract is the explicit agreement on schema, semantics, and service-level expectations between the team that produces a dataset and the teams that consume it. The interesting part is what it changes about ownership: an upstream engineer can no longer rename a column on a whim, because the contract makes that column a published interface with consumers depending on it.
The pattern grew out of a practical pain. At GoCardless, Andrew Jones found that upstream schema changes kept breaking downstream pipelines without warning, and around 2022 he began describing the fix as a "data contract": treat the dataset like an API, codify the schema and ownership and service-level objectives up front, and deliver pre-modelled data into the warehouse. He set the approach out in full in *Driving Data Quality with Data Contracts* (Packt, 2023). The idea spread quickly and became one of the most discussed topics in data engineering.
A payments team owns a transactions event stream. The analytics team depends on three fields from it: amount_cents, currency, and settled_at. A data contract records those fields, their types, a freshness target (events land within five minutes), and the owning team. When a backend engineer later proposes renaming settled_at to completed_at, the contract test fails in CI before the change merges, naming the analytics models that would break. The rename still happens, on a schedule both teams agree, with a deprecation window. The breakage moves from a 2am dashboard outage to a code review comment.
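The scenario above can be sketched as a small contract check. This is a minimal illustration under stated assumptions, not the API of any particular contract tool: the DataContract class, the breaking_changes and is_fresh functions, the field type names, and the consumer model name analytics.daily_settlements are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class DataContract:
    """Hypothetical contract for the transactions event stream."""
    dataset: str
    owner: str
    fields: dict              # field name -> type name (type names are assumed)
    freshness_minutes: int    # events should land within this many minutes
    consumers: dict = field(default_factory=dict)  # field name -> dependent models


def breaking_changes(contract, proposed_schema):
    """Return one message per contracted field that is missing or retyped."""
    problems = []
    for name, expected_type in contract.fields.items():
        actual_type = proposed_schema.get(name)
        if actual_type == expected_type:
            continue
        affected = ", ".join(contract.consumers.get(name, [])) or "no registered consumers"
        if actual_type is None:
            problems.append(f"{name} is missing; affects: {affected}")
        else:
            problems.append(
                f"{name} changed from {expected_type} to {actual_type}; affects: {affected}"
            )
    return problems


def is_fresh(contract, event_time, landed_time):
    """Check one event against the contract's freshness target."""
    return landed_time - event_time <= timedelta(minutes=contract.freshness_minutes)


contract = DataContract(
    dataset="transactions",
    owner="payments",
    fields={"amount_cents": "integer", "currency": "string", "settled_at": "timestamp"},
    freshness_minutes=5,
    consumers={"settled_at": ["analytics.daily_settlements"]},  # hypothetical model
)

# Proposed schema after renaming settled_at to completed_at.
proposed = {"amount_cents": "integer", "currency": "string", "completed_at": "timestamp"}
problems = breaking_changes(contract, proposed)
```

Run in CI, a non-empty problems list would fail the build and point the reviewer at the affected analytics models. Fields added outside the contract (completed_at here) are deliberately ignored, since only removing or retyping a contracted field breaks consumers.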
currency is not null). A data contract is the broader agreement that may contain many such rules alongside schema, ownership, and SLAs.
In the Unified Product Graph, a data contract lives in the compliance region, which is the right home because it is fundamentally about an enforceable promise. A product is bound to it (Product bound by Data Contract, the hierarchy edge product_bound_by_data_contract), a compliance framework can govern it (Compliance Framework governs Data Contract, the hierarchy edge compliance_framework_governs_data_contract), and the contract in turn governs the underlying source (Data Contract governs Data Source, the cross-domain edge data_contract_governs_data_source). That last edge is what makes the structure useful: trace from a contract to its data source (data_source) and you see exactly which system is obliged to honour the promise.
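The trace described above can be sketched as a lookup over typed edges. The three edge type names come from the text; the tuple representation, the helper functions, and the node ids (product:checkout, framework:gdpr, contract:transactions, source:payments_db) are assumptions for illustration and not the graph's actual storage format.

```python
# Each edge is (edge type, source node id, target node id).
# Edge type names are from the text; node ids are hypothetical.
EDGES = [
    ("product_bound_by_data_contract", "product:checkout", "contract:transactions"),
    ("compliance_framework_governs_data_contract", "framework:gdpr", "contract:transactions"),
    ("data_contract_governs_data_source", "contract:transactions", "source:payments_db"),
]


def targets(edges, edge_type, source):
    """Nodes reached by following edges of edge_type out of source."""
    return [dst for kind, src, dst in edges if kind == edge_type and src == source]


def sources(edges, edge_type, target):
    """Nodes that point at target through edges of edge_type."""
    return [src for kind, src, dst in edges if kind == edge_type and dst == target]


def trace_contract(edges, contract_id):
    """Collect everything attached to one data contract."""
    return {
        "products": sources(edges, "product_bound_by_data_contract", contract_id),
        "frameworks": sources(edges, "compliance_framework_governs_data_contract", contract_id),
        "data_sources": targets(edges, "data_contract_governs_data_source", contract_id),
    }


trace = trace_contract(EDGES, "contract:transactions")
```

The data_sources entry is the answer to "which system is obliged to honour the promise"; the other two entries show who relies on the contract and which framework governs it.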
Type-specific fields on BaseNode
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| retention_period | string | no | How long data is retained before deletion |
| deletion_policy | string | no | Policy governing data deletion |
| third_party_sharing | boolean | no | Whether data is shared with third parties |
| id | string | yes | Unique identifier (UUID) |
| type | NodeType | yes | Discriminator for the entity type |
| title | string | yes | Display name |
| description | string | no | Optional detailed description |
| status | string | no | Lifecycle status |
| tags | string[] | no | Freeform tags for filtering |
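The fields listed above could be modelled as a single record type. The sketch below is one possible reading, not the graph's actual implementation: the class name DataContractNode, the discriminator value "data_contract", the example field values, and the validation in __post_init__ are all assumptions.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataContractNode:
    # Required base fields
    id: str                      # must parse as a UUID
    type: str                    # NodeType discriminator; value assumed below
    title: str
    # Optional base fields
    description: Optional[str] = None
    status: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    # Type-specific fields
    retention_period: Optional[str] = None
    deletion_policy: Optional[str] = None
    third_party_sharing: Optional[bool] = None

    def __post_init__(self):
        uuid.UUID(self.id)  # raises ValueError if id is not a valid UUID
        for name in ("type", "title"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")


node = DataContractNode(
    id=str(uuid.uuid4()),
    type="data_contract",        # assumed discriminator value
    title="Transactions event stream contract",
    status="proposed",
    retention_period="7 years",  # free-form string; the format is not specified
    third_party_sharing=False,
)
```

Keeping the three required fields positional and everything else defaulted mirrors the required markers in the field list, so a record missing id, type, or title fails at construction time.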
5 phases — initial: proposed · template: APPROVAL
3 edge types connected to this entity.
product_bound_by_data_contract
compliance_framework_governs_data_contract
data_contract_governs_data_source