A directional path describing how data moves between systems, services, or components.
A data flow describes how information moves between the parts of a system: where it originates, what transforms it, where it rests, and where it leaves. It answers a question that an architecture diagram of boxes and arrows usually leaves implicit, which is what actually travels along each connection. Once you can see the flow, a surprising amount follows from it, including where data could leak and where its lineage breaks.
The data-flow diagram came out of structured analysis in the 1970s. Larry Constantine is credited with the original idea, and the notation reached the mainstream through Tom DeMarco's 1978 book Structured Analysis and System Specification and the work of Edward Yourdon, with a parallel notation from Chris Gane and Trish Sarson. The diagram has four elements: external entities, processes, data stores, and the data flows between them. Its central move was to model a system by the data it moves and the transformations it applies, deferring the question of control sequence to a later stage.
That deferral marks the line that still matters: data flow versus control flow. A data-flow view shows what information goes where and how it is reshaped. A control-flow view, the territory of flowcharts and control-flow graphs, shows the order in which steps execute and which branch is taken. The same program has both, and structured analysis chose data flow first on the argument that the shape of the data outlasts the shape of the procedure.
Data-flow modelling found a durable second life in security. Microsoft's STRIDE threat-modelling method begins by drawing a data-flow diagram and overlaying trust boundaries, the lines where data crosses a privilege level, because most threats live where data moves across such a boundary. The same flow view underpins data lineage, where the question is which upstream source a given field came from and what touched it on the way.
A team maps the data flow for a checkout. The shopper is the external entity; a flow carries a payment token from the browser, across a trust boundary, into the order service; a process validates it; a data store records the order; another flow forwards an anonymised event to analytics. Drawing it surfaces two findings a box diagram had hidden. The raw card token crosses into a logging sink that was never meant to hold payment data, a disclosure risk the trust-boundary overlay makes obvious. And the analytics flow strips the customer ID, which silently breaks lineage: a later "which orders did this user place?" query cannot be answered from the warehouse. Both are properties of the flow, invisible until the flow is drawn.
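The checkout example can be sketched in a few lines of Python. This is a minimal illustration, not a threat-modelling tool: the element names, trust-zone labels, field names, and the two check functions are all assumptions made up for the example, chosen so that the two findings described above fall out of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str        # "external_entity", "process", or "data_store"
    trust_zone: str  # hypothetical zone label used to detect boundary crossings


@dataclass(frozen=True)
class Flow:
    source: Element
    target: Element
    fields: frozenset  # names of the data items carried on this flow

    def crosses_trust_boundary(self) -> bool:
        return self.source.trust_zone != self.target.trust_zone


# Elements of the (hypothetical) checkout diagram.
shopper = Element("shopper browser", "external_entity", "internet")
order_service = Element("order service", "process", "backend")
orders_db = Element("orders", "data_store", "backend")
log_sink = Element("log sink", "data_store", "observability")
analytics = Element("analytics warehouse", "data_store", "analytics")

flows = [
    Flow(shopper, order_service, frozenset({"payment_token", "customer_id", "cart"})),
    Flow(order_service, orders_db, frozenset({"order_id", "customer_id", "cart"})),
    Flow(order_service, log_sink, frozenset({"order_id", "customer_id", "payment_token"})),
    Flow(order_service, analytics, frozenset({"order_id", "cart"})),
]

# Assumed policy for the example.
SENSITIVE = {"payment_token"}
ALLOWED_SENSITIVE_TARGETS = {"order service"}
LINEAGE_KEY = "customer_id"


def disclosure_risks(flows):
    """Flows carrying sensitive fields across a trust boundary to an unapproved target."""
    return [
        f for f in flows
        if f.crosses_trust_boundary()
        and f.fields & SENSITIVE
        and f.target.name not in ALLOWED_SENSITIVE_TARGETS
    ]


def lineage_breaks(flows, key=LINEAGE_KEY):
    """Flows into a data store that omit the key needed to trace records back."""
    return [f for f in flows if f.target.kind == "data_store" and key not in f.fields]


print([f.target.name for f in disclosure_risks(flows)])
print([f.target.name for f in lineage_breaks(flows)])
```

Both checks read only the flows, not the boxes, which is the point of the example: the logging sink and the analytics warehouse look harmless as components and only become findings once the payload on each arrow is written down.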
Data Lineage is the audit trail; a data flow is the design of the pathways that produce it. Flow is the map; lineage is the recorded journey along it.
Data Flow transports Domain Entity (data_flow_transports_domain_entity, causal), so the entity is the cargo and the flow is the route. The same entity can move along many flows.
Bounded Context is where data lives with shared meaning; a data flow is what crosses between contexts, which is exactly where translation and trust boundaries sit.
In the Unified Product Graph, Data Flow sits in the architecture region as a connective structure, a pathway through the system rather than an owned component. Products and contexts move through it via product_flows_through_data_flow (Product flows through Data Flow, hierarchy) and bounded_context_flows_through_data_flow (Bounded Context flows through Data Flow, hierarchy); it carries structured payloads through data_flow_transports_domain_entity (Data Flow transports Domain Entity, causal); and concrete interfaces join it through api_endpoint_participates_in_data_flow (API Endpoint participates in Data Flow, semantic). Modelling flow as first-class makes the cross-cutting questions answerable in one place: where a Domain Entity travels, which API Endpoint touches it, and where a path crosses a context boundary and so deserves a closer look at security and lineage.
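To make those cross-cutting questions concrete, here is a small sketch that stores the four edge types as plain tuples and answers them with list comprehensions. Only the edge type names come from the text; the node identifiers, the source-to-target direction of each edge, and the three helper functions are assumptions for illustration and do not reflect an actual Unified Product Graph API.

```python
from collections import namedtuple

# Assumed direction: source is the left-hand side of the edge name.
Edge = namedtuple("Edge", "type source target")

edges = [
    Edge("product_flows_through_data_flow", "product:storefront", "flow:checkout"),
    Edge("bounded_context_flows_through_data_flow", "context:ordering", "flow:checkout"),
    Edge("bounded_context_flows_through_data_flow", "context:analytics", "flow:checkout"),
    Edge("bounded_context_flows_through_data_flow", "context:ordering", "flow:refund"),
    Edge("data_flow_transports_domain_entity", "flow:checkout", "entity:order"),
    Edge("data_flow_transports_domain_entity", "flow:refund", "entity:order"),
    Edge("api_endpoint_participates_in_data_flow", "endpoint:POST /orders", "flow:checkout"),
]


def flows_transporting(entity):
    """Where does this domain entity travel?"""
    return sorted(
        e.source for e in edges
        if e.type == "data_flow_transports_domain_entity" and e.target == entity
    )


def endpoints_touching(entity):
    """Which API endpoints participate in a flow that carries this entity?"""
    carrying = set(flows_transporting(entity))
    return sorted(
        e.source for e in edges
        if e.type == "api_endpoint_participates_in_data_flow" and e.target in carrying
    )


def crosses_context_boundary(flow):
    """A flow joined by more than one bounded context crosses a boundary."""
    contexts = {
        e.source for e in edges
        if e.type == "bounded_context_flows_through_data_flow" and e.target == flow
    }
    return len(contexts) > 1


print(flows_transporting("entity:order"))
print(endpoints_touching("entity:order"))
print(crosses_context_boundary("flow:checkout"))
```

Because the flow is a node in its own right, each question is a one-hop or two-hop lookup from it. If flows were only implied by pairs of components, the same answers would have to be pieced together from every pairwise connection.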
Type-specific fields on BaseNode
trigger (string): What triggers the flow
data_type (string): Type of data transferred
direction (string): Direction
protocol (string): Communication protocol
id (string, required): Unique identifier (UUID)
type (NodeType, required): Discriminator for the entity type
title (string, required): Display name
description (string): Optional detailed description
status (string): Lifecycle status
tags (string[]): Freeform tags for filtering
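As a rough sketch of how the fields listed above might look as a typed record, here is a Python dataclass. The class name DataFlowNode, the discriminator value "data_flow", the validation rules, and the sample values are assumptions for illustration; the field list only gives names, types, and which fields are required.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataFlowNode:
    # Required fields.
    id: str     # must parse as a UUID
    type: str   # NodeType discriminator; "data_flow" below is an assumed value
    title: str
    # Optional common fields.
    description: Optional[str] = None
    status: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    # Optional fields specific to a data flow.
    trigger: Optional[str] = None
    data_type: Optional[str] = None
    direction: Optional[str] = None
    protocol: Optional[str] = None

    def __post_init__(self):
        uuid.UUID(self.id)  # raises ValueError if id is not a well-formed UUID
        for name in ("type", "title"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")


# Illustrative values only.
node = DataFlowNode(
    id=str(uuid.uuid4()),
    type="data_flow",
    title="Checkout payment token",
    trigger="shopper submits order",
    data_type="payment token",
    direction="browser to order service",
    protocol="HTTPS",
    tags=["checkout", "payments"],
)
print(node.title, node.protocol)
```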
4 edge types connected to this entity.
product_flows_through_data_flow
bounded_context_flows_through_data_flow
data_flow_transports_domain_entity
api_endpoint_participates_in_data_flow