What is a Data Classification?

A data classification is a sensitivity tier assigned to data, typically public, internal, confidential, or restricted, so that handling rules follow from the label.

What is the purpose of a Data Classification?

Data classification determines how carefully a given dataset must be handled by sorting it into sensitivity tiers. The label drives which controls apply, such as encryption, access limits, monitoring, and retention, and it underpins regimes like GDPR, HIPAA, and CCPA that assume an organisation knows where its sensitive data lives.

How do you use a Data Classification in product management?

Define three or four classification levels, for example Public (can be freely shared), Internal (company use only), Confidential (restricted access), and Restricted (highly sensitive, minimum access). Map every data asset to a level. Apply the controls appropriate to each level: encryption, access controls, and retention limits. Classify data at creation, and reclassify it when it is combined or aggregated.
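The steps above can be sketched in TypeScript. This is a minimal illustration, not a reference implementation: the `Controls` shape, the retention numbers, and the asset names are all assumptions made for the example, and real values would come from an organisation's own policy.

```typescript
// The four tiers come from the text; everything else here is hypothetical.
type Level = "public" | "internal" | "confidential" | "restricted";

interface Controls {
  encryptAtRest: boolean;
  accessScope: "anyone" | "employees" | "need-to-know" | "named-individuals";
  maxRetentionDays: number | null; // null = the label imposes no limit
}

// Illustrative values only; real requirements come from your own policy.
const CONTROLS_BY_LEVEL: Record<Level, Controls> = {
  public: { encryptAtRest: false, accessScope: "anyone", maxRetentionDays: null },
  internal: { encryptAtRest: false, accessScope: "employees", maxRetentionDays: null },
  confidential: { encryptAtRest: true, accessScope: "need-to-know", maxRetentionDays: 730 },
  restricted: { encryptAtRest: true, accessScope: "named-individuals", maxRetentionDays: 365 },
};

// Every asset is mapped to a level when it is created.
const assetLevels = new Map<string, Level>([
  ["marketing_blog", "public"],
  ["internal_runbooks", "internal"],
  ["customer_table", "confidential"],
]);

// Controls are looked up from the label; an unlabelled asset is an error,
// not something that silently falls back to the least restrictive tier.
function controlsFor(asset: string): Controls {
  const level = assetLevels.get(asset);
  if (level === undefined) {
    throw new Error(`Asset "${asset}" has no classification`);
  }
  return CONTROLS_BY_LEVEL[level];
}
```

Throwing on an unclassified asset is a design choice in this sketch: it keeps "map all data assets" from quietly degrading into "unknown means public".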

Where does the concept of a Data Classification come from?

Data classification was formalised in government and military information security during the Cold War era, with tiers such as Top Secret, Secret, Confidential, and Unclassified. Commercial schemes gained urgency with privacy laws such as GDPR and CCPA, which require organisations to know where their sensitive personal data lives before they can protect or manage it.

What are common mistakes with a Data Classification?

Defining classification levels in a policy document while no system actually enforces them means data carries a label that changes nothing about how it's stored or accessed. Teams create so many tiers that nobody can remember which applies, so everything defaults to the least restrictive. Classifying data once at creation and never reclassifying as it's combined or aggregated misses that derived data can be more sensitive than its inputs. A classification scheme with no link to retention, access, and encryption controls is taxonomy theatre, not security.
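The point about derived data can be made concrete with a small sketch. It assumes the four tiers are strictly ordered, treats the most sensitive input as a floor for the derived dataset, and adds a hypothetical `escalateBy` parameter for cases where combining inputs raises sensitivity above any single one. The function name and parameter are illustrative, not part of any existing system.

```typescript
type Level = "public" | "internal" | "confidential" | "restricted";

// Ordered least to most sensitive, so the array index doubles as a rank.
const LEVELS: Level[] = ["public", "internal", "confidential", "restricted"];

function rank(level: Level): number {
  return LEVELS.indexOf(level);
}

// A derived dataset is at least as sensitive as its most sensitive input.
// `escalateBy` (hypothetical) bumps the result further when aggregation
// itself adds sensitivity; the result is capped at the top tier.
function classifyDerived(inputs: Level[], escalateBy: number = 0): Level {
  if (inputs.length === 0) {
    throw new Error("derived data needs at least one input");
  }
  const floor = Math.max(...inputs.map(rank));
  const bumped = Math.min(floor + escalateBy, LEVELS.length - 1);
  return LEVELS[bumped];
}
```

For example, `classifyDerived(["internal", "confidential"])` yields `"confidential"`, while passing `escalateBy = 1` for a join of two internal datasets yields `"confidential"` as well.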

Data Classification

What is an example of a Data Classification?

TPC data classification: User graph content: Classification: Confidential. Rationale: contains strategic product plans and competitive intelligence of customers. Controls: encrypted at rest (AES-256) and in transit (TLS 1.3), access restricted to authenticated graph owner and explicit collaborators, audit logged on access, retention: until account deletion + 30 days.

Assigns data to a sensitivity tier so handling rules and access controls follow from the label.

Security · Operations & Quality
type: 'data_classification'
interface: BaseNode

Description

Data classification assigns data to sensitivity tiers, typically public, internal, confidential, and restricted, so that handling rules follow from the label. It underpins access control and compliance, which apply controls according to how sensitive the data has been judged to be.

Origin & evolution

The tiered model has military and governmental roots in information classification, and the commercial sector settled on a four-level scheme that most organisations now recognise. Public data carries minimal risk, such as press releases and marketing material. Internal data is limited to staff and trusted partners. Confidential data is business-sensitive, covering customer lists and financial reports, with controlled access. Restricted data is the highest tier, holding things like payment-card numbers, medical records, and trade secrets, and it demands the strongest safeguards.

The label is not decorative. Classification levels directly determine which controls apply: permissions, encryption, monitoring, and retention. Regulatory regimes lean on it too, since GDPR, HIPAA, and CCPA all assume an organisation knows where its personally identifiable information (PII) and protected health information (PHI) live before it can claim to handle them lawfully.

How it works in practice

A health-tech product audits its data stores. The marketing blog is tagged public. Internal runbooks are internal. The customer table holds names and emails, tagged confidential. One column holds diagnosis codes, which is PHI, so the whole table inherits restricted. That single classification cascades into policy: the restricted store gets field-level encryption, access is limited to a named on-call group, every read is logged, and the data is excluded from the analytics warehouse copy by default. The label did the deciding; the controls just followed it.
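The inheritance rule in this walkthrough, where one PHI column makes the whole table restricted, can be sketched as follows. The column names and the shape of the policy object are assumptions chosen to mirror the example, not a real schema.

```typescript
type Level = "public" | "internal" | "confidential" | "restricted";

// Least to most sensitive.
const ORDER: Level[] = ["public", "internal", "confidential", "restricted"];

interface Column {
  name: string;
  level: Level;
}

// A table inherits the most restrictive level found among its columns.
function tableLevel(columns: Column[]): Level {
  return columns.reduce<Level>(
    (highest, col) =>
      ORDER.indexOf(col.level) > ORDER.indexOf(highest) ? col.level : highest,
    "public"
  );
}

// Hypothetical columns matching the customer table in the example.
const customerTable: Column[] = [
  { name: "name", level: "confidential" },
  { name: "email", level: "confidential" },
  { name: "diagnosis_code", level: "restricted" }, // PHI
];

// The controls that follow from the label, as listed in the walkthrough.
function policyFor(level: Level) {
  const restricted = level === "restricted";
  return {
    fieldLevelEncryption: restricted,
    namedGroupAccessOnly: restricted,
    logEveryRead: restricted,
    excludeFromAnalyticsCopy: restricted,
  };
}
```

Here `tableLevel(customerTable)` evaluates to `"restricted"`, and `policyFor` turns that single label into the four controls the paragraph describes.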

Data Classification vs. its neighbours

Data Source is the store being labelled. Classification is the sensitivity tier applied to that source; the source holds the data, the classification says how carefully to treat it.
Security Policy is the broader set of rules an organisation enforces. Classification is one input that a policy establishes and acts on, turning "restricted" into concrete encryption and access requirements.
Compliance Requirement is an external obligation such as HIPAA. Classification is the internal mechanism that makes the obligation operable, since you meet a PHI rule by first knowing which data is PHI.

In the graph

In the Unified Product Graph, Data Classification sits in the security region. The product applies it through product_classifies_data_with_data_classification, a security policy gives it force through security_policy_establishes_data_classification, and it attaches to the things it governs through data_classification_applies_to_data_source. Modelling it as its own node, rather than a property on a source, lets one tier link to many sources and to the policy that defines it, which mirrors how classification actually works: a single scheme governing the whole estate.
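A rough sketch of why a separate node helps: with classification as its own node, one tier can fan out to many data sources through edges. The three edge type names below are taken from the text; the `Edge` shape, the edge direction, and all node ids are assumptions for illustration only.

```typescript
type EdgeType =
  | "product_classifies_data_with_data_classification"
  | "security_policy_establishes_data_classification"
  | "data_classification_applies_to_data_source";

// Hypothetical edge shape: ids of the source and target nodes.
interface Edge {
  type: EdgeType;
  from: string;
  to: string;
}

// One classification node ("dc-restricted") linked to a product, a policy,
// and two data sources. All ids are made up for the example.
const edges: Edge[] = [
  { type: "product_classifies_data_with_data_classification", from: "product-1", to: "dc-restricted" },
  { type: "security_policy_establishes_data_classification", from: "policy-1", to: "dc-restricted" },
  { type: "data_classification_applies_to_data_source", from: "dc-restricted", to: "src-customers" },
  { type: "data_classification_applies_to_data_source", from: "dc-restricted", to: "src-claims" },
];

// All data sources governed by one classification node.
function sourcesGovernedBy(classificationId: string): string[] {
  return edges
    .filter(
      (e) =>
        e.type === "data_classification_applies_to_data_source" &&
        e.from === classificationId
    )
    .map((e) => e.to);
}
```

If the tier were a property on each source instead, answering "which sources are restricted, and which policy says so" would mean scanning every source rather than following edges from one node.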

Preview

Workspace Operational Records

Level: Confidential (Sensitive; restricted to a need-to-know basis.)
Encryption required: true
Handling requirements: Encrypted at rest and in transit; accessible only to workspace members and the scoped Builder agent; not used for model training without explicit consent
Examples: Records describing team processes, field definitions, automation configurations, and record content created by directors using Trellis
Retention period: Duration of active subscription plus 90 days after cancellation

Properties

Type-specific fields on BaseNode

level (enum): Sensitivity level. Values on the data sensitivity scale:

public (Public): No restriction; safe to disclose openly.
internal (Internal): For internal use; not for external release.
confidential (Confidential): Sensitive; restricted to a need-to-know basis.
restricted (Restricted): Highly sensitive; strict controls and auditing.

handling_requirements (string): Handling rules

examples (string[]): Example data covered

retention_period (string): Retention period

encryption_required (boolean): Whether encryption is mandatory

Inherited from BaseNode (6 fields)

id (string, required): Unique identifier (UUID)

type (NodeType, required): Discriminator for the entity type

title (string, required): Display name

description (string): Optional detailed description

status (string): Lifecycle status

tags (string[]): Freeform tags for filtering
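Read together, the field listings above suggest a shape along these lines. This is a sketch under assumptions: fields not marked required are treated as optional, `NodeType` is simplified to a plain string on `BaseNode`, and the UUID is a placeholder. The sample values come from the Workspace Operational Records preview.

```typescript
type DataClassificationLevel = "public" | "internal" | "confidential" | "restricted";

// Inherited fields. The page names the discriminator type NodeType;
// a plain string stands in for it here as a simplification.
interface BaseNode {
  id: string;
  type: string;
  title: string;
  description?: string;
  status?: string;
  tags?: string[];
}

// Type-specific fields; treated as optional because none is marked required.
interface DataClassification extends BaseNode {
  type: "data_classification";
  level?: DataClassificationLevel;
  handling_requirements?: string;
  examples?: string[];
  retention_period?: string;
  encryption_required?: boolean;
}

const workspaceRecords: DataClassification = {
  id: "00000000-0000-4000-8000-000000000000", // placeholder UUID
  type: "data_classification",
  title: "Workspace Operational Records",
  level: "confidential",
  encryption_required: true,
  retention_period: "Duration of active subscription plus 90 days after cancellation",
};
```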

Relationships

4 edge types connected to this entity.

Parents

Entities that can contain this type

Product: product_classifies_data_with_data_classification

Security Policy: security_policy_establishes_data_classification

Cross-References

Contextual links across the graph

Data Source: data_classification_applies_to_data_source

Graph Position

Data Classification: 2 parents, 2 cross-ref

Definition

A data classification is a sensitivity tier, such as public, internal, or restricted, that governs how a piece of data must be handled. It connects the security domain to legal and engineering, determining which controls apply to which data.

Usage Guidance

Define 3–4 classification levels: Public (can be freely shared), Internal (company use only), Confidential (restricted access), Restricted (highly sensitive, minimum access).
Map all data assets to a classification.
Apply controls appropriate to each level: encryption, access controls, retention limits.
Classify data at creation.

Anti-Patterns

Defining classification levels in a policy document while no system actually enforces them means data carries a label that changes nothing about how it's stored or accessed.
Teams create so many tiers that nobody can remember which applies, so everything defaults to the least restrictive.
Classifying data once at creation and never reclassifying as it's combined or aggregated misses that derived data can be more sensitive than its inputs.
A classification scheme with no link to retention, access, and encryption controls is taxonomy theatre, not security.
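One way to avoid a label that "changes nothing" is to check stores against what their label demands. The sketch below assumes a hypothetical `DataStore` record that reports its actual encryption and access state, and an illustrative `REQUIRED` table; neither comes from a real tool.

```typescript
type Level = "public" | "internal" | "confidential" | "restricted";

interface RequiredControls {
  encrypted: boolean;
  accessRestricted: boolean;
}

// What a store actually does today, as reported by some inventory (hypothetical).
interface DataStore {
  id: string;
  level: Level;
  encrypted: boolean;
  accessRestricted: boolean;
}

// Illustrative requirements per tier.
const REQUIRED: Record<Level, RequiredControls> = {
  public: { encrypted: false, accessRestricted: false },
  internal: { encrypted: false, accessRestricted: true },
  confidential: { encrypted: true, accessRestricted: true },
  restricted: { encrypted: true, accessRestricted: true },
};

// Lists every control the label requires that the store does not have.
function violations(store: DataStore): string[] {
  const required = REQUIRED[store.level];
  const found: string[] = [];
  if (required.encrypted && !store.encrypted) {
    found.push("encryption missing");
  }
  if (required.accessRestricted && !store.accessRestricted) {
    found.push("access not restricted");
  }
  return found;
}
```

Running a check like this on a schedule, or in deployment pipelines, is what ties the scheme to controls rather than leaving it in a policy document.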

Examples

TPC data classification: User graph content

Classification: Confidential. Rationale: contains strategic product plans and competitive intelligence of customers. Controls: encrypted at rest (AES-256) and in transit (TLS 1.3), access restricted to authenticated graph owner and explicit collaborators, audit logged on access, retention: until account deletion + 30 days.
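The retention rule in this example, "until account deletion + 30 days", is simple date arithmetic. A minimal sketch, assuming fixed 24-hour days in UTC and hypothetical function names:

```typescript
const RETENTION_DAYS_AFTER_DELETION = 30; // from the example above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Earliest moment the content should be purged, given when the account was deleted.
function purgeAfter(accountDeletedAt: Date): Date {
  return new Date(
    accountDeletedAt.getTime() + RETENTION_DAYS_AFTER_DELETION * MS_PER_DAY
  );
}

// True once the retention window has elapsed.
function isPastRetention(accountDeletedAt: Date, now: Date): boolean {
  return now.getTime() >= purgeAfter(accountDeletedAt).getTime();
}
```

An account deleted at `2024-01-01T00:00:00Z` would have its content eligible for purging from `2024-01-31T00:00:00Z`.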
