What is the purpose of an AI Model?

An AI model behaves like a versioned dependency whose weights shift with each retrain, so the same prompt can return different answers across versions. Tracking it as a product component, with documented capabilities, costs, and known limits, surfaces capability and risk and separates a controlled deployment from a black box.

How do you use an AI Model in product management?

Create a node for each model in use (e.g., Claude Sonnet, GPT-4o, custom fine-tune). Track provider, version, cost per call, and which features depend on it.

What are common mistakes with an AI Model?

Pinning to a model version with no plan to re-evaluate when the provider updates it lets behaviour shift silently underneath you. Choosing the largest model by default ignores latency and cost trade-offs that often matter more than a marginal quality gain. Shipping a model with no offline evaluation baseline means there is nothing to compare against when quality regresses in production.

AI Model

Q: What is an AI Model?

The trained machine-learning artefact a product depends on to generate predictions, text, images, or decisions.

Q: Where does the concept of an AI Model come from?

Documenting a model as a reportable artefact crystallised with Model Cards for Model Reporting, introduced by Margaret Mitchell and colleagues at the 2019 ACM Conference on Fairness, Accountability, and Transparency. A model card records intended use, training data, evaluation metrics by group, and known failure conditions, and the format has since been adopted by Hugging Face, Google, and most foundation-model providers as the default way to ship a model with its provenance attached.

Q: What is an example of an AI Model?

Routing model: A mid-tier model handles 90% of support classifications cheaply; a larger model is reserved for the ambiguous 10%. The product records both as distinct AI Model entities.

A machine learning or AI model used within the product, recorded with its version, provider, capabilities, and costs.

AI & Machine Learning · Engineering & Platform · type: 'ai_model' · interface: BaseNode

Description

An AI model is the trained machine-learning artefact a product depends on to generate predictions, text, images, or decisions. It behaves like a versioned dependency with a temperament: its weights encode statistical behaviour that shifts with each retrain, so the same prompt can return different answers across versions. Treating it as a tracked product component, with documented capabilities and known limits, is what separates a controlled deployment from a black box.

Origin & evolution

The discipline of documenting a model as a reportable artefact crystallised with Model Cards for Model Reporting, introduced by Margaret Mitchell and colleagues (including Timnit Gebru) at the 2019 ACM Conference on Fairness, Accountability, and Transparency. A model card records intended use, training data, evaluation metrics broken down by group, and known failure conditions. The format has since been adopted by Hugging Face, Google, and most foundation-model providers as the default way to ship a model with its provenance attached.

The foundation-versus-fine-tuned split followed the rise of large pre-trained models. A foundation model is trained once at scale; teams then fine-tune or prompt it for a specific task. That changes the dependency graph: a product may rely on a vendor's base model and a separate fine-tune layer, each with its own version and eval history.

How it works in practice

A support product routes tickets through claude-sonnet-4 for triage. The team pins the version, records its evaluation score against a 500-ticket benchmark (87 per cent correct routing), and tracks cost per thousand tokens. When the vendor releases a newer checkpoint, they re-run the same benchmark before switching: the new model scores 91 per cent on routing but regresses on multilingual tickets, so they hold the upgrade for non-English queues. The model card makes that decision auditable.
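The re-run-before-switching step can be sketched as a per-segment comparison against the pinned baseline. This is a minimal illustration, not the product's actual tooling: the `SegmentScores` shape, the `decideUpgrade` helper, and the segment names are hypothetical, and only the 87 and 91 per cent routing scores come from the text. The multilingual figures are invented to show a regression.

```typescript
// Hypothetical shape: segment name -> fraction of benchmark tickets routed correctly.
type SegmentScores = Record<string, number>;

interface UpgradeDecision {
  segment: string;
  baseline: number;
  candidate: number | undefined;
  action: "upgrade" | "hold";
}

// Compare a candidate checkpoint with the pinned baseline on the same
// benchmark, segment by segment. A segment upgrades only when the candidate
// score does not fall below the baseline by more than `tolerance`.
function decideUpgrade(
  baseline: SegmentScores,
  candidate: SegmentScores,
  tolerance = 0,
): UpgradeDecision[] {
  return Object.keys(baseline).map((segment) => {
    const base = baseline[segment];
    const cand: number | undefined = candidate[segment];
    // A segment missing from the candidate run is held rather than upgraded blind.
    const action: "upgrade" | "hold" =
      cand !== undefined && cand >= base - tolerance ? "upgrade" : "hold";
    return { segment, baseline: base, candidate: cand, action };
  });
}

// 0.87 and 0.91 are the routing scores from the text.
// The multilingual numbers are illustrative assumptions.
const pinned: SegmentScores = { overall_routing: 0.87, multilingual: 0.84 };
const newer: SegmentScores = { overall_routing: 0.91, multilingual: 0.79 };

const decisions = decideUpgrade(pinned, newer);
for (const d of decisions) {
  console.log(`${d.segment}: ${d.baseline} -> ${d.candidate} => ${d.action}`);
}
```

Keeping the decision per segment is what lets the team ship the new checkpoint for one queue while holding it for another.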

AI Model vs. its neighbours

AI Dataset is the corpus a model learns from. The dataset is the input to training; the model is the trained output. Contamination or licensing problems in the dataset surface later as model behaviour, which is why the two are tracked as separate, connected assets.
AI Guardrail sits around the model at runtime, filtering inputs and outputs. The model decides what to say; the guardrail decides what is allowed through. Swapping the model does not change the guardrail, and vice versa.
Eval Benchmark is the measuring instrument. The benchmark is fixed; the model's score against it is what moves. Comparing two models means holding the benchmark constant.
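The model and guardrail split described above can be sketched as a wrapper that stays the same when the model underneath is swapped. The `Model` and `Guardrail` shapes, the `withGuardrail` helper, and the toy "SECRET" rule are all hypothetical stand-ins for illustration; real models are asynchronous services, which this sketch simplifies to plain functions.

```typescript
// Hypothetical shapes for illustration; not an API defined by the source.
type Model = (prompt: string) => string;

interface Guardrail {
  allowInput(prompt: string): boolean;
  allowOutput(output: string): boolean;
}

const BLOCKED = "[blocked by guardrail]";

// Wrap any model with a guardrail that checks the input before the call
// and the output after it.
function withGuardrail(model: Model, guardrail: Guardrail): Model {
  return (prompt) => {
    if (!guardrail.allowInput(prompt)) return BLOCKED;
    const output = model(prompt);
    return guardrail.allowOutput(output) ? output : BLOCKED;
  };
}

// Toy guardrail: reject anything containing a made-up secret marker.
const noSecrets: Guardrail = {
  allowInput: (p) => !p.includes("SECRET"),
  allowOutput: (o) => !o.includes("SECRET"),
};

// Two stand-in "models" with different behaviour.
const modelA: Model = (p) => `A says: ${p}`;
const modelB: Model = (p) => `B leaks SECRET for: ${p}`;

// The same guardrail object is reused unchanged across both models.
const guardedA = withGuardrail(modelA, noSecrets);
const guardedB = withGuardrail(modelB, noSecrets);

console.log(guardedA("hello"));
console.log(guardedB("hello"));
```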

In the graph

In the Unified Product Graph, an AI model lives in the AI and intelligence region as a first-class dependency. A product reaches it through product_powered_by_ai_model; its behaviour is shaped by ai_model_prompted_via_prompt_version, measured by ai_model_benchmarked_by_eval_benchmark, and metered by ai_model_costed_by_ai_cost_tracker. Modelling the version, eval, and cost as distinct edges means an upgrade decision can be traced end to end: which prompts break, which benchmark moved, and what the new spend looks like.
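To make the end-to-end trace concrete, here is a minimal sketch that collects everything touched by a model upgrade from a flat edge list. The four edge type names are the ones named above; the `Edge` shape, the `upgradeImpact` helper, and every node id are hypothetical.

```typescript
// Hypothetical edge shape; the source names edge types but not a storage format.
interface Edge {
  type: string;
  from: string;
  to: string;
}

// Illustrative node ids only.
const edges: Edge[] = [
  { type: "product_powered_by_ai_model", from: "product:support-desk", to: "ai_model:triage" },
  { type: "ai_model_prompted_via_prompt_version", from: "ai_model:triage", to: "prompt_version:triage-v3" },
  { type: "ai_model_benchmarked_by_eval_benchmark", from: "ai_model:triage", to: "eval_benchmark:500-tickets" },
  { type: "ai_model_costed_by_ai_cost_tracker", from: "ai_model:triage", to: "ai_cost_tracker:triage-spend" },
];

// Gather the products, prompts, benchmarks, and cost trackers connected to one model.
function upgradeImpact(modelId: string, all: Edge[]) {
  const outgoing = (type: string) =>
    all.filter((e) => e.from === modelId && e.type === type).map((e) => e.to);
  const incoming = (type: string) =>
    all.filter((e) => e.to === modelId && e.type === type).map((e) => e.from);
  return {
    products: incoming("product_powered_by_ai_model"),
    prompts: outgoing("ai_model_prompted_via_prompt_version"),
    benchmarks: outgoing("ai_model_benchmarked_by_eval_benchmark"),
    costTrackers: outgoing("ai_model_costed_by_ai_cost_tracker"),
  };
}

const impact = upgradeImpact("ai_model:triage", edges);
console.log(impact);
```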

Preview

AI Model: Claude Sonnet 4 for Trellis Builder agent

Model provider: anthropic
Model id: claude-sonnet-4-20250514
Model version: claude-sonnet-4-20250514
Model purpose: Propose structured internal tools from Nora's plain-language process descriptions
Context window: 200000
Latency p50 ms: 1400
Latency p99 ms: 6200
Input schema: Plain-language process description plus workspace schema snapshot
Output schema: Structured tool proposal: records, views, automations, and a change explanation
Aliases: Sonnet 4, claude-sonnet-4
Tags: builder, tool-proposal, structured-output

Properties

Type-specific fields on BaseNode

model_provider (enum): Provider or vendor. Values: anthropic, openai, google, meta, mistral, custom.
model_id (string): Unique model identifier (e.g. "claude-sonnet-4-20250514").
model_version (string): Specific version.
model_purpose (string): Intended use case.
context_window (number): Maximum context window (tokens).
latency_p50_ms (number): Median latency (p50, ms).
latency_p99_ms (number): Tail latency (p99, ms).
input_schema (string): Expected input format or schema.
output_schema (string): Expected output format or schema.
aliases (string[]): Alternative names.
tags (string[]): Free-form classification tags.

Inherited from BaseNode (6 fields)

id (string, required): Unique identifier (UUID).
type (NodeType, required): Discriminator for the entity type.
title (string, required): Display name.
description (string): Optional detailed description.
status (string): Lifecycle status.
tags (string[]): Freeform tags for filtering.
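The type-specific and inherited fields can be read together as one TypeScript shape. The sketch below is an assumption-laden reading of the field lists, not the published type definitions: `NodeType` is reduced to a plain string because only 'ai_model' is shown, the type-specific fields are treated as optional because only the BaseNode fields are marked required, and the UUID is a placeholder. The example values are taken from the preview.

```typescript
// Only 'ai_model' is shown in the source, so a plain string stands in
// for the full NodeType union (assumption).
type NodeType = string;

interface BaseNode {
  id: string; // UUID, required
  type: NodeType; // required
  title: string; // required
  description?: string;
  status?: string; // lifecycle status
  tags?: string[];
}

type ModelProvider = "anthropic" | "openai" | "google" | "meta" | "mistral" | "custom";

// Type-specific fields are assumed optional; `tags` appears in both lists
// and is declared once, on BaseNode.
interface AiModelNode extends BaseNode {
  type: "ai_model";
  model_provider?: ModelProvider;
  model_id?: string;
  model_version?: string;
  model_purpose?: string;
  context_window?: number; // tokens
  latency_p50_ms?: number;
  latency_p99_ms?: number;
  input_schema?: string;
  output_schema?: string;
  aliases?: string[];
}

const builderModel: AiModelNode = {
  id: "00000000-0000-4000-8000-000000000000", // placeholder UUID, not from the source
  type: "ai_model",
  title: "Claude Sonnet 4 for Trellis Builder agent",
  status: "evaluating", // the initial lifecycle phase
  model_provider: "anthropic",
  model_id: "claude-sonnet-4-20250514",
  model_version: "claude-sonnet-4-20250514",
  model_purpose:
    "Propose structured internal tools from Nora's plain-language process descriptions",
  context_window: 200000,
  latency_p50_ms: 1400,
  latency_p99_ms: 6200,
  input_schema: "Plain-language process description plus workspace schema snapshot",
  output_schema:
    "Structured tool proposal: records, views, automations, and a change explanation",
  aliases: ["Sonnet 4", "claude-sonnet-4"],
  tags: ["builder", "tool-proposal", "structured-output"],
};

console.log(builderModel.title, builderModel.model_id);
```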

Lifecycle

5 phases, initial: evaluating

Relationships

16 edge types connected to this entity.

Parents

Entities that can contain this type

Product: product_powered_by_ai_model

Children

Entities this type can contain

Prompt Template: ai_model_defines_prompt_template
Eval Benchmark: ai_model_benchmarked_by_eval_benchmark
AI Cost Tracker: ai_model_costed_by_ai_cost_tracker
Hallucination Report: ai_model_flagged_by_hallucination_report
AI Guardrail: ai_model_constrained_by_ai_guardrail
Model Comparison: ai_model_compared_in_model_comparison
AI Experiment: ai_model_evaluated_through_ai_experiment
AI Dataset: ai_model_trained_on_ai_dataset
AI Trace: ai_model_produces_ai_trace

Cross-References

Contextual links across the graph

Eval Run: eval_run_evaluates_ai_model
Model Comparison: model_comparison_compares_ai_model
Agent Definition: agent_definition_uses_ai_model
AI Experiment: ai_experiment_based_on_ai_model
AI Experiment: ai_experiment_uses_ai_model
Model Comparison: model_comparison_winner_is_ai_model

Graph Position

AI Model: 1 parent, 9 children, 6 cross-references

Definition

An AI model is a trained machine-learning artefact the product depends on, recorded with its version, provider, capabilities, and costs. Tracking it makes the cost, capability, and risk it introduces visible as a product dependency.

Usage Guidance

Create a node for each model in use (e.g., Claude Sonnet, GPT-4o, custom fine-tune). Track provider, version, cost per call, and which features depend on it.

Anti-Patterns

Pinning to a model version with no plan to re-evaluate when the provider updates it lets behaviour shift silently underneath you.
Choosing the largest model by default ignores latency and cost trade-offs that often matter more than a marginal quality gain.
Shipping a model with no offline evaluation baseline means there is nothing to compare against when quality regresses in production.
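One way to avoid the largest-by-default anti-pattern is to pick the cheapest model that clears a quality bar and a latency budget. The sketch below is hypothetical throughout: the `Candidate` shape, the `pickModel` helper, and every number are made up for illustration.

```typescript
// Hypothetical candidate record; all figures below are illustrative.
interface Candidate {
  name: string;
  quality: number; // offline benchmark score, 0..1
  costPerCall: number; // in an arbitrary currency unit
  latencyP99Ms: number;
}

// Return the cheapest candidate meeting both constraints,
// or undefined when none qualifies.
function pickModel(
  candidates: Candidate[],
  minQuality: number,
  maxLatencyMs: number,
): Candidate | undefined {
  return candidates
    .filter((c) => c.quality >= minQuality && c.latencyP99Ms <= maxLatencyMs)
    .sort((a, b) => a.costPerCall - b.costPerCall)[0];
}

const candidates: Candidate[] = [
  { name: "small", quality: 0.86, costPerCall: 0.002, latencyP99Ms: 1800 },
  { name: "mid", quality: 0.9, costPerCall: 0.01, latencyP99Ms: 3000 },
  { name: "large", quality: 0.92, costPerCall: 0.05, latencyP99Ms: 9000 },
];

const chosen = pickModel(candidates, 0.88, 7000);
console.log(chosen ? chosen.name : "no model meets the constraints");
```

With these numbers the largest model scores highest but misses the latency budget, so the mid-tier model is selected. The offline quality score is also the baseline to compare against if production quality later regresses.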

Examples

Routing model

A mid-tier model handles 90% of support classifications cheaply; a larger model is reserved for the ambiguous 10%. The product records both as distinct AI Model entities.
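A routing setup like this can be sketched as a confidence-based fallback. The text does not say how ambiguity is detected, so the confidence threshold here is an assumption, and the `Classifier` type, the `route` helper, and both stand-in classifiers are hypothetical.

```typescript
interface Classification {
  label: string;
  confidence: number; // 0..1, as reported by the classifier
}

interface Routed extends Classification {
  handledBy: "mid-tier" | "large";
}

// Stand-in for a model call; real classifiers would be asynchronous.
type Classifier = (ticket: string) => Classification;

// Try the cheaper model first and escalate only when it is unsure.
// The 0.8 threshold is an illustrative default, not a recommended value.
function route(
  ticket: string,
  midTier: Classifier,
  large: Classifier,
  threshold = 0.8,
): Routed {
  const first = midTier(ticket);
  if (first.confidence >= threshold) {
    return { ...first, handledBy: "mid-tier" };
  }
  return { ...large(ticket), handledBy: "large" };
}

// Toy classifiers: the mid-tier one is only confident about refund tickets.
const midTierModel: Classifier = (t) =>
  t.includes("refund")
    ? { label: "billing", confidence: 0.95 }
    : { label: "unknown", confidence: 0.4 };
const largeModel: Classifier = () => ({ label: "technical", confidence: 0.9 });

console.log(route("I want a refund", midTierModel, largeModel));
console.log(route("The app crashes on sync", midTierModel, largeModel));
```

Recording `handledBy` on each result makes it possible to attribute cost and quality to each of the two AI Model entities separately.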

Pinned production model

A model version is pinned in production so behaviour stays stable, while a newer version is evaluated in parallel before any switch.
