A documented instance where an AI model produced factually incorrect, fabricated, or misleading output.
A hallucination report is a logged instance of a language model producing output that is false or fabricated: an invented citation, a confident wrong number, a claim the source never made. The report turns a one-off failure into data. A single hallucination is an anecdote; a corpus of reports is the evidence that drives evals and guardrails.
The term entered NLP through image captioning and translation research, where models described objects that were not in the input. The framing that stuck came from Ji et al.'s Survey of Hallucination in Natural Language Generation (ACM Computing Surveys, 2022), which split the phenomenon in two. An intrinsic hallucination contradicts the source the model was given; an extrinsic one cannot be verified from the source at all, true or false. The distinction matters operationally, because the two demand different fixes: intrinsic errors point at how the model uses context, extrinsic ones at what it should refuse to assert.
Later work refined the axis into factuality versus faithfulness, separating disagreement with the real world from disagreement with the provided input. Recent surveys, including a 2025 taxonomy paper, formalise these definitions further. For a product team the value is practical: a report tagged by type routes to the right remedy and feeds a measurable eval rather than a vague sense that the model "makes things up".
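The routing idea can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the names `ReportType`, `REMEDY_TRACK`, `classify`, and `route` are hypothetical, and the remedy descriptions simply paraphrase the intrinsic/extrinsic distinction described above.

```python
from enum import Enum
from typing import Optional


class ReportType(Enum):
    INTRINSIC = "intrinsic"  # output contradicts the supplied source
    EXTRINSIC = "extrinsic"  # output cannot be verified from the source


# Hypothetical remedy tracks, one per report type.
REMEDY_TRACK = {
    ReportType.INTRINSIC: "context-use: review how the model uses the supplied source",
    ReportType.EXTRINSIC: "abstention: review what the model should refuse to assert",
}


def classify(contradicts_source: bool, supported_by_source: bool) -> Optional[ReportType]:
    """Return the report type, or None when the claim is supported by the source."""
    if contradicts_source:
        return ReportType.INTRINSIC
    if not supported_by_source:
        return ReportType.EXTRINSIC
    return None


def route(report_type: ReportType) -> str:
    """Look up the remedy track for a tagged report."""
    return REMEDY_TRACK[report_type]
```

Tagging at intake keeps the later question "which fix does this need?" a lookup rather than a debate.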
A support assistant tells a user a refund window is 60 days when the policy document loaded into context says 30. A reviewer files a hallucination report: type intrinsic, because it contradicts the supplied source; severity high, because it misstates policy to a customer. Twenty similar reports accumulate over a month. They become a regression set in the next Eval Run, and the failure rate on that set drops from 8 percent to under 1 percent after a retrieval-grounding fix. The report closes the loop from a single bad answer to a guardrail and a tracked metric.
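A rough sketch of how such a regression set might be scored follows. Everything here is assumed for illustration: `RegressionCase`, `failure_rate`, and the two stand-in answer functions are hypothetical, the single case is modelled on the refund example, and the substring check is a crude stand-in for whatever grader a real eval would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RegressionCase:
    question: str
    context: str
    must_contain: str  # fact the answer has to state, taken from the context


def failure_rate(cases: list[RegressionCase], answer: Callable[[str, str], str]) -> float:
    """Fraction of cases whose answer omits the fact the source supports."""
    if not cases:
        raise ValueError("regression set is empty")
    failures = sum(1 for c in cases if c.must_contain not in answer(c.question, c.context))
    return failures / len(cases)


# Illustrative data only: one case modelled on the refund-window report.
cases = [
    RegressionCase(
        question="How long is the refund window?",
        context="Refunds are accepted within 30 days of purchase.",
        must_contain="30 days",
    ),
]


def ungrounded(question: str, context: str) -> str:
    return "You can request a refund within 60 days."  # ignores the context


def grounded(question: str, context: str) -> str:
    return context  # stand-in for a grounded model: echoes the policy text


print(failure_rate(cases, ungrounded))  # 1.0
print(failure_rate(cases, grounded))    # 0.0
```

Running the same set before and after a fix is what makes the improvement a tracked number rather than an impression.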
In the Unified Product Graph, a hallucination report sits in the AI region and attaches to its source through the ai_model_flagged_by_hallucination_report edge (AI Model, flagged by, Hallucination Report; a hierarchy edge). The edge attaches the failure to the specific model and version, so a regression after an upgrade is traceable to the change that caused it. Logging each instance as a node keeps a model's failure record durable and queryable, so reports can be counted by type, fed into an Eval Run (eval_run), and used to justify an AI Guardrail (ai_guardrail). The structure turns scattered "the model got this wrong" complaints into an auditable trail from observed failure to measured fix.
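As an illustration of "counted by type", the sketch below tallies reports per model version by walking the edges. It assumes a simple in-memory representation: the node ids, the dictionary layout, and the (model, edge type, report) tuple order are hypothetical, and only the edge type name comes from the graph itself.

```python
from collections import Counter

# Hypothetical report nodes keyed by id.
reports = {
    "r1": {"report_type": "intrinsic"},
    "r2": {"report_type": "intrinsic"},
    "r3": {"report_type": "extrinsic"},
}

EDGE = "ai_model_flagged_by_hallucination_report"

# Assumed edge layout: (ai_model node id, edge type, hallucination_report node id).
edges = [
    ("model-v1", EDGE, "r1"),
    ("model-v2", EDGE, "r2"),
    ("model-v2", EDGE, "r3"),
]


def counts_by_model_and_type(edges, reports):
    """Count reports per (model id, report type) pair, following only EDGE edges."""
    counts = Counter()
    for model_id, edge_type, report_id in edges:
        if edge_type == EDGE:
            counts[(model_id, reports[report_id]["report_type"])] += 1
    return counts


counts = counts_by_model_and_type(edges, reports)
print(counts[("model-v2", "intrinsic")])  # 1
```

Keying the tally on the model version is what lets a jump after an upgrade show up as a difference between two rows.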
Type-specific fields on BaseNode
report_type (string): Classification
severity (object): Impact severity (1 = trivial, 5 = dangerous misinformation)
user_facing (boolean): Visible to end users
root_cause (string): Identified cause
remediation (string): Remediation steps

id (string, required): Unique identifier (UUID)
type (NodeType, required): Discriminator for the entity type
title (string, required): Display name
description (string): Optional detailed description
status (string): Lifecycle status
tags (string[]): Freeform tags for filtering
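For readers who want to see the fields as a record, here is a hedged Python sketch. It is not the graph's actual schema code: the class name, the discriminator value "hallucination_report", the default status "reported", and the choice to model severity as an integer from 1 to 5 (the field list types it as an object) are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HallucinationReport:
    # Required base field
    title: str
    # Base fields with assumed defaults
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    type: str = "hallucination_report"  # assumed discriminator value
    description: Optional[str] = None
    status: str = "reported"  # assumed to start in the initial phase
    tags: list[str] = field(default_factory=list)
    # Type-specific fields
    report_type: Optional[str] = None
    severity: Optional[int] = None  # assumption: integer scale, 1 to 5
    user_facing: Optional[bool] = None
    root_cause: Optional[str] = None
    remediation: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.title:
            raise ValueError("title is required")
        if self.severity is not None and not 1 <= self.severity <= 5:
            raise ValueError("severity must be between 1 and 5")


# Illustrative values only, modelled on the refund-window example.
report = HallucinationReport(
    title="Refund window misstated as 60 days",
    report_type="intrinsic",
    severity=4,
    user_facing=True,
    tags=["support", "policy"],
)
```

Validating severity at construction time keeps out-of-range values from reaching any downstream counts.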
Lifecycle: 4 phases; initial phase: reported
1 edge type connected to this entity.
ai_model_flagged_by_hallucination_report