
What Is an LLM Hallucination

LLM hallucination is the phenomenon by which a large language model generates grammatically correct and plausible-sounding content that is factually wrong, unverifiable, or contradictory to reality. It is classified into factuality hallucination (the model asserts something false) and faithfulness hallucination (the model ignores or contradicts the supplied context). It is the structural limitation most relevant to evaluating the reliability of generative search engines and the sources they cite.

edu-lopez-parada

Full definition

An LLM hallucination is the generation of text that appears coherent and well-formed but contains false, unverifiable, or internally contradictory information. The term is borrowed from psychology, where it designates the perception of something that does not exist. Applied to language models, it describes an analogous behavior: the model produces, with apparent confidence, assertions that have no basis in verifiable facts or in the context provided.

The most widely adopted taxonomy in academic literature (Huang et al., ACM TOIS, 2025) distinguishes two main categories:

Factuality hallucination. The model asserts something that contradicts verifiable real-world facts. This manifests as direct factual contradiction (stating incorrect data about a real company) or factual fabrication (inventing a study, a quote, or a statistic that does not exist).

Faithfulness hallucination. The model deviates from the context or instructions provided. This subdivides into instruction inconsistency (ignoring part of the user's instruction), context inconsistency (contradicting information included in the prompt), and logical inconsistency (generating reasoning that contradicts itself internally).
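When reviewing model outputs by hand, it can help to record which of the five subtypes above each error falls into. The following is a minimal sketch of such a labeling scheme; the names `HallucinationType`, `category`, and `summarize` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from enum import Enum


class HallucinationType(Enum):
    # Factuality subtypes: the output conflicts with real-world facts.
    FACTUAL_CONTRADICTION = "factual contradiction"
    FACTUAL_FABRICATION = "factual fabrication"
    # Faithfulness subtypes: the output conflicts with the prompt or itself.
    INSTRUCTION_INCONSISTENCY = "instruction inconsistency"
    CONTEXT_INCONSISTENCY = "context inconsistency"
    LOGICAL_INCONSISTENCY = "logical inconsistency"


FACTUALITY = {
    HallucinationType.FACTUAL_CONTRADICTION,
    HallucinationType.FACTUAL_FABRICATION,
}


def category(kind):
    """Map a subtype to its top-level category."""
    return "factuality" if kind in FACTUALITY else "faithfulness"


def summarize(labels):
    """Count reviewed errors per top-level category."""
    return Counter(category(kind) for kind in labels)


# Hypothetical labels assigned by a reviewer to three model outputs.
reviewed = [
    HallucinationType.FACTUAL_FABRICATION,
    HallucinationType.CONTEXT_INCONSISTENCY,
    HallucinationType.LOGICAL_INCONSISTENCY,
]
print(summarize(reviewed))
```

Keeping the subtype rather than only the category preserves the detail needed to decide on a fix: fabrication points toward grounding in sources, while context inconsistency points toward how the prompt is used.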

Why it matters in 2026

Hallucination is the limitation that most undermines trust in generative engines for purchase or hiring decisions. A user who asks about home renovation companies or plumbers in their area may receive references to non-existent businesses, fabricated pricing data, or invented certifications.

For companies that work on their presence in AI search, hallucination also cuts the other way: if the information about a company in external sources is scarce or contradictory, the model is more likely to hallucinate when citing it, attributing incorrect services or locations. Entity clarity and coverage in verifiable media are the primary defenses against this risk.

A review published by MDPI (2025), covering more than 300 hallucination mitigation papers, reports that RAG (Retrieval-Augmented Generation) is the most widely adopted strategy for reducing hallucinations. RAG anchors model responses in fragments retrieved from verified external sources.
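The anchoring step can be sketched in a few lines. The example below is a simplified illustration, not a production design: it ranks passages by plain word overlap where a real system would typically use a search index or embeddings, and it stops at building the prompt rather than calling a model. The documents, company names, and the functions `retrieve` and `build_grounded_prompt` are all hypothetical.

```python
import re

# Hypothetical verified passages; the businesses named here are invented.
DOCUMENTS = [
    "Example Plumbing Co. is licensed in Ohio and offers emergency pipe repair.",
    "Sample Renovations Ltd. specializes in kitchen remodeling in Texas.",
    "The city library opens at nine on weekdays.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return up to k documents sharing at least one word with the query,
    ordered by how many words they share."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_grounded_prompt(query, documents, k=2):
    """Build a prompt that restricts the answer to the retrieved passages."""
    passages = retrieve(query, documents, k)
    if passages:
        context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    else:
        context = "(no relevant passages found)"
    return (
        "Answer using only the numbered passages below and cite them by number. "
        "If they do not contain the answer, say that you do not know.\n\n"
        f"Passages:\n{context}\n\nQuestion: {query}"
    )


print(build_grounded_prompt("Which company offers emergency pipe repair?", DOCUMENTS))
```

The numbered passages give the model something concrete to cite, and the explicit instruction to admit ignorance gives it an alternative to extrapolating when retrieval comes back empty.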

How it works

The causes of hallucination are distributed across the entire model lifecycle:

Pre-training data. If the training corpus contains incorrect, outdated, or biased information, the model incorporates those errors into its parametric knowledge. That knowledge also has a cutoff date: events after the cutoff do not exist for the model unless they are supplied at inference time, for example via RAG.

Training and fine-tuning. The next-token prediction objective favors coherence and fluency over factual accuracy. During fine-tuning with human feedback (RLHF or DPO), evaluators may inadvertently reinforce plausible but incorrect responses.

Inference. High-temperature stochastic decoding strategies increase response diversity but also raise hallucination probability. When the model reaches the boundary of its knowledge on a topic, it tends to extrapolate rather than acknowledge uncertainty.
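The effect of temperature can be seen directly in the softmax that turns token scores into sampling probabilities. The sketch below uses made-up logits for four candidate tokens, with index 0 standing in for the well-supported continuation. It shows only the mechanism, namely that a higher temperature moves probability mass toward less likely tokens; it does not claim that any particular low-probability token is a hallucination.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw token scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits; index 0 is the continuation the model favors most.
logits = [4.0, 2.0, 1.0, 0.5]

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    tail_mass = 1.0 - probs[0]  # probability of sampling anything but the top token
    print(f"T={temperature}: top={probs[0]:.3f} tail={tail_mass:.3f}")
```

At low temperature almost all of the mass sits on the top token, so sampling is close to deterministic. As temperature rises the distribution flattens and the weaker candidates are drawn more often, which is the trade-off between diversity and reliability described above.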

Difference from other AI errors

| Error type | Description | Example |
|---|---|---|
| Factuality hallucination | False assertion stated with confidence | Citing a non-existent study |
| Faithfulness hallucination | Ignoring or contradicting the supplied context | Summarizing the opposite of what a document says |
| Bias | Systematic preference for certain groups | Citing only sources from one language or region |
| Reasoning error | Flaw in the logical chain, not in the base fact | Calculating incorrectly from correct data |
| Staleness | Information that was correct at the time but is now outdated | Citing legislation that has since been repealed |

The distinction that matters most for AI search users is that a hallucination is presented in the same confident tone as correct information, with no visible signal of uncertainty. This sets it apart from a simple arithmetic error, where the model may acknowledge doubt, and it explains why RAG mechanisms and explicit citations are necessary for high-reliability systems.

Related terms

RAG (Retrieval-Augmented Generation), Citability, Fan-out query.
