Overview Hallucinations in multimodal large language models (MLLMs)---where the model generates content inconsistent with the input image---pose significant risks in real-world applications, from misinformation in visual question answering to unsafe errors in decision-making. Existing benchmarks…