AI Hallucinations Linked to Evaluations That Reward Guessing Over Calibration, According to OpenAI Research
OpenAI, a leading artificial intelligence (AI) research organisation, has published a new study that aims to tackle the issue of AI hallucinations.
The study highlights the importance of evaluating AI systems in a way that discourages fabrication and rewards caution. Current evaluation benchmarks reward models that happen to guess correctly, even when uncertain, more highly than models that decline to answer. This incentivises AI systems to bluff, particularly larger models with partial knowledge.
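As a rough illustration of that incentive, not drawn from the paper itself, consider a model that is only 30% confident in its best guess. Under accuracy-only scoring, answering beats abstaining in expectation; the Python sketch below, with made-up reward weights, shows how the calculation flips once wrong answers carry a cost.

```python
# A minimal sketch (illustrative weights, not OpenAI's) of why accuracy-only
# grading rewards guessing: wrong answers cost nothing, so even a 30%-confident
# guess has a higher expected score than abstaining.

def expected_score(p_correct, reward_correct=1.0, penalty_wrong=0.0, score_abstain=0.0):
    """Expected score of answering vs. abstaining for a model whose best guess
    is correct with probability p_correct."""
    answer = p_correct * reward_correct + (1 - p_correct) * penalty_wrong
    return answer, score_abstain

# Accuracy-only leaderboard: guessing expects ~0.3, abstaining 0.0 -> guess.
print(expected_score(0.3))
# Scoring that penalises confident errors: guessing expects ~-0.4 -> abstain.
print(expected_score(0.3, penalty_wrong=-1.0))
```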
The researchers found that hallucinations, or the generation of inaccurate information, can be traced back to pretraining itself. AI models learn by predicting the next word across massive text datasets, without ever seeing examples labelled as "false." This lack of explicit guidance on what is true and what is not contributes to the problem.
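To see why the pretraining objective carries no notion of truth, here is a deliberately tiny next-word predictor, a toy stand-in for the real objective rather than anything OpenAI describes: it learns only which words tend to follow which, so it will complete a sentence whether or not the completion is factually correct.

```python
# A toy next-word predictor trained only on raw text. Nothing in the data or
# the objective marks a statement as true or false; the model simply counts
# which word tends to follow which. (Real LLMs use neural networks and subword
# tokens, but the objective is analogous.)
from collections import Counter, defaultdict

corpus = "the capital of france is paris . the capital of freedonia is unknown .".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word and its estimated probability."""
    counts = bigrams[word]
    best, n = counts.most_common(1)[0]
    return best, n / sum(counts.values())

# e.g. ('paris', 0.5): the guess comes from co-occurrence statistics, not truth.
print(predict_next("is"))
```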
AI systems, despite their power, are only as reliable as the information they are grounded in and as honest as they are about their own uncertainty. A chatbot's polished tone is no guarantee of truth, and some questions are inherently unanswerable. Hallucinations are expected artifacts of next-word prediction, according to the study.
In high-stakes contexts like healthcare or legal advice, a model that admits uncertainty is safer than one that invents answers. The advice is to treat an AI model like a clever but overconfident friend: useful, but in need of fact-checking whenever it says something that seems too good to be true.
Alongside its proposals for better evaluation, OpenAI points to its advanced reasoning models, including o3, o4-mini, and o3-pro, which are built to improve problem-solving accuracy and reliability. Evaluation methods that compare generated outputs with high-quality reference texts, such as BLEU and ROUGE scores, are also used to assess text quality and, by extension, to keep hallucinations in check.
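For readers unfamiliar with reference-based scoring, the sketch below shows a heavily simplified version of the idea behind ROUGE (unigram recall only; real implementations handle n-grams, stemming, and multiple references). Note how a factually wrong completion can still score highly on surface overlap, which is part of why evaluation design matters.

```python
# A simplified ROUGE-1 recall: score a generated answer by its word overlap
# with a trusted reference. Illustrative only, not a production metric.
from collections import Counter

def rouge1_recall(generated: str, reference: str) -> float:
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(gen[w], ref[w]) for w in ref)
    return overlap / max(sum(ref.values()), 1)

reference = "insulin is produced in the pancreas"
print(rouge1_recall("insulin is produced in the pancreas", reference))  # 1.0
print(rouge1_recall("insulin is produced in the liver", reference))     # ~0.83
# The second answer is factually wrong yet scores highly: surface overlap
# alone cannot tell a confident error from a correct answer.
```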
The new evaluation system aims to encourage models to admit uncertainty more often. This is achieved by redesigning evaluation scoreboards: penalising confident errors more heavily than non-responses and giving partial credit for expressions of uncertainty. Smaller models sometimes outperform larger ones in humility; a small model that knows it has not learned Māori can simply admit ignorance, while a larger model with partial knowledge is tempted to guess.
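A hedged sketch of what such a scoreboard might look like follows; the exact weights are illustrative, not values taken from OpenAI's paper.

```python
# A sketch of a grader that no longer treats "wrong" and "I don't know" the
# same way. Weights are invented for illustration.
from typing import Optional

def grade(answer: Optional[str], correct_answer: str) -> float:
    if answer is None:            # the model abstained ("I don't know")
        return 0.25               # partial credit for admitting uncertainty
    if answer == correct_answer:  # answered and right
        return 1.0
    return -1.0                   # answered and wrong: penalised hardest

print(grade("Paris", "Paris"))   #  1.0
print(grade(None, "Paris"))      #  0.25
print(grade("Lyon", "Paris"))    # -1.0
```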
The study argues that the AI hallucination problem isn't just about the models, but also about the way we measure them. Current AI evaluations reward models for guessing when uncertain, provided the guess happens to land, which feeds the problem. Calibration, or knowing what you don't know, is not a brute-force problem that only trillion-parameter giants can solve, according to the study.
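Calibration can also be measured directly. The sketch below implements a standard expected-calibration-error calculation on invented numbers: group answers by the model's stated confidence and compare that confidence with how often the answers were actually right.

```python
# A minimal sketch of measuring calibration ("knowing what you don't know").
# The confidences and outcomes are made up purely for illustration.

def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between stated confidence and observed accuracy."""
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        bucket = [(c, ok) for c, ok in zip(confidences, correct) if lo < c <= hi]
        if bucket:
            avg_conf = sum(c for c, _ in bucket) / len(bucket)
            accuracy = sum(ok for _, ok in bucket) / len(bucket)
            ece += (len(bucket) / len(confidences)) * abs(avg_conf - accuracy)
    return ece

stated  = [0.95, 0.9, 0.85, 0.65, 0.55, 0.3]   # model's claimed confidence
outcome = [1,    1,   0,    1,    0,    0]      # 1 if the answer was right
# ~0.317 on this toy data: the model claims more confidence than it earns.
print(round(expected_calibration_error(stated, outcome), 3))
```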
AI hallucinations will not disappear overnight, and no model will ever achieve 100% accuracy. With this new research, however, OpenAI aims to make significant strides in reducing hallucinations and improving the reliability of AI systems. The duality of AI hallucinations remains: they can inspire surreal art or imaginative leaps, but in day-to-day knowledge-based work they remain landmines.