AI Systems Experiencing Increased Hallucinations (Uncertainty Remains About Causes)
Hallucinating AI: A Never-Ending Nightmare
Hallucinations continue to haunt generative AI models. The same quality that makes them so creative also makes them prone to inventing facts and fabricating details. And the problem isn't easing up as the technology advances - if anything, it's getting worse.
The Troubling Numbers in OpenAI's Report
According to OpenAI's own report, its latest o3 and o4-mini models hallucinate 51% and 79% of the time, respectively, on an AI benchmark known as SimpleQA. The earlier o1 model has a 44% hallucination rate on the same test. These figures are alarming, and they're moving in the wrong direction: reasoning models, known for their slower, more deliberate answers, are clearly leaving more room for errors and inaccuracies.
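To put those percentages in context, SimpleQA-style benchmarks grade each model answer as correct, incorrect, or not attempted, and the hallucination rate is, roughly, the share of questions the model answered confidently but wrongly. The snippet below is a minimal sketch of that bookkeeping; the sample grades are invented for illustration and this is not OpenAI's actual evaluation code.

```python
# Illustrative only: tallying a hallucination rate on a QA benchmark.
# Each answer is bucketed as correct, incorrect, or not attempted;
# the grades below are made-up sample data.
from collections import Counter

graded_answers = [
    "correct", "incorrect", "incorrect", "not_attempted", "correct",
    "incorrect", "correct", "incorrect", "incorrect", "correct",
]

counts = Counter(graded_answers)
total = len(graded_answers)

# Hallucination rate here = share of questions the model answered wrongly
# (it gave an answer, and the answer was false).
hallucination_rate = counts["incorrect"] / total
accuracy = counts["correct"] / total

print(f"accuracy: {accuracy:.0%}, hallucination rate: {hallucination_rate:.0%}")
```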
Not Just ChatGPT
Mistakes aren't exclusive to OpenAI and ChatGPT. I easily got Google's AI Overviews to return incorrect answers, and AI search's trouble accurately pulling information from the web has been widely documented. Recently, a support bot for the AI coding app Cursor announced a policy change that didn't exist.
Hidden in the Shadows
You won't find many mentions of hallucinations in the AI industry's grand announcements. Along with energy consumption and copyright infringement, it's one of the topics the big names in AI prefer to steer clear of.
My Personal Experience
Personally, I haven't run into too many inaccuracies when using AI search and chatbots - certainly nothing close to a 79% error rate - but mistakes do happen. And this is a problem that may never fully vanish, especially since the teams building these models don't entirely understand why hallucinations occur.
Vectara's Slightly Better Results
In tests conducted by AI platform developer Vectara, many models showed hallucination rates of 1-3%. OpenAI's o3 model stands at 6.8%, with the smaller o4-mini at 4.6%. Those results are more in line with my own experience with these tools, but even a low rate of hallucinations can add up to major issues - particularly as we delegate more tasks to these AI systems.
The Mystery Behind Hallucinations
No one truly knows how to stop hallucinations, or fully understands their root causes. These models aren't built to follow rules set by their programmers; they learn their own ways of working and responding from their training data. Vectara CEO Amr Awadallah told the New York Times that AI models will "always hallucinate," and that these problems will "never go away."
The Missing Pieces of the Puzzle
University of Washington professor Hannaneh Hajishirzi, who is working on methods to reverse-engineer answers from AI, told the NYT that "we still don't know how these models work exactly." As with troubleshooting a car or a computer, you need to understand the problem before you can fix it.
Adding Fuel to the Flames
Researcher Neil Chowdhury, from AI analysis lab Transluce, believes that the way reasoning models are built could be exacerbating the problem. "Our hypothesis is that the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated (but not fully erased) by standard post-training pipelines," he told TechCrunch.
OpenAI's Performance Report
In OpenAI's own performance report, the company cites a lack of "world knowledge" and notes that the o3 model tends to make more claims than its predecessor, which leads to more hallucinations. Ultimately, "more research is needed to understand the cause of these results," according to OpenAI.
Glimmers of Hope
Researchers are actively working on ways to rein in hallucinations in generative AI. Pairing AI outputs with human oversight or automated fact-checking can add a layer of verification, and research into new algorithms and techniques that better distinguish generated content from real, verifiable information is ongoing.
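As a concrete illustration of that verification idea, the sketch below wraps a model call in a retrieval-and-check step and flags unsupported answers for human review. The generate_answer and retrieve_sources functions are hypothetical placeholders, and the lexical-overlap check is a deliberately crude stand-in for a real claim-verification model; none of this reflects any particular vendor's API.

```python
# Minimal sketch of a verification layer around a generative model's output.
# generate_answer() and retrieve_sources() are hypothetical stand-ins for a
# real model call and a real retrieval/search step.
from dataclasses import dataclass

@dataclass
class Verdict:
    answer: str
    supported: bool
    needs_human_review: bool
    evidence: list[str]

def generate_answer(question: str) -> str:
    # Placeholder: call your LLM of choice here.
    return "The Eiffel Tower is 330 metres tall."

def retrieve_sources(question: str) -> list[str]:
    # Placeholder: query a search index or document store here.
    return ["The Eiffel Tower stands 330 metres (1,083 ft) tall."]

def supported_by_sources(answer: str, sources: list[str], threshold: float = 0.5) -> bool:
    # Naive lexical-overlap check; a real system would use an entailment or
    # claim-verification model instead.
    answer_terms = set(answer.lower().split())
    for source in sources:
        source_terms = set(source.lower().split())
        overlap = len(answer_terms & source_terms) / max(len(answer_terms), 1)
        if overlap >= threshold:
            return True
    return False

def answer_with_verification(question: str) -> Verdict:
    answer = generate_answer(question)
    sources = retrieve_sources(question)
    supported = supported_by_sources(answer, sources)
    # Unsupported answers are routed to a human instead of being shown as fact.
    return Verdict(answer=answer, supported=supported,
                   needs_human_review=not supported, evidence=sources)

if __name__ == "__main__":
    print(answer_with_verification("How tall is the Eiffel Tower?"))
```

In practice the overlap heuristic would be replaced by a natural-language-inference or citation-checking model, but the control flow - generate, verify, escalate - is the part that matters.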
Despite these efforts, the nature of generative AI means some level of hallucination is likely to persist, which makes ongoing research and development crucial for improving the reliability of these models.
Disclosure: Lifehacker's parent company, Ziff Davis, filed a lawsuit against OpenAI in April, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

