“A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse”
“A new wave of ‘reasoning’ systems from companies like OpenAI is producing incorrect information more often. Even the companies don’t know why.”
“[OpenAI] found that o3 — its most powerful system — hallucinated 33 percent of the time when running its PersonQA benchmark test, which involves answering questions about public figures. That is more than twice the hallucination rate of OpenAI’s previous reasoning system, called o1. The new o4-mini hallucinated at an even higher rate: 48 percent.”