Cutting corners: Most search engines now present users with AI-generated overviews by default, sparking controversy over accuracy and lost click-through traffic. While testing suggests that Google's AI overviews are accurate most of the time, the enormous volume of queries the search engine processes each day likely still results in millions of incorrect responses.
According to The New York Times, testing suggests that approximately one in 10 Google AI search overviews contains false information. Given that the search engine processes roughly 5 trillion queries per year, users could be exposed to more than 57 million inaccurate answers each hour – nearly 1 million per minute – if every query returned an AI overview.
The figures come from AI startup Oumi, which the Times asked to evaluate Gemini's accuracy using SimpleQA, a widely used generative AI benchmark. After analyzing 4,326 Google searches, Oumi found that Google's AI assistant, Gemini 2, produced accurate overviews 85 percent of the time in October. By February, Gemini 3 had improved that figure to 91 percent.
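The per-hour and per-minute estimates above can be checked with some quick arithmetic. The Python sketch below is only a back-of-envelope illustration: it assumes a 365-day year, evenly spread traffic, and an AI overview on every query, none of which the reporting confirms. The function name `wrong_answers` is made up for this example.

```python
# Back-of-envelope check of the scale estimate quoted above.
QUERIES_PER_YEAR = 5e12      # "roughly 5 trillion queries per year" (from the article)
HOURS_PER_YEAR = 365 * 24    # assumption: non-leap year, traffic spread evenly


def wrong_answers(error_rate: float) -> tuple[float, float]:
    """Return (per_hour, per_minute) inaccurate overviews for a given error rate.

    Assumption: every query produces an AI overview, which overstates exposure
    if only some searches show one.
    """
    per_hour = QUERIES_PER_YEAR * error_rate / HOURS_PER_YEAR
    return per_hour, per_hour / 60


scenarios = [
    ("one in 10 (headline figure)", 0.10),
    ("Gemini 2, 85 percent accurate", 1 - 0.85),
    ("Gemini 3, 91 percent accurate", 1 - 0.91),
]

for label, rate in scenarios:
    per_hour, per_minute = wrong_answers(rate)
    print(f"{label}: {per_hour / 1e6:.1f}M per hour, {per_minute / 1e6:.2f}M per minute")
```

At the one-in-10 rate this works out to roughly 57 million inaccurate answers per hour, or a little under 1 million per minute, which matches the figures quoted above. Plugging in the measured 85 and 91 percent accuracy rates shifts the total up or down accordingly.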

However, Oumi can evaluate large volumes of results only by relying on AI tools, which may also introduce errors. In addition, Google sometimes generates different AI overviews for the same query, even when it is repeated seconds apart.
A Google spokesperson called Oumi's testing flawed, arguing that it does not reflect real-world search behavior. The company's internal testing indicates that Gemini 3, when operating independently of Google Search, hallucinates 28 percent of the time.
Sourcing presents another challenge. Google attempts to support its AI overview results with relevant links, but those sources often do not substantiate Gemini's claims – whether accurate or not.
In some cases, an incorrect AI overview is immediately followed by a link containing correct information; in others, an accurate overview cites a source with inaccurate information; and sometimes the linked pages contain no relevant information at all. Notably, discrepancies between AI overviews and their sources increased after the February update, rising from 37 percent of searches with Gemini 2 to 56 percent with Gemini 3.

Researchers also found that AI overviews are susceptible to manipulation. In one example, a BBC journalist published a blog post containing false information and later found that Google repeated those claims the following day.
Tellingly, Google and other AI companies acknowledge the technology's tenuous relationship with the truth in the fine print. Microsoft's terms of service describe its Copilot AI tool as intended for entertainment purposes, not for making important decisions. Google's AI overviews advise users to double-check responses, while xAI acknowledges that hallucinations can occur.