Date: 2025-10-25

Facepalm: Tech giants are investing heavily in generative AI chatbots and integrating them across their platforms despite their impact on internet traffic and concerns about accuracy. A recent study has found that, while chatbots are becoming more accurate, they still get information from news outlets wrong nearly half the time.

Analysis from the BBC and other European news outlets has found that around 45 percent of AI chatbot responses based on news articles contain errors. The findings have potentially severe implications as tech platforms continue promoting these chatbots.

OpenAI, Google, Microsoft, and other companies are encouraging users to interact with the internet through AI chatbots and other tools designed to summarize information and automate analysis. While AI developers have spent years minimizing hallucinations, evidence indicates they have a long way to go – and that's assuming the problem is even solvable.

The BBC and 22 public media organizations, spanning 18 countries and 14 languages, gave chatbots access to their content. When the researchers queried the chatbots about specific stories, they found issues with nearly half of all AI-generated output. These included inaccurate sentences, misquotes, and outdated information, but sourcing was the biggest challenge.

Chatbots often provided links that did not match the sources they cited. Even when they accurately referenced material, they frequently could not distinguish opinion from fact or differentiate satire from regular news.

Aside from introducing factual errors or misattributing quotes, chatbots can be slow to update information on political figures and other leaders. For example, ChatGPT, Copilot, and Gemini incorrectly stated that Pope Francis is the current pope after Leo XIV had succeeded him. Copilot even reported Francis's date of death correctly while still describing him as the current pope. ChatGPT also gave outdated responses when naming the current German chancellor and NATO's secretary-general.

These inaccuracies persisted across languages and regions. Furthermore, Google's Gemini was far less accurate than ChatGPT, Copilot, and Perplexity, with significant sourcing errors in 72 percent of its responses.

At one time, OpenAI blamed errors like these on early versions of ChatGPT being trained only on information up to September 2021 and lacking access to the live internet. That is no longer the case, so these errors should theoretically not happen anymore – suggesting the issue may be inherent in the algorithms and not something easily fixed.

However, these more recent results show improvement compared to a study the BBC conducted alone in February. Since then, the portion of responses with serious errors fell from 51 to 37 percent, but Gemini still lags far behind.

Despite the poor results, researchers also found that a concerning portion of the public trusts AI-generated answers. More than one-third of British adults and nearly half of adults under 35 trust AI to accurately summarize the news. Moreover, if an AI misrepresents a news outlet's content, 42 percent of adults would either blame both the AI and the original source or trust the source less. If these problems persist, the increasing popularity of generative AI tools could seriously damage news outlets' credibility.