BBC study finds AI chatbots still get news wrong 45% of the time

Daniel Sims

Facepalm: Tech giants are investing heavily in generative AI chatbots and integrating them across their platforms despite their impact on internet traffic and concerns about accuracy. A recent study has found that, while chatbots are becoming more accurate, they still get information from news outlets wrong nearly half the time.

Analysis from the BBC and other European news outlets has found that around 45 percent of AI chatbot responses based on news articles contain errors. The findings have potentially severe implications as tech platforms continue promoting them.

OpenAI, Google, Microsoft, and other companies are encouraging users to interact with the internet through AI chatbots and other tools designed to summarize information and automate analysis. While AI developers have spent years minimizing hallucinations, evidence indicates they have a long way to go – and that's assuming the problem is even solvable.

The BBC and 22 public media organizations, working across 18 countries and 14 languages, gave chatbots access to their content. When the chatbots were queried about specific stories, the researchers found issues with nearly half of all AI-generated output. These included inaccurate sentences, misquotes, and outdated information, but sourcing was the biggest challenge.

Chatbots often provided links that did not match the sources they cited. Even when they accurately referenced material, they frequently could not distinguish opinion from fact or satire from regular news.

Aside from introducing factual errors or misattributing quotes, chatbots can be slow to update information on political figures and other leaders. For example, ChatGPT, Copilot, and Gemini incorrectly stated that Pope Francis is the current pope after Leo XIV had succeeded him. Copilot even reported Francis's date of death correctly while still describing him as the current pope. ChatGPT also gave outdated responses when naming the current German chancellor and NATO's secretary-general.

These inaccuracies persisted across languages and regions. Furthermore, Google's Gemini is far less accurate than ChatGPT, Copilot, and Perplexity, with significant sourcing errors in 72 percent of its responses.

At one time, OpenAI blamed errors like these on early versions of ChatGPT being trained only on information up to September 2021 and lacking access to the live internet. That is no longer the case, so these errors should theoretically not happen anymore – suggesting the issue may be inherent in the algorithms and not something easily fixed.

However, these more recent results show improvement compared to a study the BBC conducted alone in February. Since then, the portion of responses with serious errors fell from 51 to 37 percent, but Gemini still lags far behind.

Despite the poor results, researchers also found that a concerning portion of the public trusts AI-generated answers. More than one-third of British adults and nearly half of adults under 35 trust AI to accurately summarize the news. Moreover, if an AI misrepresents a news outlet's content, 42 percent of adults would either blame both the AI and the original source or trust the source less. If these problems persist, the increasing popularity of generative AI tools could seriously damage news outlets' credibility.

 
Given how unreliable a source BBC is, I'm more inclined to think their claim is wrong, rather than LLMs being wrong 45% of the time.
 

Yes, I much prefer getting my news from Truth Social, much more reliable.
 
Do you want to feel informed?
Read a newspaper.

Do you want to be misinformed?
Read another one.

Do you want to be totally screwed?
Ask a chatbot.
 
Seems ironic considering the previous weekend a bunch of verified users cited an unvoted AI note without even verifying its sources, and kept arguing with Grok in the comments.

I don't understand this social media trend but it seems like social media does everything to not promote any actual real discussion or thought while fake stuff runs rampant 24/7 to take advantage of it
 
If you close your mind to listening to just one single type of media network, no wonder you're "still waiting" for that to happen.
That's why I watch and listen to the BBC, the arbiters of truth, the very definition of journalism, the bastion of integrity and the network that reported the collapse of WTC 7 12 minutes before it did.
 
Did you know that California governor Gavin Newscum gets 99% wrong and he is still running the state? I think 45% is pretty forgivable.
 