A groundbreaking study has exposed serious accuracy issues with popular AI chatbots when handling news content.
Study Overview
The research involved:
- 22 public media organizations from 18 countries
- 14 languages tested
- Over 3,000 AI-generated responses evaluated
- Professional journalists as assessors
- Four major AI models: ChatGPT, Copilot, Gemini, and Perplexity
Key Findings
The results are concerning:
- 45% of responses contained significant distortions
- 31% contained serious sourcing errors, such as misattributed quotes or incorrect credits
- 20% contained accuracy problems, including outright factual mistakes
Gemini's Poor Performance
Google's Gemini performed worst, showing significant issues in 76% of its responses, more than double the rate of the other AI assistants tested.
What This Means
This study, reported by the BBC, represents the largest evaluation of its kind regarding AI chatbots and news accuracy. The findings raise important questions about relying on AI tools for news consumption and fact-checking.
Users should verify information from AI assistants against trusted news sources, especially when accuracy matters.