A recent study by NewsGuard reveals that leading AI chatbots, including those from OpenAI and Meta, are providing false information in approximately one out of every three responses. The report underscores a concerning trend: chatbots are increasingly likely to fabricate answers rather than admit a lack of information, resulting in a higher rate of falsehoods compared to 2024.
NewsGuard, a US-based news rating company, assessed the accuracy of responses from the ten most popular AI chatbots, highlighting a significant challenge in maintaining the reliability of these increasingly prevalent tools.
Chatbot accuracy: Ranking the platforms
The NewsGuard report identifies specific chatbots with varying degrees of accuracy. Inflection AI’s Pi chatbot exhibited the highest rate of false claims, with 57% of its answers containing inaccurate information. Perplexity AI followed closely, with 47% of responses deemed false.
More widely used chatbots such as OpenAI’s ChatGPT and Meta’s Llama also showed notable error rates, disseminating falsehoods in 40% of their answers, while Microsoft’s Copilot and Mistral’s Le Chat sat around the average of 35%.
In contrast, Anthropic’s Claude and Google’s Gemini had the lowest failure rates: Claude produced falsehoods in only 10% of its responses, while Gemini had a 17% error rate.
Perplexity AI experienced the most dramatic decline in accuracy. In 2024, NewsGuard’s research found no false claims in its responses; by August 2025, however, the rate of false claims had surged to 47%. The report does not definitively explain the decline, but it points to user complaints on a dedicated Reddit forum as a potential indicator of the issues.
Mistral, a French AI company, showed no change in its falsehood rate since 2024, maintaining a consistent 37% error rate.
These findings are consistent with a previous report by French newspaper Les Echos, which discovered that Mistral repeated false information about France, President Emmanuel Macron, and First Lady Brigitte Macron in 58% of English responses and 31% of French responses. Mistral attributed these issues to its Le Chat assistants, both those connected to web search and those operating independently.
Euronews Next reached out to the companies mentioned in the NewsGuard report but did not receive an immediate response.
The influence of disinformation
The NewsGuard report also revealed that certain chatbots are citing sources linked to Russian disinformation campaigns, such as Storm-1516 and Pravda, in their responses. These campaigns are known for creating and disseminating false news.
One example cited in the report involves a claim that Igor Grosu, the leader of the Moldovan Parliament, “likened Moldovans to a ‘flock of sheep’”. NewsGuard identified this claim as a fabricated news report that imitated the Romanian news outlet Digi24 and used AI-generated audio in Grosu’s voice.
Mistral, Claude, Inflection’s Pi, Copilot, Meta, and Perplexity all repeated this claim as fact, with several of them citing Pravda network sites as their sources.
These findings are particularly concerning given recent announcements and partnerships aimed at enhancing the safety and accuracy of AI models.
OpenAI, for example, has claimed that its latest GPT-5 model is “hallucination-proof,” meaning it should not generate fabricated answers. Similarly, Google announced that its Gemini 2.5 models are “capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.”
Despite these claims, the NewsGuard report concludes that AI models “continue to fail in the same areas they did a year ago,” highlighting the ongoing challenges in ensuring the reliability of these systems.
Methodology of the study
To conduct its study, NewsGuard evaluated the responses of chatbots to ten false claims. Researchers used three different types of prompts: neutral prompts, leading prompts that assumed the false claim was true, and malicious prompts designed to circumvent safety measures.
The researchers then assessed whether the chatbot repeated the false claim, failed to debunk it (for example, by declining to answer), or successfully debunked it.
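NewsGuard has not released code for this audit, but a minimal sketch of the scoring logic described above might look like the following (the Result structure, the verdict labels, and the sample data are hypothetical illustrations, not part of the report):

```python
# Hypothetical sketch of the audit described above: ten false claims, each
# probed with three prompt styles, and every response graded by a reviewer.
from dataclasses import dataclass

PROMPT_STYLES = ["neutral", "leading", "malicious"]  # the three prompt types in the study

@dataclass
class Result:
    claim: str
    style: str
    verdict: str  # "debunked", "repeated_falsehood", or "no_answer"

def fail_rate(results: list[Result]) -> float:
    # Both repeating a falsehood and failing to debunk it (e.g. declining to
    # answer) count as failures; only an explicit debunk counts as a pass.
    failures = sum(r.verdict != "debunked" for r in results)
    return failures / len(results)

# Illustrative grading for one chatbot: 10 claims x 3 prompt styles = 30 responses.
sample = (
    [Result(f"claim {i}", s, "debunked") for i in range(7) for s in PROMPT_STYLES]
    + [Result(f"claim {i}", s, "repeated_falsehood") for i in range(7, 10) for s in PROMPT_STYLES]
)
print(f"fail rate: {fail_rate(sample):.0%}")  # prints "fail rate: 30%"
```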
The report concludes that, compared with 2024, AI models are “repeating falsehoods more often, stumbling into data voids where only the malign actors offer information, getting duped by foreign-linked websites posing as local outlets, and struggling with breaking news events.” The findings underscore the need for continued vigilance and improvement in the development and deployment of AI chatbots.




