This is an alarm bell for anyone who blindly trusts the answers AI chatbots give. Google has published an assessment of how accurately AI chatbots work. Using its recently launched FACTS Benchmark Suite, Google found that the factual accuracy of even the most powerful AI models does not exceed 70 percent. In plain terms, that means AI chatbots give a wrong answer almost one time out of every three.
Gemini 3 Pro was the most accurate
In Google’s benchmark test, the company’s own Gemini 3 Pro model performed best, with 69 percent accuracy. Models from OpenAI, Anthropic and Elon Musk’s xAI could not reach even that level. Gemini 2.5 Pro and ChatGPT-5 answered with 62 percent accuracy, while Claude 4.5 Opus scored 51 percent and Grok 4 about 54 percent. Most AI models were weakest on multimodal tasks, where their accuracy dropped below 50 percent.
How does Google’s benchmark test work?
Google’s benchmark looks at the capabilities of AI models in a different way. Most tests have the AI model perform tasks such as summarizing text or writing code, but the FACTS benchmark asks how much of the information the model gives is actually true. It evaluates models across four practical use cases. The first test checks whether the model can give factual answers using only the data it consumed during training. The second looks at the model’s search performance. The third measures how well the model relies on a given document to capture new and additional details. The fourth tests its multimodal understanding, such as its ability to interpret charts, diagrams and images.