Study: ChatGPT shows high error rate in scientific fact-checking

A Washington State University study found ChatGPT-3.5 incorrectly evaluated scientific hypotheses over 20% of the time, raising reliability concerns.

Image: news.wsu.edu

A recent study from Washington State University has quantified significant inaccuracies in ChatGPT's ability to fact-check scientific claims. Led by Professor Mesut Cicek, researchers tested the AI model by having it evaluate whether specific hypotheses from published scientific papers were supported by subsequent research.

The study, published in the journal 'Information Services & Use', found that ChatGPT-3.5 gave incorrect evaluations for more than one in five of the scientific statements it was asked to assess. The errors ran in both directions: false positives, where the model claimed a hypothesis was supported when it was not, and false negatives, where it failed to recognize findings that were supported.

Researchers also noted that the AI's responses were inconsistent, sometimes giving different answers when the same question was asked repeatedly. This inconsistency, combined with the high error rate, highlights the risk of relying on current-generation large language models for scientific verification without human oversight.

The study's authors emphasize that while AI tools like ChatGPT are powerful for generating text and ideas, their use for factual validation, especially in specialized fields like science, requires caution and critical evaluation by domain experts.

📰 Original source: news.wsu.edu