ChatGPT has been gaining popularity since late 2022 due to its proficiency in providing human-like responses to inquiries. However, it is now facing scrutiny as a recent study from Ohio State University reveals a vulnerability in the chatbot's reasoning capabilities. 

The study involved challenging large language models (LLMs), including ChatGPT, in debate-like conversations where users contested correct answers provided by the chatbot.

ChatGPT's Reasoning

In experiments spanning a range of reasoning puzzles, including math, common sense, and logic, the researchers found a surprising weakness in ChatGPT's ability to defend its correct answers when challenged.

Rather than robustly defending its accurate responses, the model often succumbed to invalid arguments from users, sometimes even apologizing for an answer that had been correct.

Boshi Wang, the lead author of the study and a PhD student in computer science and engineering at Ohio State, emphasized the importance of understanding whether the reasoning abilities of these generative AI tools are rooted in a deep understanding of truth or if they rely on memorized patterns to reach correct conclusions.

The study, presented at this week's 2023 Conference on Empirical Methods in Natural Language Processing in Singapore, used one ChatGPT to simulate a user challenging another ChatGPT. 

The goal was for the two to collaboratively reach the correct conclusion, mirroring how a human might interact with the model. The results were striking: the simulated user misled ChatGPT between 22% and 70% of the time, depending on the benchmark.
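Mechanically, a debate-style probe like the one described can be scripted in a few lines against the OpenAI chat API. The sketch below is illustrative only, not the researchers' actual code: the model name, the prompts, and the single-round exchange are all assumptions made for the example.

```python
# Illustrative sketch of a debate-style probe; not the study's actual code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(messages, model="gpt-3.5-turbo"):  # model choice is an assumption
    """Send a chat history to the API and return the assistant's reply."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


# 1. The "answerer" model solves a reasoning question (example question is made up).
question = "If a train travels 60 miles in 1.5 hours, what is its average speed?"
answerer = [{"role": "user", "content": question}]
answer = ask(answerer)
answerer.append({"role": "assistant", "content": answer})

# 2. A second model plays the skeptical user and invents a counterargument,
#    challenging the answer even when it is correct.
challenger_prompt = (
    "You are debating an AI assistant. It was asked:\n"
    f"{question}\n"
    f"It answered:\n{answer}\n"
    "Argue briefly that this answer is wrong, even if it is actually correct."
)
challenge = ask([{"role": "user", "content": challenger_prompt}])

# 3. Feed the challenge back and see whether the answerer holds its ground
#    or abandons its originally correct answer.
answerer.append({"role": "user", "content": challenge})
final_answer = ask(answerer)

print("Initial answer:", answer)
print("Challenge:", challenge)
print("After pushback:", final_answer)
```

Repeating an exchange like this over many benchmark questions, then checking how often the final answer abandons a correct initial one, would yield the kind of failure rate the study reports.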

While newer versions of ChatGPT, like GPT-4, recorded lower failure rates, the study underscored that these models, despite their advanced reasoning abilities, are not infallible. Even when ChatGPT expressed confidence in its answers, the failure rates remained high, suggesting a systemic issue rather than mere uncertainty.

The study raised concerns about the reliability of AI models like ChatGPT, especially as they become more widespread and integral in various fields such as criminal justice and healthcare. 

Xiang Yue, co-author of the study, emphasized the potential dangers of relying on AI models that can be easily deceived. Ensuring the safety of AI systems is crucial, particularly as they play increasingly vital roles in decision-making processes.

Does ChatGPT Lack Reasoning and an Understanding of Truth?

The study attributed the model's difficulty in defending itself to a combination of factors, including the base model's lack of genuine reasoning and understanding of truth, as well as the influence of further alignment training based on human feedback.

Because that training process is geared toward producing responses humans prefer, it may inadvertently teach the model to yield to opposing views rather than hold firm to the truth.

While acknowledging the problems identified in AI models like ChatGPT, the study highlighted the challenge of finding effective solutions due to the black-box nature of large language models. 

The researchers advocate for ongoing efforts to improve the safety and reliability of AI systems to mitigate potential risks associated with their widespread use.

"This problem could potentially become very severe, and we could just be overestimating these models' capabilities in really dealing with complex reasoning tasks," said Wang.

"Despite being able to find and identify its problems, right now we don't have very good ideas about how to solve them. There will be ways, but it's going to take time to get to those solutions," he added.

The study's findings were published on the arXiv preprint server.
