Does ChatGPT have Theory of Mind?
- URL: http://arxiv.org/abs/2305.14020v2
- Date: Wed, 13 Sep 2023 11:22:19 GMT
- Title: Does ChatGPT have Theory of Mind?
- Authors: Bart Holterman and Kees van Deemter
- Abstract summary: Theory of Mind (ToM) is the ability to understand human thinking and decision-making.
This paper investigates to what extent recent Large Language Models in the ChatGPT tradition possess ToM.
- Score: 2.3129337924262927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Theory of Mind (ToM) is the ability to understand human thinking and
decision-making, an ability that plays a crucial role in social interaction
between people, including linguistic communication. This paper investigates to
what extent recent Large Language Models in the ChatGPT tradition possess ToM.
We posed six well-known problems that address biases in human reasoning and
decision making to two versions of ChatGPT and we compared the results under a
range of prompting strategies. While the results concerning ChatGPT-3 were
somewhat inconclusive, ChatGPT-4 was shown to arrive at the correct answers
more often than would be expected based on chance, although correct answers
were often arrived at on the basis of false assumptions or invalid reasoning.
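The abstract does not spell out which six problems or which prompting strategies were used, but the evaluation it describes (posing a fixed set of reasoning problems under several prompt variants and checking whether accuracy exceeds chance) can be sketched roughly as follows. The problem set, strategy names, chance level, and the `query_model` helper are illustrative assumptions, not the authors' actual materials.

```python
# Minimal sketch of the kind of evaluation loop the abstract describes:
# pose a fixed set of reasoning problems to a chat model under several
# prompting strategies, score the answers, and test whether accuracy is
# higher than chance. All names below are hypothetical placeholders.

from scipy.stats import binomtest

PROBLEMS = [
    # (problem statement, expected answer) -- a stand-in example of a
    # well-known reasoning-bias problem, not necessarily one of the six
    # used in the paper
    ("A bat and a ball cost $1.10 in total; the bat costs $1.00 more "
     "than the ball. How much does the ball cost?", "5 cents"),
]

STRATEGIES = ["plain", "step_by_step", "persona"]  # assumed prompt variants


def query_model(prompt: str) -> str:
    """Placeholder for the call to the model under test (e.g. an API request)."""
    raise NotImplementedError


def evaluate(chance_level: float = 0.25) -> tuple[float, float]:
    correct, total = 0, 0
    for strategy in STRATEGIES:
        for statement, gold in PROBLEMS:
            answer = query_model(f"[{strategy}] {statement}")
            correct += int(gold.lower() in answer.lower())
            total += 1
    # One-sided binomial test: is accuracy above what chance would predict?
    result = binomtest(correct, total, p=chance_level, alternative="greater")
    return correct / total, result.pvalue
```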
Related papers
- Complementary Advantages of ChatGPTs and Human Readers in Reasoning: Evidence from English Text Reading Comprehension [12.240611073541597]
ChatGPT has shown great power in text processing, including its ability to reason from the text it reads.
There has not been any direct comparison between human readers and ChatGPT in reasoning ability related to text reading.
This study was undertaken to investigate how ChatGPTs and Chinese senior school students exhibited their reasoning ability from English narrative texts.
arXiv Detail & Related papers (2023-11-17T06:13:02Z)
- Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency to select labels at earlier positions as the answer (a rough probing sketch appears after this list).
We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
- Performance of ChatGPT on USMLE: Unlocking the Potential of Large Language Models for AI-Assisted Medical Education [0.0]
This study determined how reliable ChatGPT can be for answering complex medical and clinical questions.
The paper evaluated the results using a two-way ANOVA and post-hoc analysis.
ChatGPT-generated answers were found to be more context-oriented than regular Google search results.
arXiv Detail & Related papers (2023-06-30T19:53:23Z)
- ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models [49.52083248451775]
Large language models (LLMs) have made significant progress in NLP.
We specifically focus on ChatGPT, a widely used and easily accessible LLM.
We conduct a series of experiments on 11 datasets to evaluate ChatGPT's commonsense abilities.
arXiv Detail & Related papers (2023-03-29T03:05:43Z)
- Consistency Analysis of ChatGPT [65.268245109828]
This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour.
Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions.
arXiv Detail & Related papers (2023-03-11T01:19:01Z)
- Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT [103.57103957631067]
ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.
We evaluate ChatGPT's understanding ability by evaluating it on the most popular GLUE benchmark, and comparing it with 4 representative fine-tuned BERT-style models.
We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves performance comparable to BERT on sentiment analysis and question answering tasks.
arXiv Detail & Related papers (2023-02-19T12:29:33Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
- A Categorical Archive of ChatGPT Failures [47.64219291655723]
ChatGPT, developed by OpenAI, has been trained using massive amounts of data and simulates human conversation.
It has garnered significant attention due to its ability to effectively answer a broad range of human inquiries.
However, a comprehensive analysis of ChatGPT's failures is lacking, which is the focus of this study.
arXiv Detail & Related papers (2023-02-06T04:21:59Z)
- How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection [8.107721810172112]
ChatGPT is able to respond effectively to a wide range of human questions.
People are starting to worry about the potential negative impacts that large language models (LLMs) like ChatGPT could have on society.
In this work, we collected tens of thousands of comparison responses from both human experts and ChatGPT.
arXiv Detail & Related papers (2023-01-18T15:23:25Z)
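One of the related papers above studies the primacy effect, i.e. a preference for labels placed earlier in the prompt. A rough, hypothetical way to probe for such an effect (not taken from that paper) is to present the same multiple-choice question with the options rotated and count which position gets picked; the `ask` helper below is a placeholder for an actual model query.

```python
# Sketch of a primacy-effect probe: present every ordering of the same answer
# options and tally picks by *position*. All names are illustrative placeholders.

from collections import Counter
from itertools import permutations


def ask(question: str, options: list[str]) -> int:
    """Placeholder: query the model and return the index of the option it picked."""
    raise NotImplementedError


def primacy_counts(question: str, options: list[str]) -> Counter:
    picks_by_position = Counter()
    for order in permutations(options):
        picks_by_position[ask(question, list(order))] += 1
    # If the model were insensitive to order, picks would spread evenly across
    # positions; a heavy skew toward position 0 suggests a primacy effect.
    return picks_by_position
```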
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.