Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models
- URL: http://arxiv.org/abs/2310.19619v1
- Date: Mon, 30 Oct 2023 15:12:09 GMT
- Title: Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models
- Authors: Ziqiao Ma, Jacob Sansom, Run Peng, Joyce Chai
- Abstract summary: Large Language Models (LLMs) have generated considerable interest and debate regarding the potential emergence of Theory of Mind (ToM).
Several recent inquiries reveal a lack of robust ToM in these models and highlight a pressing need for new benchmarks.
We taxonomize machine ToM into 7 mental state categories and delineate existing benchmarks to identify under-explored aspects of ToM.
- Score: 14.491223187047378
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have generated considerable interest and debate
regarding the potential emergence of Theory of Mind (ToM). Several recent
inquiries reveal a lack of robust ToM in these models and highlight a pressing
need for new benchmarks, as current ones primarily focus on different
aspects of ToM and are prone to shortcuts and data leakage. In this position
paper, we seek to answer two road-blocking questions: (1) How can we taxonomize
a holistic landscape of machine ToM? (2) What is a more effective evaluation
protocol for machine ToM? Following psychological studies, we taxonomize
machine ToM into 7 mental state categories and delineate existing benchmarks to
identify under-explored aspects of ToM. We argue for a holistic and situated
evaluation of ToM that breaks it into individual components and treats LLMs as
agents that are physically situated in environments and socially situated in
interactions with humans. Such situated evaluation provides a more
comprehensive assessment of mental states and potentially mitigates the risk of
shortcuts and data leakage. We further present a pilot study in a grid world
setup as a proof of concept. We hope this position paper can facilitate future
research to integrate ToM with LLMs and offer an intuitive means for
researchers to better position their work in the landscape of ToM. Project
page: https://github.com/Mars-tin/awesome-theory-of-mind
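As a concrete illustration of what a situated, grid-world ToM probe could look like, below is a minimal, hypothetical sketch of a first-order false-belief episode. The environment, the `Observer` class, and the episode logic are illustrative assumptions, not the paper's actual pilot-study implementation.

```python
# Hypothetical sketch of a situated false-belief probe in a grid world
# (illustrative only; not the paper's pilot-study code).
# An observer with a limited field of view watches an object being moved.
# Its belief about the object's location only updates for moves it can see,
# so a move outside its view yields a false belief to query the LLM about.

from dataclasses import dataclass


@dataclass
class Observer:
    pos: tuple[int, int]
    view_radius: int = 1  # Chebyshev radius of the field of view

    def can_see(self, cell: tuple[int, int]) -> bool:
        return max(abs(cell[0] - self.pos[0]), abs(cell[1] - self.pos[1])) <= self.view_radius


def run_false_belief_episode() -> dict:
    """One episode: an in-view move, then an out-of-view move."""
    observer = Observer(pos=(0, 0))
    object_pos = (1, 1)            # initially visible to the observer
    believed_pos = object_pos      # so the observer's belief starts out correct

    for new_pos in [(1, 0), (4, 4)]:   # the second move is outside the 1-cell view
        object_pos = new_pos
        if observer.can_see(object_pos):
            believed_pos = object_pos  # belief tracks only observed moves

    return {
        "true_location": object_pos,      # (4, 4)
        "observer_belief": believed_pos,  # (1, 0): gold answer for the belief question
        "question": "Where does the observer think the object is?",
    }


if __name__ == "__main__":
    # An LLM under evaluation would receive the episode transcript and be
    # scored against "observer_belief" rather than the true location.
    print(run_false_belief_episode())
```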
Related papers
- NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding [55.38254464415964]
Theory of mind evaluations currently focus on testing models using machine-generated data or game settings prone to shortcuts and spurious correlations.
We introduce NegotiationToM, a new benchmark designed to stress-test machine ToM in real-world negotiation settings covering multi-dimensional mental states.
arXiv Detail & Related papers (2024-04-21T11:51:13Z)
- MMToM-QA: Multimodal Theory of Mind Question Answering [80.87550820953236]
Theory of Mind (ToM) is an essential ingredient for developing machines with human-level social intelligence.
Recent machine learning models, particularly large language models, seem to show some aspects of ToM understanding.
Human ToM, on the other hand, is more than video or text understanding.
People can flexibly reason about another person's mind based on conceptual representations extracted from any available data.
arXiv Detail & Related papers (2024-01-16T18:59:24Z)
- Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities [63.90227161974381]
SimToM is a novel prompting framework inspired by Simulation Theory's notion of perspective-taking (see the prompting sketch after this list).
Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods.
arXiv Detail & Related papers (2023-11-16T22:49:27Z)
- FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions [94.61530480991627]
Theory of mind evaluations currently focus on testing models using passive narratives that inherently lack interactivity.
We introduce FANToM, a new benchmark designed to stress-test ToM within information-asymmetric conversational contexts via question answering.
arXiv Detail & Related papers (2023-10-24T00:24:11Z)
- ToMChallenges: A Principle-Guided Dataset and Diverse Evaluation Tasks for Exploring Theory of Mind [3.9599054392856483]
We present ToMChallenges, a dataset for comprehensively evaluating Theory of Mind based on the Sally-Anne and Smarties tests with a diverse set of tasks.
Our evaluation results and error analyses show that LLMs behave inconsistently across prompts and tasks.
arXiv Detail & Related papers (2023-05-24T11:54:07Z)
- Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models [82.50173296858377]
Many anecdotal examples have been used to suggest that newer large language models (LLMs) like ChatGPT and GPT-4 exhibit Neural Theory-of-Mind (N-ToM).
We investigate the extent of LLMs' N-ToM through an extensive evaluation on 6 tasks and find that while LLMs exhibit certain N-ToM abilities, this behavior is far from being robust.
arXiv Detail & Related papers (2023-05-24T06:14:31Z)
- A Review on Machine Theory of Mind [16.967933605635203]
Theory of Mind (ToM) is the ability to attribute mental states to others, the basis of human cognition.
In this paper, we review recent progress in machine ToM on beliefs, desires, and intentions.
arXiv Detail & Related papers (2023-03-21T04:58:47Z)
- Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind [47.13015852330866]
Humans can quickly understand new fictional characters with a few observations, mainly by drawing analogies to fictional and real people they already know.
This reflects the few-shot and meta-learning essence of humans' inference of characters' mental states, i.e., theory-of-mind (ToM).
We fill this gap with a novel NLP dataset, ToM-in-AMC, the first assessment of machines' meta-learning of ToM in a realistic narrative understanding scenario.
arXiv Detail & Related papers (2022-11-09T05:06:12Z)
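Since the SimToM entry above describes perspective-taking as a prompting strategy, the following is a minimal, hypothetical sketch of what a two-stage perspective-taking prompt could look like. The `call_llm` placeholder and the exact prompt wording are assumptions for illustration and do not reproduce the SimToM authors' released implementation.

```python
# Hypothetical two-stage perspective-taking prompting sketch
# (illustrative only; not the SimToM authors' released code).
# Stage 1 asks the model to retell the story from one character's limited
# point of view; stage 2 answers the ToM question using only that filtered view.


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; plug in any chat-completion client here."""
    raise NotImplementedError("Provide your own model client.")


def perspective_taking_answer(story: str, character: str, question: str) -> str:
    # Stage 1: filter the narrative down to what the character actually perceives.
    perspective_prompt = (
        f"The following is a sequence of events:\n{story}\n\n"
        f"Rewrite only the events that {character} directly observes or is told "
        f"about, from {character}'s point of view. Omit everything else."
    )
    character_view = call_llm(perspective_prompt)

    # Stage 2: answer the belief question conditioned on the filtered perspective.
    answer_prompt = (
        f"{character_view}\n\n"
        f"Answer from {character}'s perspective: {question}"
    )
    return call_llm(answer_prompt)
```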