Think Twice: Perspective-Taking Improves Large Language Models'
Theory-of-Mind Capabilities
- URL: http://arxiv.org/abs/2311.10227v1
- Date: Thu, 16 Nov 2023 22:49:27 GMT
- Title: Think Twice: Perspective-Taking Improves Large Language Models'
Theory-of-Mind Capabilities
- Authors: Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency
- Abstract summary: SimToM is a novel prompting framework inspired by Simulation Theory's notion of perspective-taking.
Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods.
- Score: 63.90227161974381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human interactions are deeply rooted in the interplay of thoughts, beliefs,
and desires made possible by Theory of Mind (ToM): our cognitive ability to
understand the mental states of ourselves and others. Although ToM may come
naturally to us, emulating it presents a challenge to even the most advanced
Large Language Models (LLMs). Recent improvements to LLMs' reasoning
capabilities from simple yet effective prompting techniques such as
Chain-of-Thought have seen limited applicability to ToM. In this paper, we turn
to the prominent cognitive science theory "Simulation Theory" to bridge this
gap. We introduce SimToM, a novel two-stage prompting framework inspired by
Simulation Theory's notion of perspective-taking. To implement this idea on
current ToM benchmarks, SimToM first filters context based on what the
character in question knows before answering a question about their mental
state. Our approach, which requires no additional training and minimal
prompt-tuning, shows substantial improvement over existing methods, and our
analysis reveals the importance of perspective-taking to Theory-of-Mind
capabilities. Our findings suggest perspective-taking as a promising direction
for future research into improving LLMs' ToM capabilities.
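The two-stage procedure described above (first filter the context to what the character knows, then answer from their perspective) can be sketched as follows. This is a minimal, hypothetical illustration: `ask_llm` stands in for any chat-completion call, and the prompt wording is illustrative rather than the paper's exact templates.

```python
# Hypothetical sketch of SimToM-style two-stage prompting.
# Stage 1 performs perspective-taking; stage 2 answers the ToM question
# using only the filtered context. Prompt text is an assumption.

PERSPECTIVE_PROMPT = (
    "The following is a sequence of events:\n{story}\n"
    "Which of these events does {character} know about? "
    "Rewrite the story, keeping only those events."
)

ANSWER_PROMPT = (
    "{filtered_story}\n"
    "You are {character}. Based only on the story above, "
    "answer the question: {question}"
)

def simtom(ask_llm, story: str, character: str, question: str) -> str:
    """Two-stage prompting: perspective-taking, then question answering."""
    # Stage 1: filter the context down to what the character has observed.
    filtered = ask_llm(
        PERSPECTIVE_PROMPT.format(story=story, character=character)
    )
    # Stage 2: answer the mental-state question from that perspective.
    return ask_llm(
        ANSWER_PROMPT.format(
            filtered_story=filtered, character=character, question=question
        )
    )
```

Because both stages are plain prompts to the same model, the approach needs no additional training; only the two templates are tuned.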
Related papers
- Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection [31.38516078163367]
ToM-agent is designed to empower LLM-based generative agents to simulate ToM in open-domain conversational interactions.
ToM-agent disentangles confidence from mental states, facilitating the emulation of an agent's perception of its counterpart's mental states.
Our findings indicate that the ToM-agent can grasp the underlying reasons for its counterpart's behaviors beyond mere semantic-emotional support or common-sense decision-making.
arXiv Detail & Related papers (2025-01-26T00:32:38Z)
- Decompose-ToM: Enhancing Theory of Mind Reasoning in Large Language Models through Simulation and Task Decomposition [2.089191490381739]
Theory of Mind (ToM) is the ability to understand and reflect on the mental states of others.
Large Language Models (LLMs) possess only a rudimentary understanding of ToM.
We propose "Decompose-ToM": an LLM-based inference algorithm that improves model performance on complex ToM tasks.
arXiv Detail & Related papers (2025-01-15T18:44:01Z)
- Imagine while Reasoning in Space: Multimodal Visualization-of-Thought [70.74453180101365]
Chain-of-Thought (CoT) prompting has proven highly effective for enhancing complex reasoning in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs).
We propose a new reasoning paradigm, Multimodal Visualization-of-Thought (MVoT).
It enables visual thinking in MLLMs by generating image visualizations of their reasoning traces.
arXiv Detail & Related papers (2025-01-13T18:23:57Z)
- Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models [51.91448005607405]
We evaluate key human ToM precursors by annotating characters' perceptions on ToMi and FANToM.
We present PercepToM, a novel ToM method leveraging LLMs' strong perception inference capability while supplementing their limited perception-to-belief inference.
arXiv Detail & Related papers (2024-07-08T14:58:29Z)
- NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding [55.38254464415964]
Theory of mind evaluations currently focus on testing models using machine-generated data or game settings prone to shortcuts and spurious correlations.
We introduce NegotiationToM, a new benchmark designed to stress-test machine ToM in real-world negotiation settings covering multi-dimensional mental states.
arXiv Detail & Related papers (2024-04-21T11:51:13Z)
- What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models [50.97705264224828]
We propose Counterfactual Inception, a novel method that implants counterfactual thinking into Large Multi-modal Models.
We aim for the models to engage with and generate responses grounded in a wider contextual understanding of the scene.
Comprehensive analyses across various LMMs, including both open-source and proprietary models, corroborate that counterfactual thinking significantly reduces hallucination.
arXiv Detail & Related papers (2024-03-20T11:27:20Z)
- HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models [31.831042765744204]
Theory of Mind (ToM) is the ability to reason about one's own and others' mental states.
We introduce HI-TOM, a Higher Order Theory of Mind benchmark.
Our experimental evaluation using various Large Language Models (LLMs) indicates a decline in performance on higher-order ToM tasks.
arXiv Detail & Related papers (2023-10-25T16:41:15Z)
- FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions [94.61530480991627]
Theory of mind evaluations currently focus on testing models using passive narratives that inherently lack interactivity.
We introduce FANToM, a new benchmark designed to stress-test ToM within information-asymmetric conversational contexts via question answering.
arXiv Detail & Related papers (2023-10-24T00:24:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.