Enhancing Conversational Agents with Theory of Mind: Aligning Beliefs, Desires, and Intentions for Human-Like Interaction
- URL: http://arxiv.org/abs/2502.14171v4
- Date: Wed, 05 Mar 2025 01:41:45 GMT
- Title: Enhancing Conversational Agents with Theory of Mind: Aligning Beliefs, Desires, and Intentions for Human-Like Interaction
- Authors: Mehdi Jafari, Devin Yuncheng Hua, Hao Xue, Flora Salim
- Abstract summary: Open source language models (LLaMA) can capture and preserve ToM-related information. We investigate whether explicit manipulation of ToM-related components, such as beliefs, desires, and intentions, can enhance response alignment.
- Score: 1.7311858910424738
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Natural language interaction with agentic Artificial Intelligence (AI), driven by Large Language Models (LLMs), is expected to remain a dominant paradigm in the near future. While humans instinctively align their communication with mental states -- an ability known as Theory of Mind (ToM) -- current LLM-powered systems exhibit significant limitations in this regard. This study examines the extent to which open-source language models (LLaMA) can capture and preserve ToM-related information, and how effectively that information contributes to consistent ToM reasoning in generated responses. We further investigate whether explicit manipulation of ToM-related components, such as beliefs, desires, and intentions, can enhance response alignment. Experiments on two LLaMA 3 variants demonstrate that incorporating ToM-informed alignment improves response quality, achieving win rates of 67 and 63 percent for the 3B and 8B models, respectively. These findings highlight the potential of ToM-driven strategies to improve alignment in LLM-based conversational agents.
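The abstract does not spell out how beliefs, desires, and intentions (BDI) are made explicit during generation, so the snippet below is only a minimal sketch of one plausible ToM-informed pipeline, not the authors' implementation: a first pass elicits the interlocutor's BDI from the dialogue history, and a second pass conditions the reply on that summary. The model id, prompts, and two-stage structure are illustrative assumptions.

```python
# Hypothetical sketch of ToM-informed response alignment (not the paper's code).
# Assumes a recent transformers version whose text-generation pipeline accepts
# chat-style message lists, and an illustrative LLaMA 3 instruct checkpoint.
from transformers import pipeline

chat = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct")

def ask(messages, max_new_tokens=256):
    # The pipeline returns the full conversation; the last message is the new reply.
    out = chat(messages, max_new_tokens=max_new_tokens)
    return out[0]["generated_text"][-1]["content"]

def tom_informed_reply(dialogue: str) -> str:
    # Step 1: make the other speaker's beliefs, desires, and intentions explicit.
    bdi = ask([
        {"role": "system", "content": "List the other speaker's beliefs, desires, "
                                       "and intentions as three short bullet points."},
        {"role": "user", "content": dialogue},
    ])
    # Step 2: generate a reply that stays consistent with that mental-state summary.
    return ask([
        {"role": "system", "content": "Reply to the last turn, staying consistent "
                                       "with this mental-state summary:\n" + bdi},
        {"role": "user", "content": dialogue},
    ])

print(tom_informed_reply("A: I've been saving for months to see that concert.\nB:"))
```

A win-rate evaluation in the spirit of the reported 67 and 63 percent figures would then compare such BDI-conditioned replies against a direct single-pass baseline in pairwise judgments.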
Related papers
- EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States [15.557449564031975]
Theory-of-Mind (ToM) is fundamental to human interaction but remains a challenging task for Large Language Models (LLMs).
We present EnigmaToM, a novel neuro-symbolic framework that enhances ToM reasoning by integrating a Neural Knowledge Base of entity states (Enigma).
Experimental results on multiple benchmarks, including ToMi, HiToM, and FANToM, show that EnigmaToM significantly improves ToM reasoning across LLMs of varying sizes.
arXiv Detail & Related papers (2025-03-05T10:13:05Z) - Re-evaluating Theory of Mind evaluation in large language models [3.262532929657758]
We take inspiration from cognitive science to re-evaluate the state of ToM evaluation in large language models.
A major reason for the disagreement on whether LLMs have ToM is a lack of clarity on whether models should be expected to match human behaviors.
We conclude by discussing several directions for future research, including the relationship between ToM and pragmatic communication.
arXiv Detail & Related papers (2025-02-28T14:36:57Z) - Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection [31.38516078163367]
ToM-agent is designed to empower LLM-based generative agents to simulate ToM in open-domain conversational interactions.
ToM-agent disentangles confidence from mental states, facilitating the emulation of an agent's perception of its counterpart's mental states.
Our findings indicate that the ToM-agent can grasp the underlying reasons for its counterpart's behaviors, going beyond mere semantic-emotional support or common-sense decision-making.
arXiv Detail & Related papers (2025-01-26T00:32:38Z) - Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models [51.91448005607405]
We evaluate key human ToM precursors by annotating characters' perceptions on ToMi and FANToM.
We present PercepToM, a novel ToM method leveraging LLMs' strong perception inference capability while supplementing their limited perception-to-belief inference.
arXiv Detail & Related papers (2024-07-08T14:58:29Z) - MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset [50.36095192314595]
For Large Language Models (LLMs) to function as conscious agents with generalizable reasoning capabilities, they need to reason about how events change.
This ability remains underexplored due to the complexity of modeling the infinite possible changes in an event.
We introduce MARS, the first benchmark of its kind, comprising three tasks corresponding to each step of this reasoning process.
arXiv Detail & Related papers (2024-06-04T08:35:04Z) - What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models [50.97705264224828]
We propose Counterfactual Inception, a novel method that implants counterfactual thinking into Large Multi-modal Models.
We aim for the models to engage with and generate responses that reflect a wider contextual understanding of the scene.
Comprehensive analyses across various LMMs, including both open-source and proprietary models, corroborate that counterfactual thinking significantly reduces hallucination.
arXiv Detail & Related papers (2024-03-20T11:27:20Z) - Comparing Rationality Between Large Language Models and Humans: Insights and Open Questions [6.201550639431176]
This paper focuses on the burgeoning prominence of large language models (LLMs).
We underscore the pivotal role of Reinforcement Learning from Human Feedback (RLHF) in augmenting LLMs' rationality and decision-making prowess.
arXiv Detail & Related papers (2024-03-14T18:36:04Z) - ToMBench: Benchmarking Theory of Mind in Large Language Models [41.565202027904476]
ToM is the cognitive capability to perceive and ascribe mental states to oneself and others.
Existing ToM evaluations are hindered by challenges such as constrained scope, subjective judgment, and unintended contamination.
We introduce ToMBench with three key characteristics: a systematic evaluation framework encompassing 8 tasks and 31 abilities in social cognition, a multiple-choice question format to support automated and unbiased evaluation, and a build-from-scratch bilingual inventory to strictly avoid data leakage.
arXiv Detail & Related papers (2024-02-23T02:05:46Z) - Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities [63.90227161974381]
SimToM is a novel prompting framework inspired by Simulation Theory's notion of perspective-taking.
Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods.
arXiv Detail & Related papers (2023-11-16T22:49:27Z) - MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [98.18244218156492]
Large Language Models (LLMs) have significantly advanced natural language processing.
As their applications expand into multi-agent environments, there arises a need for a comprehensive evaluation framework.
This work introduces a novel competition-based benchmark framework to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z) - Unveiling Theory of Mind in Large Language Models: A Parallel to Single Neurons in the Human Brain [2.5350521110810056]
Large language models (LLMs) have been found to exhibit a certain level of Theory of Mind (ToM).
The precise processes underlying LLMs' capacity for ToM, and their similarity to those of humans, remain largely unknown.
arXiv Detail & Related papers (2023-09-04T15:26:15Z) - Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models [82.50173296858377]
Many anecdotal examples have been used to suggest that newer large language models (LLMs) like ChatGPT and GPT-4 exhibit Neural Theory-of-Mind (N-ToM).
We investigate the extent of LLMs' N-ToM through an extensive evaluation on 6 tasks and find that while LLMs exhibit certain N-ToM abilities, this behavior is far from being robust.
arXiv Detail & Related papers (2023-05-24T06:14:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.