TactfulToM: Do LLMs Have the Theory of Mind Ability to Understand White Lies?
- URL: http://arxiv.org/abs/2509.17054v2
- Date: Wed, 24 Sep 2025 18:47:31 GMT
- Title: TactfulToM: Do LLMs Have the Theory of Mind Ability to Understand White Lies?
- Authors: Yiwei Liu, Emma Jane Pretty, Jiahao Huang, Saku Sugawara
- Abstract summary: We introduce TactfulToM, a novel English benchmark designed to evaluate Large Language Models' (LLMs) ability to understand white lies within real-life conversations. Our benchmark is generated through a multi-stage human-in-the-loop pipeline where LLMs expand manually designed seed stories into conversations to maintain the information asymmetry necessary for authentic white lies. We show that TactfulToM is challenging for state-of-the-art models, which perform substantially below humans, revealing shortcomings in their ability to fully comprehend the ToM reasoning that enables true understanding of white lies.
- Score: 13.075782848287487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While recent studies explore Large Language Models' (LLMs) performance on Theory of Mind (ToM) reasoning tasks, research on ToM abilities that require more nuanced social context is limited, such as white lies. We introduce TactfulToM, a novel English benchmark designed to evaluate LLMs' ability to understand white lies within real-life conversations and reason about prosocial motivations behind them, particularly when they are used to spare others' feelings and maintain social harmony. Our benchmark is generated through a multi-stage human-in-the-loop pipeline where LLMs expand manually designed seed stories into conversations to maintain the information asymmetry between participants necessary for authentic white lies. We show that TactfulToM is challenging for state-of-the-art models, which perform substantially below humans, revealing shortcomings in their ability to fully comprehend the ToM reasoning that enables true understanding of white lies.
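The abstract describes a generation pipeline in which seed stories are expanded into conversations while preserving the information asymmetry that makes a white lie possible (the liar knows something the protected listener does not). A minimal sketch of that idea is below; all names, data structures, and the stub expansion step are illustrative assumptions, not the authors' actual code.

```python
# Hypothetical sketch of a benchmark-generation pipeline in the spirit of
# TactfulToM: seed stories are expanded into conversations only if the
# information asymmetry required for an authentic white lie holds.
# Everything here (SeedStory, field names, the expansion stub) is an
# illustrative assumption, not the paper's implementation.
from dataclasses import dataclass


@dataclass
class SeedStory:
    """A manually designed seed: who knows which facts, plus the white lie."""
    facts: dict[str, set[str]]   # participant -> facts they know
    white_lie: str               # the tactful untruth to embed
    protected: str               # participant the lie is meant to spare


def has_information_asymmetry(story: SeedStory) -> bool:
    """An authentic white lie requires that some other participant knows a
    fact the protected listener does not."""
    listener_knows = story.facts[story.protected]
    return any(
        facts - listener_knows
        for participant, facts in story.facts.items()
        if participant != story.protected
    )


def expand_to_conversation(story: SeedStory) -> list[str]:
    """Placeholder for the LLM expansion step; a real pipeline would prompt
    a model here and have human annotators verify the result."""
    all_facts = set().union(*story.facts.values())
    hidden = sorted(all_facts - story.facts[story.protected])
    return [
        f"(setup) {story.protected} is unaware of: {hidden}",
        f"(lie) {story.white_lie}",
    ]


seed = SeedStory(
    facts={"Ana": {"the cake is burnt"}, "Ben": set()},
    white_lie="Ana: 'The cake tastes great, Ben!'",
    protected="Ben",
)
assert has_information_asymmetry(seed)   # gate the seed before expanding
conversation = expand_to_conversation(seed)
```

The asymmetry check acts as the gate the abstract alludes to: if expansion ever leaks the hidden fact to the protected participant, the lie is no longer "white" in the benchmark's sense and the item would be rejected by the human-in-the-loop stage.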
Related papers
- Because we have LLMs, we Can and Should Pursue Agentic Interpretability [22.10895793309226]
Large Language Models (LLMs) proactively assist human understanding by developing and leveraging a mental model of the user. Agentic interpretability introduces challenges, particularly in evaluation, due to what we call its 'human-entangled-in-the-loop' nature. Agentic interpretability's promise is to help humans learn the potentially deceptive concepts of the LLMs, rather than see us fall increasingly far from understanding them.
arXiv Detail & Related papers (2025-06-13T18:13:58Z)
- XToM: Exploring the Multilingual Theory of Mind for Large Language Models [57.9821865189077]
Existing evaluations of Theory of Mind in LLMs are largely limited to English. We present XToM, a rigorously validated multilingual benchmark that evaluates ToM across five languages. Our findings expose limitations in LLMs' ability to replicate human-like mentalizing across linguistic contexts.
arXiv Detail & Related papers (2025-06-03T05:23:25Z)
- How Deep is Love in LLMs' Hearts? Exploring Semantic Size in Human-like Cognition [75.11808682808065]
This study investigates whether large language models (LLMs) exhibit similar tendencies in understanding semantic size. Our findings reveal that multi-modal training is crucial for LLMs to achieve more human-like understanding. Lastly, we examine whether LLMs are influenced by attention-grabbing headlines with larger semantic sizes in a real-world web shopping scenario.
arXiv Detail & Related papers (2025-03-01T03:35:56Z)
- The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters [67.61587661660852]
Theory-of-Mind (ToM) allows humans to understand and interpret the mental states of others. In this paper, we verify the importance of comprehensive contextual understanding about personal backgrounds in ToM. We introduce the CharToM benchmark, comprising 1,035 ToM questions based on characters from classic novels.
arXiv Detail & Related papers (2025-01-03T09:04:45Z)
- Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses [11.121931601655174]
Theory of Mind (ToM) reasoning entails recognizing that other individuals possess their own intentions, emotions, and thoughts.
Large language models (LLMs) excel in tasks such as summarization, question answering, and translation.
Despite advancements, the extent to which LLMs truly understand ToM reasoning remains inadequately explored in open-ended scenarios.
arXiv Detail & Related papers (2024-06-09T05:57:59Z)
- LLM Theory of Mind and Alignment: Opportunities and Risks [0.0]
There is growing interest in whether large language models (LLMs) have theory of mind (ToM).
This paper identifies key areas in which LLM ToM will show up in human:LLM interactions at individual and group levels.
It lays out a broad spectrum of potential implications and suggests the most pressing areas for future research.
arXiv Detail & Related papers (2024-05-13T19:52:16Z)
- Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding [1.3654846342364308]
Large Language Models (LLMs) are unparalleled in their ability to generate grammatically correct, fluent text.
This position paper critically assesses three points recurring in critiques of LLM capacities.
We outline a pragmatic perspective on the issue of 'real' understanding and intentionality in LLMs.
arXiv Detail & Related papers (2023-10-30T15:51:04Z)
- FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions [94.61530480991627]
Theory of mind evaluations currently focus on testing models using passive narratives that inherently lack interactivity.
We introduce FANToM, a new benchmark designed to stress-test ToM within information-asymmetric conversational contexts via question answering.
arXiv Detail & Related papers (2023-10-24T00:24:11Z)
- Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years.
Despite numerous successful applications, the underlying mechanism of such in-context capabilities remains unclear.
In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)
- Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs [77.88043871260466]
We show that one of today's largest language models lacks this kind of social intelligence out of the box.
We conclude that person-centric NLP approaches might be a more effective path toward neural Theory of Mind.
arXiv Detail & Related papers (2022-10-24T14:58:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.