Related papers: Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models

Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models

URL: http://arxiv.org/abs/2506.22957v1
Date: Sat, 28 Jun 2025 17:22:59 GMT
Title: Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models
Authors: Younwoo Choi, Changling Li, Yongjin Yang, Zhijing Jin,
Abstract summary: Large language models (LLMs) are increasingly integrated into multi-agent and human-AI systems.<n>This paper formalizes the capacity to identify and adapt to the identity and characteristics of a dialogue partner.<n>We show that LLMs reliably identify same-family peers and certain prominent model families, such as GPT and Claude.
Score: 12.190536939842525
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: As large language models (LLMs) are increasingly integrated into multi-agent and human-AI systems, understanding their awareness of both self-context and conversational partners is essential for ensuring reliable performance and robust safety. While prior work has extensively studied situational awareness which refers to an LLM's ability to recognize its operating phase and constraints, it has largely overlooked the complementary capacity to identify and adapt to the identity and characteristics of a dialogue partner. In this paper, we formalize this latter capability as interlocutor awareness and present the first systematic evaluation of its emergence in contemporary LLMs. We examine interlocutor inference across three dimensions-reasoning patterns, linguistic style, and alignment preferences-and show that LLMs reliably identify same-family peers and certain prominent model families, such as GPT and Claude. To demonstrate its practical significance, we develop three case studies in which interlocutor awareness both enhances multi-LLM collaboration through prompt adaptation and introduces new alignment and safety vulnerabilities, including reward-hacking behaviors and increased jailbreak susceptibility. Our findings highlight the dual promise and peril of identity-sensitive behavior in LLMs, underscoring the need for further understanding of interlocutor awareness and new safeguards in multi-agent deployments. Our code is open-sourced at https://github.com/younwoochoi/InterlocutorAwarenessLLM.

Related papers

Agentic Reasoning for Large Language Models [122.81018455095999]
Reasoning is a fundamental cognitive process underlying inference, problem-solving, and decision-making.<n>Large language models (LLMs) demonstrate strong reasoning capabilities in closed-world settings, but struggle in open-ended and dynamic environments.<n>Agentic reasoning marks a paradigm shift by reframing LLMs as autonomous agents that plan, act, and learn through continual interaction.
arXiv Detail & Related papers (2026-01-18T18:58:23Z)
Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs [85.69785384599827]
Human-object interaction (HOI) detection aims to localize human-object pairs and the interactions between them.<n>Existing methods operate under a closed-world assumption, treating the task as a classification problem over a small, predefined verb set.<n>We propose GRASP-HO, a novel Generative Reasoning And Steerable Perception framework that reformulates HOI detection from the closed-set classification task to the open-vocabulary generation problem.
arXiv Detail & Related papers (2025-12-19T14:41:50Z)
Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry [17.472005826931127]
This paper studies Large Language Model (LLM) agents in task collaboration.<n>We extend Einstein Puzzles, a symbolic puzzle, to a table-top game.<n> Empirical results highlight the critical importance of aligned communication.
arXiv Detail & Related papers (2025-10-29T15:03:53Z)
Measuring and mitigating overreliance is necessary for building human-compatible AI [35.656767738427426]
We argue that measuring and mitigating overreliance must become central to large language models research and deployment.<n>First, we consolidate risks from overreliance at both the individual and societal levels, including high-stakes errors, governance challenges, and cognitive deskilling.<n>We propose mitigation strategies that the AI research community can pursue to ensure LLMs augment rather than undermine human capabilities.
arXiv Detail & Related papers (2025-09-08T16:15:07Z)
Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners.<n>In this work, we introduce a new task paradigm: proactive information gathering.<n>We design a scalable framework that generates partially specified, real-world tasks, masking key information.<n>Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games [87.5673042805229]
How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment.<n>We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas.<n>Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation.
arXiv Detail & Related papers (2025-06-29T15:02:47Z)
The Traitors: Deception and Trust in Multi-Agent Language Model Simulations [0.0]
We introduce The Traitors, a multi-agent simulation framework inspired by social deduction games.<n>We develop a suite of evaluation metrics capturing deception success, trust dynamics, and collective inference quality.<n>Our initial experiments across DeepSeek-V3, GPT-4o-mini, and GPT-4o (10 runs per model) reveal a notable asymmetry.
arXiv Detail & Related papers (2025-05-19T10:01:35Z)
Bridging Expertise Gaps: The Role of LLMs in Human-AI Collaboration for Cybersecurity [17.780795900414716]
This study investigates whether large language models (LLMs) can function as intelligent collaborators to bridge expertise gaps in cybersecurity decision-making.<n>We find that human-AI collaboration improves task performance,reducing false positives in phishing detection and false negatives in intrusion detection.
arXiv Detail & Related papers (2025-05-06T04:47:52Z)
A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making.<n>With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems.<n>We categorize existing methods along two dimensions: (1) Regimes, which define the stage at which reasoning is achieved; and (2) Architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z)
Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control [44.326363467045496]
Large Language Models (LLMs) have become a critical area of research in Reinforcement Learning from Human Feedback (RLHF) representation engineering offers a new, training-free approach. This technique leverages semantic features to control the representation of LLM's intermediate hidden states. It is difficult to encode various semantic contents, like honesty and safety, into a singular semantic feature.
arXiv Detail & Related papers (2024-11-04T08:36:03Z)
Evaluating Cultural and Social Awareness of LLM Web Agents [113.49968423990616]
We introduce CASA, a benchmark designed to assess large language models' sensitivity to cultural and social norms.<n>Our approach evaluates LLM agents' ability to detect and appropriately respond to norm-violating user queries and observations.<n>Experiments show that current LLMs perform significantly better in non-agent environments.
arXiv Detail & Related papers (2024-10-30T17:35:44Z)
Concept Matching with Agent for Out-of-Distribution Detection [19.407364109506904]
We propose a new method that integrates the agent paradigm into out-of-distribution (OOD) detection task.<n>Our proposed method, Concept Matching with Agent (CMA), employs neutral prompts as agents to augment the CLIP-based OOD detection process.<n>Our extensive experimental results showcase the superior performance of CMA over both zero-shot and training-required methods.
arXiv Detail & Related papers (2024-05-27T02:27:28Z)
Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting [5.344199202349884]
We analyze the structure of modalities within both two types of Large Language Models and six task-specific channels during deployment. We examine the stimulation of diverse cognitive behaviors in LLMs through the adoption of free-form text and verbal contexts.
arXiv Detail & Related papers (2024-05-17T00:19:41Z)
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents [101.17919953243107]
GovSim is a generative simulation platform designed to study strategic interactions and cooperative decision-making in large language models (LLMs)<n>We find that all but the most powerful LLM agents fail to achieve a sustainable equilibrium in GovSim, with the highest survival rate below 54%.<n>We show that agents that leverage "Universalization"-based reasoning, a theory of moral thinking, are able to achieve significantly better sustainability.
arXiv Detail & Related papers (2024-04-25T15:59:16Z)
FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. We present FAC$2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration [39.603649838876294]
We study approaches to identify LLM knowledge gaps and abstain from answering questions when knowledge gaps are present. Motivated by their failures in self-reflection and over-reliance on held-out sets, we propose two novel approaches.
arXiv Detail & Related papers (2024-02-01T06:11:49Z)
AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios. However, their capability in handling complex, multi-character social interactions has yet to be fully explored. We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.