SELF-PERCEPT: Introspection Improves Large Language Models' Detection of Multi-Person Mental Manipulation in Conversations
- URL: http://arxiv.org/abs/2505.20679v1
- Date: Tue, 27 May 2025 03:51:25 GMT
- Title: SELF-PERCEPT: Introspection Improves Large Language Models' Detection of Multi-Person Mental Manipulation in Conversations
- Authors: Danush Khanna, Pratinav Seth, Sidhaarth Sredharan Murali, Aditya Kumar Guru, Siddharth Shukla, Tanuj Tyagi, Sandeep Chaurasia, Kripabandhu Ghosh
- Abstract summary: Mental manipulation is a subtle yet pervasive form of abuse in interpersonal communication. We introduce the MultiManip dataset, comprising 220 multi-turn, multi-person dialogues balanced between manipulative and non-manipulative interactions. We propose SELF-PERCEPT, a novel, two-stage prompting framework inspired by Self-Perception Theory.
- Score: 3.9207535345233873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mental manipulation is a subtle yet pervasive form of abuse in interpersonal communication, making its detection critical for safeguarding potential victims. However, due to manipulation's nuanced and context-specific nature, identifying manipulative language in complex, multi-turn, and multi-person conversations remains a significant challenge for large language models (LLMs). To address this gap, we introduce the MultiManip dataset, comprising 220 multi-turn, multi-person dialogues balanced between manipulative and non-manipulative interactions, all drawn from reality shows that mimic real-world scenarios. For manipulative interactions, it includes 11 distinct manipulations depicting real-life scenarios. We conduct extensive evaluations of state-of-the-art LLMs, such as GPT-4o and Llama-3.1-8B, employing various prompting strategies. Despite their capabilities, these models often struggle to detect manipulation effectively. To overcome this limitation, we propose SELF-PERCEPT, a novel, two-stage prompting framework inspired by Self-Perception Theory, demonstrating strong performance in detecting multi-person, multi-turn mental manipulation. Our code and data are publicly available at https://github.com/danushkhanna/self-percept .
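The abstract does not spell out the two prompting stages, but Self-Perception Theory holds that people infer their own attitudes by observing their own behavior. Below is a minimal sketch of what such a two-stage pipeline could look like, assuming a generic `llm` text-completion callable; the prompts, function name, and label format are illustrative guesses, not the authors' released prompts (see the linked repository for those).

```python
from typing import Callable

def self_percept(dialogue: str, llm: Callable[[str], str]) -> str:
    """Two-stage prompting sketch loosely following Self-Perception Theory:
    first have the model articulate each speaker's observable behavior,
    then judge manipulation from that self-generated perception."""
    # Stage 1: elicit a behavioral description of each participant.
    perceptions = llm(
        "Read the multi-person dialogue below. For each speaker, describe "
        "their observable behavior and apparent intent, one line per speaker.\n\n"
        f"Dialogue:\n{dialogue}"
    )
    # Stage 2: classify, using the stage-1 perceptions as added context.
    verdict = llm(
        f"Behavioral descriptions of each speaker:\n{perceptions}\n\n"
        f"Dialogue:\n{dialogue}\n\n"
        "Given the descriptions and the dialogue, answer with exactly one "
        "word: manipulative or non-manipulative."
    )
    return verdict.strip().lower()
```

Separating perception from judgment mirrors the paper's premise: the model first commits to a description of what each participant is doing before being asked whether that behavior is manipulative.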
Related papers
- Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset [113.25650486482762]
We introduce the Seamless Interaction dataset, a large-scale collection of over 4,000 hours of face-to-face interaction footage. This dataset enables the development of AI technologies that understand dyadic embodied dynamics. We develop a suite of models that utilize the dataset to generate dyadic motion gestures and facial expressions aligned with human speech.
arXiv Detail & Related papers (2025-06-27T18:09:49Z) - MentalMAC: Enhancing Large Language Models for Detecting Mental Manipulation via Multi-Task Anti-Curriculum Distillation [12.845142027768881]
Mental manipulation is a subtle yet pervasive form of psychological abuse that poses serious threats to mental health. MentalMAC is a multi-task anti-curriculum distillation method that enhances LLMs' ability to detect mental manipulation in multi-turn dialogue. Our approach includes: (i) EvoSA, an unsupervised data expansion method based on evolutionary operations and speech act theory; (ii) teacher model-generated multi-task supervision; and (iii) progressive knowledge distillation from complex to simpler tasks.
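The distinguishing piece here is the anti-curriculum schedule, which distills teacher supervision in order of decreasing task difficulty. A minimal sketch of that ordering follows; the task names, difficulty scores, and training stub are illustrative assumptions, not the paper's implementation.

```python
# Illustrative anti-curriculum schedule: teacher-supervised tasks are
# distilled into the student from most to least complex.
tasks = [
    {"name": "binary_manipulation_label",   "difficulty": 1},
    {"name": "identify_manipulator",        "difficulty": 2},
    {"name": "explain_manipulation_tactic", "difficulty": 3},
]

def distill_step(task: dict) -> None:
    # Placeholder for one distillation pass on teacher-generated
    # supervision for this task (training loop omitted).
    print(f"distilling: {task['name']}")

for task in sorted(tasks, key=lambda t: t["difficulty"], reverse=True):
    distill_step(task)
```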
arXiv Detail & Related papers (2025-05-21T08:34:06Z) - Can LLM Agents Maintain a Persona in Discourse? [3.286711575862228]
Large Language Models (LLMs) are widely used as conversational agents, exploiting their capabilities in various sectors such as education, law, medicine, and more. LLMs are often prone to context-shifting behaviour, resulting in a lack of consistent and interpretable personality-aligned interactions. We show that while LLMs can be guided toward personality-driven dialogue, their ability to maintain personality traits varies significantly depending on the combination of models and discourse settings.
arXiv Detail & Related papers (2025-02-17T14:36:39Z) - Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding [85.63710017456792]
FuSe is a novel approach that enables finetuning visuomotor generalist policies on heterogeneous sensor modalities. We show that FuSe enables performing challenging tasks that require reasoning jointly over modalities such as vision, touch, and sound. Experiments in the real world show that FuSe is able to increase success rates by over 20% compared to all considered baselines.
arXiv Detail & Related papers (2025-01-08T18:57:33Z) - Visual Error Patterns in Multi-Modal AI: A Statistical Approach [0.0]
Multi-modal large language models (MLLMs) excel at integrating text and visual data but face systematic challenges when interpreting ambiguous or incomplete visual stimuli. This study leverages statistical modeling to analyze the factors driving these errors, using a dataset of geometric stimuli characterized by features like 3D, rotation, and missing face/side.
arXiv Detail & Related papers (2024-11-27T01:20:08Z) - Who is Undercover? Guiding LLMs to Explore Multi-Perspective Team Tactic in the Game [3.8284679578037246]
We use the language logic game "Who is Undercover?" as an experimental platform to propose the Multi-Perspective Team Tactic (MPTT) framework.
MPTT aims to cultivate LLMs' human-like language expression logic, multi-dimensional thinking, and self-perception in complex scenarios.
Preliminary results show that MPTT, combined with WIU, leverages LLMs' cognitive capabilities to create a decision-making framework that can simulate real society.
arXiv Detail & Related papers (2024-10-20T06:41:31Z) - Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a "multimodal transcript".
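A minimal sketch of what folding behavior cues into a single "multimodal transcript" for a text-only LLM might look like; the cue names and bracket formatting are assumptions for illustration, not the paper's actual scheme.

```python
def build_multimodal_transcript(turns: list[dict]) -> str:
    """Interleave each utterance with bracketed non-verbal cue annotations
    so a text-only LLM can reason over both modalities in one pass."""
    lines = []
    for turn in turns:
        cues = ", ".join(f"{k}={v}" for k, v in turn["cues"].items())
        lines.append(f'{turn["speaker"]}: {turn["text"]} [{cues}]')
    return "\n".join(lines)

turns = [
    {"speaker": "A", "text": "So what did you think of it?",
     "cues": {"gaze": "partner", "tone": "rising"}},
    {"speaker": "B", "text": "Hmm, it was fine, I guess.",
     "cues": {"gaze": "averted", "tone": "flat"}},
]
# The resulting transcript would then be passed to an LLM together
# with an engagement-rating prompt.
print(build_multimodal_transcript(turns))
```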
arXiv Detail & Related papers (2024-09-13T18:28:12Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, a framework for better data construction and model tuning. To address insufficient data usage, we incorporate strategies such as Chain-of-Thought prompting and anti-induction. To address rigid behavior patterns, we design the tuning process and introduce automated DPO to enhance the specificity and dynamism of the models' personalities.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations [41.661208833153225]
Mental manipulation is a significant form of abuse in interpersonal conversations.
Our study introduces a new dataset, named MentalManip, which consists of 4,000 annotated movie dialogues.
This dataset enables a comprehensive analysis of mental manipulation, pinpointing both the techniques utilized for manipulation and the vulnerabilities targeted in victims.
arXiv Detail & Related papers (2024-05-26T14:27:48Z) - What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models [50.97705264224828]
We propose Counterfactual Inception, a novel method that implants counterfactual thinking into Large Multi-modal Models.
We aim for the models to engage with and generate responses that draw on a wider contextual understanding of the scene.
Comprehensive analyses across various LMMs, including both open-source and proprietary models, corroborate that counterfactual thinking significantly reduces hallucination.
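As summarized, the method's core is implanting counterfactual keywords into the model's reasoning before it answers. Here is a minimal text-only sketch under the assumption that keywords are first elicited and then injected into the answering prompt; the prompt wording and the generic `llm` callable are illustrative, and the real method operates on images with multi-modal models.

```python
from typing import Callable

def counterfactual_inception(question: str, scene: str,
                             llm: Callable[[str], str]) -> str:
    """Sketch: elicit 'what if' counterfactual keywords about the scene,
    then answer with those counterfactuals in context to curb hallucination."""
    # Step 1: ask for counterfactual keywords about the visual context.
    keywords = llm(
        f"Scene: {scene}\n"
        "List three short 'what if' keywords describing ways this scene "
        "could plausibly differ from a first impression."
    )
    # Step 2: answer while explicitly weighing the counterfactuals.
    return llm(
        f"Scene: {scene}\n"
        f"Counterfactuals to consider: {keywords}\n"
        f"Question: {question}\n"
        "Answer only what the scene actually supports."
    )
```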
arXiv Detail & Related papers (2024-03-20T11:27:20Z) - ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation [22.071450379253235]
We introduce an innovative approach for robot manipulation that leverages the robust reasoning capabilities of Multimodal Large Language Models (MLLMs).
By fine-tuning the injected adapters, we preserve the inherent common sense and reasoning ability of the MLLMs while equipping them with manipulation ability.
Experiments in simulation and the real world show the promising performance of ManipLLM.
arXiv Detail & Related papers (2023-12-24T06:38:11Z) - Detecting and Grounding Multi-Modal Media Manipulation and Beyond [93.08116982163804]
We highlight a new research problem for multi-modal fake media, namely Detecting and Grounding Multi-Modal Media Manipulation (DGM4).
DGM4 aims to not only detect the authenticity of multi-modal media, but also ground the manipulated content.
We propose a novel HierArchical Multi-modal Manipulation rEasoning tRansformer (HAMMER) to fully capture the fine-grained interaction between different modalities.
arXiv Detail & Related papers (2023-09-25T15:05:46Z)