SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents
- URL: http://arxiv.org/abs/2411.07965v3
- Date: Mon, 16 Dec 2024 19:24:42 GMT
- Title: SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents
- Authors: Chuyi Kong, Ziyang Luo, Hongzhan Lin, Zhiyuan Fan, Yaxin Fan, Yuxi Sun, Jing Ma
- Abstract summary: We propose a generalizable, explicit and effective paradigm to unlock the interactive patterns in diverse worldviews. Specifically, we define the interactive hallucination based on stance transfer and construct a benchmark, SHARP, by extracting relations from a general commonsense knowledge graph. Our findings explore the factors influencing these metrics and discuss the trade-off between blind loyalty to roles and adherence to facts in RPAs.
- Score: 12.990119925990477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advanced role-playing capabilities of Large Language Models (LLMs) have paved the way for developing Role-Playing Agents (RPAs). However, existing benchmarks in social interaction such as HPD and SocialBench have not investigated hallucination and face limitations like poor generalizability and implicit judgments for character fidelity. To address these issues, we propose a generalizable, explicit and effective paradigm to unlock the interactive patterns in diverse worldviews. Specifically, we define the interactive hallucination based on stance transfer and construct a benchmark, SHARP, by extracting relations from a general commonsense knowledge graph and leveraging the inherent hallucination properties of RPAs to simulate interactions across roles. Extensive experiments validate the effectiveness and stability of our paradigm. Our findings further explore the factors influencing these metrics and discuss the trade-off between blind loyalty to roles and adherence to facts in RPAs.
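The stance-transfer definition above can be made concrete with a minimal sketch: an interactive hallucination is flagged when a role-playing agent flips its stance to agree with an interlocutor's non-factual claim. The function name, stance labels, and example are illustrative assumptions, not the paper's released code.

```python
# Hypothetical sketch of an interactive-hallucination check via stance
# transfer; labels and structure are assumed for illustration only.

def detect_interactive_hallucination(stance_before: str, stance_after: str,
                                     claim_is_factual: bool) -> bool:
    """Flag a hallucination: the role-playing agent transfers its stance
    toward an interlocutor's counter-factual claim."""
    flipped = stance_before != stance_after
    return flipped and not claim_is_factual

# Example: the agent first disagrees with a false claim about its worldview,
# then agrees after the interlocutor insists -- a stance transfer toward a
# non-factual claim counts as an interactive hallucination.
print(detect_interactive_hallucination("disagree", "agree",
                                       claim_is_factual=False))  # True
```

A stance flip toward a factual correction would not count, which mirrors the paper's discussed trade-off between blind loyalty to roles and adherence to facts.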
Related papers
- RoleBreak: Character Hallucination as a Jailbreak Attack in Role-Playing Systems [20.786294377706717]
Role-playing systems powered by large language models (LLMs) have become increasingly influential in emotional communication applications.
These systems are susceptible to character hallucinations, where the model deviates from predefined character roles and generates responses that are inconsistent with the intended persona.
This paper presents the first systematic analysis of character hallucination from an attack perspective, introducing the RoleBreak framework.
arXiv Detail & Related papers (2024-09-25T08:23:46Z) - Enhancing adversarial robustness in Natural Language Inference using explanations [41.46494686136601]
We cast the spotlight on the underexplored task of Natural Language Inference (NLI).
We validate the usage of natural language explanation as a model-agnostic defence strategy through extensive experimentation.
We research the correlation of widely used language generation metrics with human perception, in order for them to serve as a proxy towards robust NLI models.
arXiv Detail & Related papers (2024-09-11T17:09:49Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - Iterative Utility Judgment Framework via LLMs Inspired by Relevance in Philosophy [66.95501113584541]
Utility and topical relevance are critical measures in information retrieval.
We propose an Iterative utiliTy judgmEnt fraMework to promote each step of the cycle of Retrieval-Augmented Generation.
arXiv Detail & Related papers (2024-06-17T07:52:42Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - SocialBench: Sociality Evaluation of Role-Playing Conversational Agents [85.6641890712617]
Large language models (LLMs) have advanced the development of various AI conversational agents.
SocialBench is the first benchmark designed to evaluate the sociality of role-playing conversational agents at both individual and group levels.
We find that agents excelling at the individual level do not necessarily show proficiency at the group level.
arXiv Detail & Related papers (2024-03-20T15:38:36Z) - Discovery of the Hidden World with Large Language Models [95.58823685009727]
This paper presents Causal representatiOn AssistanT (COAT) that introduces large language models (LLMs) to bridge the gap.
LLMs are trained on massive observations of the world and have demonstrated great capability in extracting key information from unstructured data.
COAT also adopts CDs to find causal relations among the identified variables as well as to provide feedback to LLMs to iteratively refine the proposed factors.
arXiv Detail & Related papers (2024-02-06T12:18:54Z) - AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z) - Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z) - Understanding Robust Overfitting from the Feature Generalization Perspective [61.770805867606796]
Adversarial training (AT) constructs robust neural networks by incorporating adversarial perturbations into natural data.
It is plagued by the issue of robust overfitting (RO), which severely damages the model's robustness.
In this paper, we investigate RO from a novel feature generalization perspective.
arXiv Detail & Related papers (2023-10-01T07:57:03Z) - Robust Stance Detection: Understanding Public Perceptions in Social Media [15.460495567765362]
Stance detection identifies precise positions relative to a well-defined topic.
Traditional stance detection models often lag in performance when applied to new domains and topics.
A solution we present in this paper combines counterfactual data augmentation with contrastive learning to enhance the robustness of stance detection.
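The counterfactual-augmentation half of that approach can be sketched in a few lines: flip polarity cues in a training example and invert its stance label, producing a counterfactual pair for more robust training. The cue lexicon and label scheme below are toy assumptions, not the paper's actual pipeline, which additionally applies contrastive learning over such pairs.

```python
# Toy sketch of counterfactual data augmentation for stance detection;
# the polarity lexicon and labels are illustrative assumptions.

FLIP = {"support": "oppose", "oppose": "support",
        "good": "bad", "bad": "good"}

def counterfactual(text: str, label: str) -> tuple:
    """Swap polarity cues in the text and invert the stance label."""
    words = [FLIP.get(w, w) for w in text.split()]
    new_label = {"favor": "against", "against": "favor"}[label]
    return " ".join(words), new_label

print(counterfactual("I support the new policy", "favor"))
# ('I oppose the new policy', 'against')
```

In a contrastive setup, the original and its counterfactual form a hard negative pair, pushing the model to attend to stance-bearing cues rather than topic-specific surface features.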
arXiv Detail & Related papers (2023-09-26T18:19:51Z) - Better Zero-Shot Reasoning with Role-Play Prompting [10.90357246745529]
Role-play prompting consistently surpasses the standard zero-shot approach across most datasets.
This highlights its potential to augment the reasoning capabilities of large language models.
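The contrast between standard zero-shot prompting and role-play prompting can be illustrated with a small template sketch; the persona text and question are made-up examples, not prompts from the paper.

```python
# Illustrative comparison of zero-shot vs. role-play prompting; the
# persona wording and question are hypothetical examples.

def zero_shot_prompt(question: str) -> str:
    """A plain zero-shot prompt with no assigned persona."""
    return f"Q: {question}\nA:"

def role_play_prompt(question: str) -> str:
    """Prepend a role assignment before asking the same question."""
    persona = ("From now on, you are an excellent math teacher who walks "
               "students through problems step by step.")
    return f"{persona}\nQ: {question}\nA:"

q = "If a train travels 60 km in 40 minutes, what is its speed in km/h?"
print(role_play_prompt(q))
```

The only difference between the two conditions is the persona preamble, which is what allows the paper to attribute reasoning gains to the role assignment itself.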
arXiv Detail & Related papers (2023-08-15T11:08:30Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - DIDER: Discovering Interpretable Dynamically Evolving Relations [14.69985920418015]
This paper introduces DIDER, Discovering Interpretable Dynamically Evolving Relations, a generic end-to-end interaction modeling framework with intrinsic interpretability.
We evaluate DIDER on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-08-22T20:55:56Z) - Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are, respectively, speaking activity, support vector machines, meetings of 3-4 persons, and microphones with cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z) - Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z) - Interactions in information spread: quantification and interpretation using stochastic block models [3.5450828190071655]
In social networks, users' behavior results from the people they interact with, news in their feed, or trending topics.
Here, we propose a new model, the Interactive Mixed Membership Block Model (IMMSBM), which investigates the role of interactions between entities.
In inference tasks, taking interactions into account changes the predicted probability of an outcome by up to 150% on average, relative to non-interactive models.
arXiv Detail & Related papers (2020-04-09T14:22:10Z) - On the Sensory Commutativity of Action Sequences for Embodied Agents [2.320417845168326]
We study perception for embodied agents under the mathematical formalism of group theory.
We introduce the Sensory Commutativity Probability criterion which measures how much an agent's degree of freedom affects the environment.
We empirically illustrate how SCP and the commutative properties of action sequences can be used to learn about objects in the environment.
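The commutativity idea can be illustrated with a toy estimator: sample starting states, apply two action sequences in both orders, and measure how often the resulting observations match. The 1-D world, action set, and function names below are hypothetical illustrations, not the paper's formalism.

```python
# Toy estimate of a sensory-commutativity probability: how often do two
# action sequences yield the same observation in either order? The 1-D
# world with a wall at position 0 is an assumed example.

import random

def step(pos: int, action: str) -> int:
    """Move right freely; moving left is blocked by a wall at position 0."""
    if action == "right":
        return pos + 1
    if action == "left":
        return max(0, pos - 1)
    return pos

def scp(seq_a, seq_b, trials=1000, world=10):
    """Fraction of random starts where seq_a+seq_b and seq_b+seq_a
    lead to the same observed position."""
    commuted = 0
    for _ in range(trials):
        start = random.randint(0, world)
        ab = start
        for a in seq_a + seq_b:
            ab = step(ab, a)
        ba = start
        for a in seq_b + seq_a:
            ba = step(ba, a)
        commuted += (ab == ba)
    return commuted / trials

# "left" and "right" fail to commute only near the wall: starting at 0,
# left-then-right ends at 1 while right-then-left ends at 0.
print(scp(["left"], ["right"]))
```

A low commutativity score signals that the action interacts with structure in the environment (here, the wall), which is the kind of signal the criterion uses to learn about objects.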
arXiv Detail & Related papers (2020-02-13T16:58:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.