Mitigating Hallucination in Fictional Character Role-Play
- URL: http://arxiv.org/abs/2406.17260v2
- Date: Fri, 08 Nov 2024 23:11:36 GMT
- Title: Mitigating Hallucination in Fictional Character Role-Play
- Authors: Nafis Sadeq, Zhouhang Xie, Byungkyu Kang, Prarit Lamba, Xiang Gao, Julian McAuley
- Abstract summary: We focus on the evaluation and mitigation of hallucination in fictional character role-play.
We introduce a dataset with over 2,000 characters and 72,000 interviews, including 18,000 adversarial questions.
We propose RoleFact, a role-playing method that mitigates hallucination by modulating the influence of parametric knowledge.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Role-playing has wide-ranging applications in customer support, embodied agents, and computational social science. The influence of parametric world knowledge of large language models (LLMs) often causes role-playing characters to act out of character and to hallucinate about things outside the scope of their knowledge. In this work, we focus on the evaluation and mitigation of hallucination in fictional character role-play. We introduce a dataset with over 2,000 characters and 72,000 interviews, including 18,000 adversarial questions. We propose RoleFact, a role-playing method that mitigates hallucination by modulating the influence of parametric knowledge using a pre-calibrated confidence threshold. Experiments show that the proposed method improves the factual precision of generated responses by 18% for adversarial questions with a 44% reduction in temporal hallucination for time-sensitive interviews. The code and the dataset are available at https://github.com/NafisSadeq/rolefact.git.
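The repository above contains the actual implementation; purely as a hedged sketch of the core idea, confidence-gated claim filtering might look like the following. The names (AtomicClaim, filter_claims) and the threshold value are assumptions for illustration, not RoleFact's API.

```python
# Hypothetical sketch of confidence-gated claim filtering; names and the
# threshold value are illustrative, not RoleFact's actual API (see the repo).
from dataclasses import dataclass

# Assumed: a threshold calibrated offline on held-out character interviews.
CONFIDENCE_THRESHOLD = 0.7

@dataclass
class AtomicClaim:
    text: str          # one factual statement extracted from a draft response
    confidence: float  # model's self-assessed probability the claim holds

def filter_claims(claims: list[AtomicClaim]) -> list[AtomicClaim]:
    """Drop claims below the pre-calibrated confidence threshold.

    Low-confidence claims are treated as likely parametric-knowledge
    hallucinations and removed before the final response is rewritten.
    """
    return [c for c in claims if c.confidence >= CONFIDENCE_THRESHOLD]

# Usage sketch: an anachronistic, low-confidence claim is filtered out.
draft = [
    AtomicClaim("I studied potions under Professor Snape.", 0.92),
    AtomicClaim("I read about it on my smartphone.", 0.15),
]
print([c.text for c in filter_claims(draft)])
```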
Related papers
- RoleBreak: Character Hallucination as a Jailbreak Attack in Role-Playing Systems
Role-playing systems powered by large language models (LLMs) have become increasingly influential in emotional communication applications.
These systems are susceptible to character hallucinations, where the model deviates from predefined character roles and generates responses that are inconsistent with the intended persona.
This paper presents the first systematic analysis of character hallucination from an attack perspective, introducing the RoleBreak framework.
arXiv Detail & Related papers (2024-09-25T08:23:46Z)
- LLM Internal States Reveal Hallucination Risk Faced With a Query
Humans have a self-awareness process that allows us to recognize what we don't know when faced with queries.
This paper investigates whether Large Language Models can estimate their own hallucination risk before response generation.
Using a probing estimator that leverages LLM self-assessment, the method achieves an average hallucination estimation accuracy of 84.32% at run time (a schematic probe sketch follows this entry).
arXiv Detail & Related papers (2024-07-03T17:08:52Z)
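As a hedged illustration of the probing idea above (not the paper's exact setup), one can fit a simple linear probe that maps a query-time hidden state to a hallucination-risk score. The probe type and stand-in data below are assumptions.

```python
# Hypothetical probe over LLM hidden states for hallucination-risk estimation.
# The probe architecture and the stand-in data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_probe(hidden_states: np.ndarray, hallucinated: np.ndarray) -> LogisticRegression:
    """Fit a linear probe: query-time hidden state -> P(response hallucinates).

    hidden_states: (n_queries, hidden_dim) activations captured before generation.
    hallucinated:  (n_queries,) binary labels from a factuality check on responses.
    """
    probe = LogisticRegression(max_iter=1000)
    probe.fit(hidden_states, hallucinated)
    return probe

# Usage sketch: score a new query's hidden state before generating.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))           # stand-in for real activations
y = rng.integers(0, 2, size=200)         # stand-in for real labels
probe = train_probe(X, y)
risk = probe.predict_proba(X[:1])[0, 1]  # estimated hallucination risk
print(f"hallucination risk: {risk:.2f}")
```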
- Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data
We propose to enhance role-playing language models (RPLMs) via personality-indicative data.
Specifically, we leverage questions from psychological scales and distill advanced role-playing agents (RPAs) to generate dialogues that grasp the minds of characters.
Experimental results validate that RPLMs trained with our dataset exhibit advanced role-playing capabilities for both general and personality-related evaluations.
arXiv Detail & Related papers (2024-06-27T06:24:00Z)
- TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
We introduce TimeChara, a new benchmark designed to evaluate point-in-time character hallucination in role-playing LLMs.
We propose Narrative-Experts, a method that decomposes the reasoning steps and utilizes narrative experts to reduce point-in-time character hallucinations effectively.
arXiv Detail & Related papers (2024-05-28T10:19:18Z)
- Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
We introduce Ditto, a self-alignment method for role-play.
This method creates a role-play training set comprising 4,000 characters, surpassing the scale of currently available datasets by tenfold.
We present the first comprehensive cross-supervision alignment experiment in the role-play domain.
arXiv Detail & Related papers (2024-01-23T03:56:22Z)
- A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
Large Language Models (LLMs) continue to advance in their ability to write human-like text.
A key challenge remains their tendency to hallucinate: generating content that appears factual but is ungrounded.
This paper presents a survey of over 32 techniques developed to mitigate hallucination in LLMs.
arXiv Detail & Related papers (2024-01-02T17:56:30Z)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
Hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method mitigates hallucinations by 44.6% in relative terms while maintaining competitive performance compared to LLaVA (a schematic cross-check sketch follows this entry).
arXiv Detail & Related papers (2023-11-22T04:52:58Z)
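As a loose sketch of the cross-checking paradigm named above, a claim can be kept only when a majority of independently sampled verification answers support it. The function name and majority-vote rule below are illustrative assumptions; HalluciDoctor's actual pipeline operates on visual instruction data.

```python
# Loose sketch of a cross-checking consistency vote for claim verification.
# The majority-vote rule and all names here are illustrative assumptions.

def cross_check(claim: str, answers: list[str]) -> bool:
    """Keep a claim only if most independently sampled answers support it.

    `answers` would come from re-asking the model (or other models) a
    verification question about the claim, e.g. "Is it true that {claim}?"
    """
    votes = sum(1 for a in answers if a.strip().lower().startswith("yes"))
    return votes > len(answers) / 2

# Usage sketch with stand-in verification answers.
answers = ["Yes, the image shows a dog.", "Yes.", "No, there is no dog."]
print(cross_check("the image shows a dog", answers))  # True (2 of 3 agree)
```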
- Better Zero-Shot Reasoning with Role-Play Prompting
Role-play prompting consistently surpasses the standard zero-shot approach across most datasets.
This highlights its potential to augment the reasoning capabilities of large language models (a schematic prompt contrast follows this entry).
arXiv Detail & Related papers (2023-08-15T11:08:30Z)
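As an assumed illustration of the contrast studied above, role-play prompting prepends a persona instruction to an otherwise standard zero-shot query; the paper's exact prompt wording may differ.

```python
# Illustrative contrast between standard zero-shot and role-play prompting.
# The persona wording is an assumption; the paper's exact prompts may differ.

question = (
    "A juggler can juggle 16 balls. Half of the balls are golf balls, and "
    "half of the golf balls are blue. How many blue golf balls are there?"
)

# Standard zero-shot prompt: just the question.
zero_shot_prompt = f"Q: {question}\nA:"

# Role-play prompt: the model first adopts a helpful expert persona.
role_play_prompt = (
    "From now on, you are an excellent math teacher and always teach your "
    "students math problems correctly.\n"
    f"Q: {question}\nA:"
)

print(role_play_prompt)
```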
- Personality Understanding of Fictional Characters during Book Reading
We present the first labeled dataset PersoNet for this problem.
Our novel annotation strategy involves annotating user notes from online reading apps as a proxy for the original books.
Experiments and human studies indicate that our dataset construction is both efficient and accurate.
arXiv Detail & Related papers (2023-05-17T12:19:11Z)