OlaMind: Towards Human-Like and Hallucination-Safe Customer Service for Retrieval-Augmented Dialogue
- URL: http://arxiv.org/abs/2510.22143v1
- Date: Sat, 25 Oct 2025 03:29:55 GMT
- Authors: Tianhong Gao, Jundong Shen, Bei Shi, Jiapeng Wang, Ying Ju, Junfeng Yao, Jiao Ran, Yong Zhang, Lin Dong, Huiyu Yu, Tingting Ye
- Abstract summary: We introduce OlaMind, a human-like and hallucination-safe framework for retrieval-augmented dialogue. Our method significantly enhances human-likeness and naturalness while effectively mitigating hallucinations and critical business risks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent customer service (ICS) systems via retrieval-augmented generation (RAG) have been widely adopted in Web-based domains such as social platforms and e-commerce, achieving remarkable improvements in automation and efficiency. However, notable limitations remain: these systems are prone to hallucinations and often generate rigid, mechanical responses, which can introduce business risks and undermine the user experience, especially in Web-based customer-service interactions under RAG scenarios. In this paper, we introduce OlaMind, a human-like and hallucination-safe customer service framework for retrieval-augmented dialogue. Specifically, it first leverages a Learn-to-Think stage to learn reasoning processes and response strategies from human experts, and then employs a Learn-to-Respond stage that performs cold-start supervised fine-tuning (SFT) combined with reinforcement learning (RL) for basic-to-hard self-refinement. Our method significantly enhances human-likeness and naturalness while effectively mitigating hallucinations and critical business risks. We have conducted large-scale online A/B experiments in an industry-level social customer service setting, and extensive experimental results show that OlaMind achieves significant cumulative relative improvements, with intelligent resolution rates of +28.92%/+18.42% and human takeover rates of -6.08%/-7.12% in community-support/livestream-interaction scenarios, respectively, highlighting its consistent effectiveness across diverse real-world applications. The code and data will be publicly available.
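The retrieval-augmented, think-then-respond flow described in the abstract can be sketched in miniature. Everything below is an illustrative assumption, not OlaMind's actual implementation: the paper trains an LLM via Learn-to-Think and Learn-to-Respond, whereas this toy uses a keyword-overlap retriever and a hard-coded grounding rule purely to show the shape of a hallucination-safe response loop (answer only from retrieved evidence, otherwise escalate to a human).

```python
# Hypothetical knowledge base for a social-platform support bot.
KNOWLEDGE_BASE = [
    "Refunds are processed within 5 business days of approval.",
    "Livestream gifts cannot be transferred between accounts.",
    "Password resets are available under Settings > Security.",
]

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Rank KB entries by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    scored = [(len(q & set(doc.lower().split())), doc) for doc in KNOWLEDGE_BASE]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def respond(query: str) -> str:
    """'Think' by inspecting evidence, then respond only from retrieved text."""
    evidence = retrieve(query)
    if not evidence:
        # Hallucination-safe fallback: hand over to a human instead of guessing.
        return "Let me connect you with a human agent for that."
    return f"Happy to help! {evidence[0]}"
```

The fallback branch is the analogue of the human-takeover path measured in the paper's A/B tests; a trained policy would decide this far more gracefully than keyword overlap, but the grounding constraint is the same.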
Related papers
- Higher Satisfaction, Lower Cost: A Technical Report on How LLMs Revolutionize Meituan's Intelligent Interaction Systems [67.18731675163589]
We introduce WOWService, an intelligent interaction system tailored for industrial applications. With the integration of LLMs and multi-agent architectures, WOWService enables autonomous task management and collaborative problem-solving. WOWService is deployed on the Meituan App, achieving significant gains in key metrics.
arXiv Detail & Related papers (2025-10-15T08:35:51Z) - Exploring the Impact of Personality Traits on Conversational Recommender Systems: A Simulation with Large Language Models [70.180385882195]
This paper introduces a personality-aware user simulation for Conversational Recommender Systems (CRSs). The user agent induces customizable personality traits and preferences, while the system agent possesses the persuasion capability to simulate realistic interaction in CRSs. Experimental results demonstrate that state-of-the-art LLMs can effectively generate diverse user responses aligned with specified personality traits.
arXiv Detail & Related papers (2025-04-09T13:21:17Z) - Reasoning LLMs for User-Aware Multimodal Conversational Agents [3.533721662684487]
Personalization in social robotics is critical for fostering effective human-robot interactions. This paper proposes a novel framework called USER-LLM R1 for a user-aware conversational agent. Our approach integrates chain-of-thought (CoT) reasoning models with vision-language models to iteratively infer user preferences.
arXiv Detail & Related papers (2025-04-02T13:00:17Z) - Towards Recommender Systems LLMs Playground (RecSysLLMsP): Exploring Polarization and Engagement in Simulated Social Networks [6.813586966214873]
This paper introduces a novel simulation framework leveraging Large Language Models (LLMs) to explore the impacts of different content recommendation setups on user engagement and polarization in social networks. By creating diverse AI agents with descriptive, static, and dynamic attributes, we assess their autonomous behaviour across three scenarios: Plurality, Balanced, and Similarity. Our study emphasizes the need for a careful balance in recommender system designs to enhance user satisfaction while mitigating societal polarization.
arXiv Detail & Related papers (2025-01-29T14:23:34Z) - Evaluating Cultural and Social Awareness of LLM Web Agents [113.49968423990616]
We introduce CASA, a benchmark designed to assess large language models' sensitivity to cultural and social norms. Our approach evaluates LLM agents' ability to detect and appropriately respond to norm-violating user queries and observations. Experiments show that current LLMs perform significantly better in non-agent environments.
arXiv Detail & Related papers (2024-10-30T17:35:44Z) - RAG based Question-Answering for Contextual Response Prediction System [0.4660328753262075]
Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks.
Retrieval Augmented Generation (RAG) emerges as a promising technique to address this challenge.
This paper introduces an end-to-end framework that employs LLMs with RAG capabilities for industry use cases.
arXiv Detail & Related papers (2024-09-05T17:14:23Z) - Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement [79.2400720115588]
We introduce Persona-DB, a simple yet effective framework consisting of a hierarchical construction process to improve generalization across task contexts. In the evaluation of response prediction, Persona-DB demonstrates superior context efficiency in maintaining accuracy with a significantly reduced retrieval size. Our experiments also indicate a marked improvement of over 10% under cold-start scenarios, when users have extremely sparse data.
arXiv Detail & Related papers (2024-02-16T20:20:43Z) - Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existing social network, along with graph-based propagation to capture social dynamics. Our method surpasses the existing state of the art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z) - Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems [14.030576576114818]
Off-policy reinforcement learning has been a driving force behind state-of-the-art conversational AIs. In large-scale commercial settings, it is often challenging to balance policy improvements with experience continuity.
We propose a method for curating and leveraging high-precision samples sourced from historical regression incident reports.
arXiv Detail & Related papers (2023-05-17T19:22:24Z) - Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)