Related papers: RAG based Question-Answering for Contextual Response Prediction System

URL: http://arxiv.org/abs/2409.03708v2
Date: Fri, 6 Sep 2024 14:18:20 GMT
Title: RAG based Question-Answering for Contextual Response Prediction System
Authors: Sriram Veturi, Saurabh Vaichal, Reshma Lal Jagadheesh, Nafis Irtiza Tripto, Nian Yan,
Abstract summary: Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks, but they need access to a comprehensive knowledge base to avoid hallucinations when answering specific customer queries. Retrieval Augmented Generation (RAG) emerges as a promising technique to address this challenge. This paper introduces an end-to-end framework that employs LLMs with RAG capabilities for industry use cases.
Score: 0.4660328753262075
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks, including their potential as effective question-answering systems. However, to provide precise and relevant information in response to specific customer queries in industry settings, LLMs require access to a comprehensive knowledge base to avoid hallucinations. Retrieval Augmented Generation (RAG) emerges as a promising technique to address this challenge. Yet, developing an accurate question-answering framework for real-world applications using RAG entails several challenges: 1) data availability issues, 2) evaluating the quality of generated content, and 3) the costly nature of human evaluation. In this paper, we introduce an end-to-end framework that employs LLMs with RAG capabilities for industry use cases. Given a customer query, the proposed system retrieves relevant knowledge documents and leverages them, along with previous chat history, to generate response suggestions for customer service agents in the contact centers of a major retail company. Through comprehensive automated and human evaluations, we show that this solution outperforms the current BERT-based algorithms in accuracy and relevance. Our findings suggest that RAG-based LLMs can be an excellent support to human customer service representatives by lightening their workload.
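
Illustrative sketch (not from the paper): the abstract describes a flow in which a customer query is used to retrieve relevant knowledge documents, which are then combined with the previous chat history to produce a response suggestion. The Python below is a minimal sketch of that flow under stated assumptions. The function names, the toy knowledge base, the prompt layout, and the bag-of-words cosine retriever are all hypothetical stand-ins chosen for brevity; the abstract does not specify the authors' retriever, prompt, or model. The sketch stops at prompt construction and does not call an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base; the paper's actual documents are not given.
KNOWLEDGE_BASE = [
    "Returns are accepted within 30 days with a receipt.",
    "Standard shipping takes five business days.",
    "Gift cards cannot be exchanged for cash.",
]


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (assumed retriever)."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, history: list[str], documents: list[str], k: int = 2) -> str:
    """Assemble retrieved documents, chat history, and the query into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    past = "\n".join(history)
    return (
        "Knowledge documents:\n" + context + "\n\n"
        "Chat history:\n" + past + "\n\n"
        "Customer query: " + query + "\n"
        "Suggested agent response:"
    )


if __name__ == "__main__":
    history = ["Customer: Hi, I placed an order yesterday.", "Agent: Thanks, how can I help?"]
    print(build_prompt("How long does shipping take?", history, KNOWLEDGE_BASE, k=1))
```

In a real deployment the bag-of-words scorer would likely be replaced by a dense or hybrid retriever, and the assembled prompt would be sent to an LLM whose output is shown to the customer service agent as a suggestion.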

Related papers

Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners. In this work, we introduce a new task paradigm: proactive information gathering. We design a scalable framework that generates partially specified, real-world tasks, masking key information. Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
Federated In-Context Learning: Iterative Refinement for Improved Answer Quality [62.72381208029899]
In-context learning (ICL) enables language models to generate responses without modifying their parameters by leveraging examples provided in the input. We propose Federated In-Context Learning (Fed-ICL), a general framework that enhances ICL through an iterative, collaborative process. Fed-ICL progressively refines responses by leveraging multi-round interactions between clients and a central server, improving answer quality without the need to transmit model parameters.
arXiv Detail & Related papers (2025-06-09T05:33:28Z)
LLMs Can Generate a Better Answer by Aggregating Their Own Responses [83.69632759174405]
Large Language Models (LLMs) have shown remarkable capabilities across tasks, yet they often require additional prompting techniques when facing complex problems. We argue this limitation stems from the fact that common LLM post-training procedures lack explicit supervision for discriminative judgment tasks. We propose Generative Self-Aggregation (GSA), a novel prompting method that improves answer quality without requiring the model's discriminative capabilities.
arXiv Detail & Related papers (2025-03-06T05:25:43Z)
HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks [50.871243190126826]
HawkBench is a human-labeled, multi-domain benchmark designed to rigorously assess RAG performance. By stratifying tasks based on information-seeking behaviors, HawkBench provides a systematic evaluation of how well RAG systems adapt to diverse user needs.
arXiv Detail & Related papers (2025-02-19T06:33:39Z)
Knowledge Retrieval Based on Generative AI [4.9328530417790954]
This study develops a question-answering system based on Retrieval-Augmented Generation (RAG) using Chinese Wikipedia and Lawbank as retrieval sources. The system employs BGE-M3 for dense vector retrieval to obtain highly relevant search results and BGE-reranker to reorder these results based on query relevance.
arXiv Detail & Related papers (2025-01-08T17:29:46Z)
A Survey of Query Optimization in Large Language Models [10.255235456427037]
RAG mitigates the limitations of Large Language Models by dynamically retrieving and leveraging up-to-date relevant information. QO has emerged as a critical element, playing a pivotal role in determining the effectiveness of RAG's retrieval stage.
arXiv Detail & Related papers (2024-12-23T13:26:04Z)
Unanswerability Evaluation for Retrieval Augmented Generation [74.3022365715597]
UAEval4RAG is a framework designed to evaluate whether RAG systems can handle unanswerable queries effectively. We define a taxonomy with six unanswerable categories, and UAEval4RAG automatically synthesizes diverse and challenging queries.
arXiv Detail & Related papers (2024-12-16T19:11:55Z)
MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification [5.666070277424383]
MAG-V is a framework to generate a dataset of questions that mimic customer queries. Our synthetic data can improve agent performance on actual customer queries.
arXiv Detail & Related papers (2024-11-28T19:36:11Z)
mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA [78.45521005703958]
Multimodal Retrieval-Augmented Generation (mRAG) is naturally introduced to provide MLLMs with comprehensive and up-to-date knowledge. We propose a novel framework called Retrieval-Reflection-Augmented Generation (mR$^2$AG), which achieves adaptive retrieval and useful information localization. mR$^2$AG significantly outperforms state-of-the-art MLLMs on INFOSEEK and Encyclopedic-VQA.
arXiv Detail & Related papers (2024-11-22T16:15:50Z)
AGENT-CQ: Automatic Generation and Evaluation of Clarifying Questions for Conversational Search with LLMs [53.6200736559742]
AGENT-CQ consists of two stages: a generation stage and an evaluation stage. CrowdLLM simulates human crowdsourcing judgments to assess generated questions and answers. Experiments on the ClariQ dataset demonstrate CrowdLLM's effectiveness in evaluating question and answer quality.
arXiv Detail & Related papers (2024-10-25T17:06:27Z)
An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms [62.878616839799776]
We propose SynthRAG, an innovative framework designed to enhance Question Answering (QA) performance. SynthRAG improves on conventional models by employing adaptive outlines for dynamic content structuring. An online deployment on the Zhihu platform revealed that SynthRAG's answers achieved notable user engagement.
arXiv Detail & Related papers (2024-10-23T09:14:57Z)
Beyond-RAG: Question Identification and Answer Generation in Real-Time Conversations [0.0]
In customer contact centers, human agents often struggle with long average handling times (AHT). We propose a decision support system that can look beyond RAG by first identifying customer questions in real time. If the query matches an FAQ, the system retrieves the answer directly from the FAQ database; otherwise, it generates answers via RAG.
arXiv Detail & Related papers (2024-10-14T04:06:22Z)
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely [8.507599833330346]
Large language models (LLMs) augmented with external data have demonstrated remarkable capabilities in completing real-world tasks. Retrieval-Augmented Generation (RAG) and fine-tuning are gaining increasing attention and widespread application. However, the effective deployment of data-augmented LLMs across various specialized fields presents substantial challenges.
arXiv Detail & Related papers (2024-09-23T11:20:20Z)
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation [19.312330150540912]
An emerging application is using Large Language Models (LLMs) to enhance retrieval-augmented generation (RAG) capabilities. We propose FRAMES, a high-quality evaluation dataset designed to test LLMs' ability to provide factual responses. We present baseline results demonstrating that even state-of-the-art LLMs struggle with this task, achieving 0.40 accuracy with no retrieval.
arXiv Detail & Related papers (2024-09-19T17:52:07Z)
Evaluating ChatGPT on Nuclear Domain-Specific Data [0.0]
This paper examines the application of ChatGPT, a large language model (LLM), for question-and-answer (Q&A) tasks in the highly specialized field of nuclear data. The primary focus is on evaluating ChatGPT's performance on a curated test dataset. The findings underscore the improvement in performance when incorporating a RAG pipeline in an LLM.
arXiv Detail & Related papers (2024-08-26T08:17:42Z)
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models [71.25225058845324]
Large Language Models (LLMs) have demonstrated revolutionary abilities in language understanding and generation. Retrieval-Augmented Generation (RAG) can offer reliable and up-to-date external knowledge. RA-LLMs have emerged to harness external and authoritative knowledge bases, rather than relying on the model's internal knowledge.
arXiv Detail & Related papers (2024-05-10T02:48:45Z)
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities. We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework. We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
Large language models (LLM) have shown impressive general intelligence and human-like capabilities. We conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
arXiv Detail & Related papers (2023-06-09T11:31:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.