Related papers: FACTS About Building Retrieval Augmented Generation-based Chatbots

URL: http://arxiv.org/abs/2407.07858v1
Date: Wed, 10 Jul 2024 17:20:59 GMT
Title: FACTS About Building Retrieval Augmented Generation-based Chatbots
Authors: Rama Akkiraju, Anbang Xu, Deepak Bora, Tan Yu, Lu An, Vishal Seth, Aaditya Shukla, Pritam Gundecha, Hridhay Mehta, Ashwin Jha, Prithvi Raj, Abhinav Balasubramanian, Murali Maram, Guru Muthusamy, Shivakesh Reddy Annepally, Sidney Knowles, Min Du, Nick Burnett, Sean Javiya, Ashok Marannan, Mamta Kumari, Surbhi Jha, Ethan Dereszenski, Anupam Chakraborty, Subhash Ranjan, Amina Terfai, Anoop Surya, Tracey Mercer, Vinodh Kumar Thanigachalam, Tamar Bar, Sanjana Krishnan, Samy Kilaru, Jasmine Jaksic, Nave Algarici, Jacob Liberman, Joey Conway, Sonu Nayyar, Justin Boitano
Abstract summary: We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs.
Score: 10.437472320378339
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like Langchain and Llamaindex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering. This includes fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots: for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.

Related papers

A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets [4.311626046942916]
We present an automated transformer-based approach to augment software engineering datasets. We evaluate the impact of using the augmentation approach on the Rasa NLU's performance using three software engineering datasets.
arXiv Detail & Related papers (2024-07-16T17:48:44Z)
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts. We leverage natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface.
arXiv Detail & Related papers (2024-06-28T08:28:38Z)
A Complete Survey on LLM-based AI Chatbots [46.18523139094807]
The past few decades have witnessed an upsurge in data, forming the foundation for data-hungry, learning-based AI technology. Conversational agents, often referred to as AI chatbots, rely heavily on such data to train large language models (LLMs) and generate new content (knowledge) in response to user prompts. This paper presents a complete survey of the evolution and deployment of LLM-based chatbots in various sectors.
arXiv Detail & Related papers (2024-06-17T09:39:34Z)
LEARN: Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application [54.984348122105516]
We propose an Llm-driven knowlEdge Adaptive RecommeNdation (LEARN) framework that synergizes open-world knowledge with collaborative knowledge.
arXiv Detail & Related papers (2024-05-07T04:00:30Z)
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STaRK, a large-scale semi-structured retrieval benchmark on Textual and Relational Knowledge Bases. Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine. We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z)
Search-Engine-augmented Dialogue Response Generation with Cheaply Supervised Query Production [98.98161995555485]
We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation. As the core module, a query producer is used to generate queries from a dialogue context to interact with a search engine. Experiments show that our query producer can achieve R@1 and R@5 rates of 62.4% and 74.8% for retrieving gold knowledge.
arXiv Detail & Related papers (2023-02-16T01:58:10Z)
AI Based Chatbot: An Approach of Utilizing On Customer Service Assistance [0.0]
The project aims to develop a system that can handle complex questions and output logical answers. The ultimate goal is to give high-quality results (answers) based on user input (questions).
arXiv Detail & Related papers (2022-06-18T00:59:10Z)
KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
Conversations with Search Engines: SERP-based Conversational Response Generation [77.1381159789032]
We create a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines. We also develop a state-of-the-art pipeline for conversations with search engines, Conversations with Search Engines (CaSE), using this dataset. CaSE enhances the state-of-the-art by introducing a supporting token identification module and a prior-aware pointer generator.
arXiv Detail & Related papers (2020-04-29T13:07:53Z)
Building chatbots from large scale domain-specific knowledge bases: challenges and opportunities [4.129225533930966]
We describe the challenges and lessons learned from building a large scale virtual assistant for understanding and responding to equipment-related complaints. We show through evaluation on a real dataset that the proposed framework, compared to popular off-the-shelf ones, scales better with a large volume of entities, being up to 30% more accurate.
arXiv Detail & Related papers (2019-12-31T22:40:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.