The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG)
- URL: http://arxiv.org/abs/2405.13084v2
- Date: Sun, 15 Sep 2024 15:03:59 GMT
- Title: The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG)
- Authors: Yucheng Cai, Si Chen, Yuxuan Wu, Yi Huang, Junlan Feng, Zhijian Ou
- Abstract summary: The challenge builds upon the MobileCS2 dataset, a real-life customer service dataset with nearly 3000 high-quality dialogs.
We define two tasks, track 1 for knowledge retrieval and track 2 for response generation, which are core research questions in dialog systems with RAG.
We build baseline systems for the two tracks and design metrics to measure whether the systems can perform accurate retrieval and generate informative and coherent responses.
- Score: 23.849336345191556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, research interest has increasingly focused on retrieval augmented generation (RAG) to mitigate hallucination in large language models (LLMs). Following this trend, we launch the FutureDial-RAG challenge at SLT 2024, which aims at promoting the study of RAG for dialog systems. The challenge builds upon the MobileCS2 dataset, a real-life customer service dataset with nearly 3000 high-quality dialogs containing annotations for knowledge base queries and the corresponding results. Over the dataset, we define two tasks, track 1 for knowledge retrieval and track 2 for response generation, which are core research questions in dialog systems with RAG. We build baseline systems for the two tracks and design metrics to measure whether the systems can perform accurate retrieval and generate informative and coherent responses. The baseline results show that it is very challenging to perform well on the two tasks, which encourages the participating teams and the community to study how to make better use of RAG for real-life dialog systems.
Related papers
- Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation [58.799397354312596]
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, particularly in system 1 tasks.
Research on System2-to-System1 methods has surged recently, exploring System 2 reasoning knowledge via inference-time computation.
In this paper, we focus on code generation, which is a representative System 2 task, and identify two primary challenges.
arXiv Detail & Related papers (2025-02-18T03:20:50Z)
- GeAR: Graph-enhanced Agent for Retrieval-augmented Generation [12.805134136960998]
By design, conventional sparse or dense retrievers face challenges in multi-hop retrieval scenarios.
We present GeAR, which advances RAG performance through two key innovations: (i) graph expansion, which enhances any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates graph expansion.
Our evaluation demonstrates GeAR's superior retrieval performance on three multi-hop question answering datasets.
arXiv Detail & Related papers (2024-12-24T13:45:22Z)
- Adaptive Retrieval-Augmented Generation for Conversational Systems [25.35137570524043]
This study investigates the need for each turn of system response to be augmented with external knowledge.
By leveraging human judgements on the binary choice of adaptive augmentation, we develop RAGate, a gating model.
arXiv Detail & Related papers (2024-07-31T16:04:03Z)
- Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track [51.25144287084172]
It is crucial to have an arena to build, test, visualize, and systematically evaluate RAG-based search systems.
We propose the TREC 2024 RAG Track to foster innovation in evaluating RAG systems.
arXiv Detail & Related papers (2024-06-24T17:37:52Z)
- DuetRAG: Collaborative Retrieval-Augmented Generation [57.440772556318926]
A Collaborative Retrieval-Augmented Generation framework, DuetRAG, is proposed.
Its bootstrapping philosophy is to simultaneously integrate domain fine-tuning and RAG models.
arXiv Detail & Related papers (2024-05-12T09:48:28Z)
- Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models [21.115495457454365]
uRAG is a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems.
We build a large-scale experimentation ecosystem consisting of 18 RAG systems that engage in training and 18 unknown RAG systems that use uRAG as new users of the search engine.
arXiv Detail & Related papers (2024-04-30T19:51:37Z)
- Response Enhanced Semi-supervised Dialogue Query Generation [40.17161986495854]
We propose a semi-supervised learning framework -- SemiDQG -- to improve model performance with unlabeled conversations.
We first apply a similarity-based query selection strategy to select high-quality RA-generated pseudo queries.
We adopt the REINFORCE algorithm to further enhance QP, with RA-provided rewards as fine-grained training signals.
arXiv Detail & Related papers (2023-12-20T02:19:54Z)
- Dual Semantic Knowledge Composed Multimodal Dialog Systems [114.52730430047589]
We propose a novel multimodal task-oriented dialog system named MDS-S2.
It acquires context-related attribute and relation knowledge from the knowledge base.
We also devise a set of latent query variables to distill the semantic information from the composed response representation.
arXiv Detail & Related papers (2023-05-17T06:33:26Z)
- Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset [52.22314870976088]
The SereTOD challenge releases the MobileCS dataset, which consists of real-world dialog transcripts between real users and customer-service staff from China Mobile.
Based on the MobileCS dataset, the SereTOD challenge has two tasks, not only evaluating the construction of the dialogue system itself, but also examining information extraction from dialog transcripts.
This paper mainly presents a baseline study of the two tasks with the MobileCS dataset.
arXiv Detail & Related papers (2022-09-27T15:30:43Z)
- Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts to avoid designing specific dialogue act representations.
We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, two datasets of multi-domain dialogues, in comparison with a word-level E2E model trained with RL, LaRL, and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.