CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue
System
- URL: http://arxiv.org/abs/2106.03530v2
- Date: Tue, 8 Jun 2021 03:07:02 GMT
- Title: CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue
System
- Authors: Etsuko Ishii, Yan Xu, Genta Indra Winata, Zhaojiang Lin, Andrea
Madotto, Zihan Liu, Peng Xu, Pascale Fung
- Abstract summary: In DialDoc21 competition, our system achieved 74.95 F1 score and 60.74 Exact Match score in subtask 1, and 37.72 SacreBLEU score in subtask 2.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information-seeking dialogue systems, including knowledge identification and
response generation, aim to respond to users with fluent, coherent, and
informative responses based on users' needs. To tackle this challenge,
we utilize data augmentation methods and several training techniques with the
pre-trained language models to learn a general pattern of the task and thus
achieve promising performance. In DialDoc21 competition, our system achieved
74.95 F1 score and 60.74 Exact Match score in subtask 1, and 37.72 SacreBLEU
score in subtask 2. Empirical analysis is provided to explain the effectiveness
of our approaches.
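The abstract reports token-level F1 and Exact Match for subtask 1 (knowledge identification) and SacreBLEU for subtask 2 (response generation). As a minimal sketch of how the first two span-matching metrics are typically computed in such shared tasks (this is not the official DialDoc21 scorer, and the normalization shown here is a simplified assumption):

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """Exact Match: 1.0 if the normalized strings are identical, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over
    the multiset of tokens shared by prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)  # multiset intersection
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a predicted span that shares two of three tokens with the reference scores F1 = 2/3 while Exact Match is 0, which is why F1 (74.95) exceeds Exact Match (60.74) in the reported results.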
Related papers
- First Place Solution to the CVPR'2023 AQTC Challenge: A
Function-Interaction Centric Approach with Spatiotemporal Visual-Language
Alignment [15.99008977852437]
Affordance-Centric Question-driven Task Completion (AQTC) has been proposed to acquire knowledge from videos and provide users with comprehensive and systematic instructions.
Existing methods have neglected the necessity of aligning visual and linguistic signals, as well as the crucial interactional information between humans and objects.
We propose to combine large-scale pre-trained vision- and video-language models, which contribute stable and reliable multimodal data.
arXiv Detail & Related papers (2023-06-23T09:02:25Z) - Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision [22.249113574918034]
Most existing task-oriented dialog (TOD) systems track dialog states in terms of slots and values and use them to query a database to get relevant knowledge to generate responses.
In real-life applications, user utterances are noisier, making it harder to accurately track dialog states and retrieve relevant knowledge.
We therefore propose a retrieval-based method to enhance knowledge selection in TOD systems, which outperforms the traditional database query method on real-life dialogs.
arXiv Detail & Related papers (2023-05-22T16:29:20Z) - Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9
and DSTC10 [40.05826687535019]
This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges.
In both iterations the task consists of three subtasks: first detect whether the current turn is knowledge seeking, second select a relevant knowledge document, and third generate a response grounded on the selected document.
arXiv Detail & Related papers (2023-04-14T12:46:29Z) - FCC: Fusing Conversation History and Candidate Provenance for Contextual
Response Ranking in Dialogue Systems [53.89014188309486]
We present a flexible neural framework that can integrate contextual information from multiple channels.
We evaluate our model on the MSDialog dataset widely used for evaluating conversational response ranking tasks.
arXiv Detail & Related papers (2023-03-31T23:58:28Z) - Socratic Pretraining: Question-Driven Pretraining for Controllable
Summarization [89.04537372465612]
Socratic pretraining is a question-driven, unsupervised pretraining objective designed to improve controllability in summarization tasks.
Our results show that Socratic pretraining cuts task-specific labeled data requirements in half.
arXiv Detail & Related papers (2022-12-20T17:27:10Z) - CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog
Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z) - Retrieval-Free Knowledge-Grounded Dialogue Response Generation with
Adapters [52.725200145600624]
We propose KnowExpert to bypass the retrieval process by injecting prior knowledge into the pre-trained language models with lightweight adapters.
Experimental results show that KnowExpert performs comparably with the retrieval-based baselines.
arXiv Detail & Related papers (2021-05-13T12:33:23Z) - Learning to Retrieve Entity-Aware Knowledge and Generate Responses with
Copy Mechanism for Task-Oriented Dialogue Systems [43.57597820119909]
Task-oriented conversational modeling with unstructured knowledge access was organized as track 1 of the 9th Dialogue System Technology Challenges (DSTC 9).
This challenge can be separated into three subtasks, (1) knowledge-seeking turn detection, (2) knowledge selection, and (3) knowledge-grounded response generation.
We use pre-trained language models, ELECTRA and RoBERTa, as our base encoder for different subtasks.
arXiv Detail & Related papers (2020-12-22T11:36:37Z) - TREC CAsT 2019: The Conversational Assistance Track Overview [34.65827453762031]
The Conversational Assistance Track (CAsT) is a new track for TREC 2019 to facilitate Conversational Information Seeking (CIS) research.
The document corpus is 38,426,252 passages from the TREC Complex Answer Retrieval (CAR) and Microsoft MAchine Reading COmprehension (MARCO) datasets.
This year 21 groups submitted a total of 65 runs using varying methods for conversational query understanding and ranking.
arXiv Detail & Related papers (2020-03-30T16:58:04Z) - Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available.
We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
With only 1/8 training data, our model can achieve the state-of-the-art performance and generalize well on out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.