Adapting Document-Grounded Dialog Systems to Spoken Conversations using
Data Augmentation and a Noisy Channel Model
- URL: http://arxiv.org/abs/2112.08844v1
- Date: Thu, 16 Dec 2021 12:51:52 GMT
- Title: Adapting Document-Grounded Dialog Systems to Spoken Conversations using
Data Augmentation and a Noisy Channel Model
- Authors: David Thulke, Nico Daheim, Christian Dugast, Hermann Ney
- Abstract summary: This paper summarizes our submission to Task 2 of the 10th Dialog System Technology Challenge (DSTC10) "Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations".
Similar to the previous year's iteration, the task consists of three subtasks: detecting whether a turn is knowledge seeking, selecting the relevant knowledge document and finally generating a grounded response.
Our best system achieved the 1st rank in the automatic and the 3rd rank in the human evaluation of the challenge.
- Score: 46.93744191416991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper summarizes our submission to Task 2 of the second track of the
10th Dialog System Technology Challenge (DSTC10) "Knowledge-grounded
Task-oriented Dialogue Modeling on Spoken Conversations". Similar to the
previous year's iteration, the task consists of three subtasks: detecting
whether a turn is knowledge seeking, selecting the relevant knowledge document
and finally generating a grounded response. This year, the focus lies on
adapting the system to noisy ASR transcripts. We explore different approaches
to make the models more robust to this type of input and to adapt the generated
responses to the style of spoken conversations. For the latter, we get the best
results with a noisy channel model that additionally reduces the number of
short and generic responses. Our best system achieved the 1st rank in the
automatic and the 3rd rank in the human evaluation of the challenge.
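The noisy channel idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring functions `channel_logprob` and `prior_logprob`, the helper `rerank`, the `weight` parameter, and the example strings are all hypothetical stand-ins for trained models and real data. The sketch only shows the general principle of ranking candidate responses by log p(knowledge | response) + weight * log p(response) instead of by a direct response score alone.

```python
import math
import re
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase word tokens; a toy tokenizer for illustration only."""
    return re.findall(r"[a-z]+", text.lower())


def channel_logprob(knowledge: str, response: str) -> float:
    """Toy stand-in for log p(knowledge | response).

    Uses the add-one smoothed share of knowledge words that the
    response covers. A real system would use a trained model here.
    """
    k = set(tokens(knowledge))
    covered = len(k & set(tokens(response)))
    return math.log((covered + 1) / (len(k) + 1))


def prior_logprob(response: str) -> float:
    """Toy stand-in for log p(response): a fixed cost per token."""
    return -0.1 * len(tokens(response))


def rerank(candidates: List[str], knowledge: str, weight: float = 1.0) -> str:
    """Return the candidate with the highest noisy channel score."""
    return max(
        candidates,
        key=lambda r: channel_logprob(knowledge, r) + weight * prior_logprob(r),
    )


# Hypothetical example data, not taken from the challenge.
knowledge = "free parking for guests"
candidates = ["Yes.", "Yes, the hotel offers free parking for guests."]

best = rerank(candidates, knowledge)
prior_only = max(candidates, key=prior_logprob)
```

With these toy scores, the prior alone prefers the short answer "Yes.", while the combined score prefers the response that covers the knowledge document. This mirrors, in a very simplified form, the abstract's remark that the noisy channel model reduces short and generic responses.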
Related papers
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10 [40.05826687535019]
This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges.
In both iterations the task consists of three subtasks: first detect whether the current turn is knowledge seeking, second select a relevant knowledge document, and third generate a response grounded on the selected document.
arXiv Detail & Related papers (2023-04-14T12:46:29Z)
- Deploying a Retrieval based Response Model for Task Oriented Dialogues [8.671263996400844]
Task-oriented dialogue systems need to have high conversational capability, be easily adaptable to changing situations and conform to business constraints.
This paper describes a 3-step procedure to develop a conversational model that satisfies these criteria and can efficiently scale to rank a large set of response candidates.
arXiv Detail & Related papers (2022-10-25T23:10:19Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- End-to-end Spoken Conversational Question Answering: Task, Dataset and Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts.
We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows.
Our main objective is to build a system that handles conversational questions based on audio recordings, and to explore the plausibility of providing systems with more cues from different modalities for information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z)
- Smoothing Dialogue States for Open Conversational Machine Reading [70.83783364292438]
We propose an effective gating strategy that smooths the two dialogue states in a single decoder and bridges decision making and question generation.
Experiments on the OR-ShARC dataset show the effectiveness of our method, which achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-08-28T08:04:28Z)
- Learning to Retrieve Entity-Aware Knowledge and Generate Responses with Copy Mechanism for Task-Oriented Dialogue Systems [43.57597820119909]
This work addresses task-oriented conversational modeling with unstructured knowledge access, track 1 of the 9th Dialogue System Technology Challenge (DSTC 9).
The challenge can be separated into three subtasks: (1) knowledge-seeking turn detection, (2) knowledge selection, and (3) knowledge-grounded response generation.
We use pre-trained language models, ELECTRA and RoBERTa, as our base encoder for different subtasks.
arXiv Detail & Related papers (2020-12-22T11:36:37Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given the speech utterances and text corpora.
Our main objective is to build a QA system that handles conversational questions in both spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.