Knowledge Distillation for Improved Accuracy in Spoken Question
Answering
- URL: http://arxiv.org/abs/2010.11067v3
- Date: Thu, 1 Apr 2021 02:26:27 GMT
- Title: Knowledge Distillation for Improved Accuracy in Spoken Question
Answering
- Authors: Chenyu You, Nuo Chen, Yuexian Zou
- Abstract summary: We devise a training strategy to perform knowledge distillation from spoken documents and written counterparts.
Our work takes a step towards distilling knowledge from the language model as a supervision signal.
Experiments demonstrate that our approach outperforms several state-of-the-art language models on the Spoken-SQuAD dataset.
- Score: 63.72278693825945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spoken question answering (SQA) is a challenging task that requires the
machine to fully understand the complex spoken documents. Automatic speech
recognition (ASR) plays a significant role in the development of QA systems.
However, recent work shows that ASR systems generate highly noisy
transcripts, which critically limit the capability of machine comprehension on
the SQA task. To address the issue, we present a novel distillation framework.
Specifically, we devise a training strategy to perform knowledge distillation
(KD) from spoken documents and written counterparts. Our work takes a step
towards distilling knowledge from the language model as a supervision signal,
improving student accuracy by reducing the misalignment between automatic
and manual transcriptions. Experiments demonstrate that our approach
outperforms several state-of-the-art language models on the Spoken-SQuAD
dataset.
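As a rough illustration of the knowledge distillation idea described in the abstract, the sketch below combines a hard-label loss with a soft-label loss from a teacher. It assumes the standard temperature-scaled formulation (cross-entropy plus KL divergence); the abstract does not state the exact objective, so this should not be read as the authors' loss. The function names, the `temperature` and `alpha` parameters, and the framing of logits as scores over candidate answer-start positions (teacher reading the manual transcript, student reading the ASR transcript) are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Generic KD objective (an assumption, not the paper's stated loss).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between softened teacher and student distributions.
    """
    # Hard loss: cross-entropy of the student against the gold position.
    student_probs = softmax(student_logits)
    hard = -math.log(student_probs[gold_index])

    # Soft loss: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # The T^2 factor keeps soft-loss gradients on a comparable scale.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


# Hypothetical logits over four candidate answer-start positions.
teacher = [0.2, 3.1, 0.4, -1.0]   # e.g. model reading the manual transcript
student = [0.9, 1.5, 1.2, -0.3]   # e.g. model reading the ASR transcript
loss = distillation_loss(student, teacher, gold_index=1)
```

When the student already matches the teacher, the KL term vanishes and only the weighted hard-label loss remains, which is a quick sanity check for any implementation of this kind.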
Related papers
- End-to-end Spoken Conversational Question Answering: Task, Dataset and
Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts.
We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows.
Our main objective is to build a system that handles conversational questions based on audio recordings, and to explore the plausibility of giving systems more cues from different modalities during information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z)
- DUAL: Textless Spoken Question Answering with Speech Discrete Unit
Adaptive Learning [66.71308154398176]
Spoken Question Answering (SQA) has gained research attention and made remarkable progress in recent years.
Existing SQA methods rely on Automatic Speech Recognition (ASR) transcripts, which are time- and cost-prohibitive to collect.
This work proposes an ASR transcript-free SQA framework named Discrete Unit Adaptive Learning (DUAL), which leverages unlabeled data for pre-training and is fine-tuned by the SQA downstream task.
arXiv Detail & Related papers (2022-03-09T17:46:22Z)
- An Initial Investigation of Non-Native Spoken Question-Answering [36.89541375786233]
We show that a simple text-based ELECTRA MC model trained on SQuAD2.0 transfers well for spoken question answering tests.
One significant challenge is the lack of appropriately annotated speech corpora to train systems for this task.
Mismatches must be considered between text documents and spoken responses, and between non-native spoken grammar and written grammar.
arXiv Detail & Related papers (2021-07-09T21:59:16Z)
- Contextualized Attention-based Knowledge Transfer for Spoken
Conversational Question Answering [63.72278693825945]
Spoken conversational question answering (SCQA) requires machines to model complex dialogue flow.
We propose CADNet, a novel contextualized attention-based distillation approach.
We conduct extensive experiments on the Spoken-CoQA dataset and demonstrate that our approach achieves remarkable performance.
arXiv Detail & Related papers (2020-10-21T15:17:18Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question
Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given speech utterances and text corpora.
Our main objective is to build a QA system that handles conversational questions in both spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
- Improving Readability for Automatic Speech Recognition Transcription [50.86019112545596]
We propose a novel NLP task called ASR post-processing for readability (APR).
APR aims to transform noisy ASR output into readable text for humans and downstream tasks while preserving the speaker's intended meaning.
We compare fine-tuned models based on several open-sourced and adapted pre-trained models with the traditional pipeline method.
arXiv Detail & Related papers (2020-04-09T09:26:42Z)