Open Challenge for Correcting Errors of Speech Recognition Systems
- URL: http://arxiv.org/abs/2001.03041v1
- Date: Thu, 9 Jan 2020 15:07:32 GMT
- Title: Open Challenge for Correcting Errors of Speech Recognition Systems
- Authors: Marek Kubis, Zygmunt Vetulani, Mikołaj Wypych, Tomasz
Ziętkiewicz
- Abstract summary: The goal of the challenge is to investigate methods of correcting the recognition results on the basis of previously made errors by the speech processing system.
The dataset prepared for the task is described and evaluation criteria are presented.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paper announces the new long-term challenge for improving the performance
of automatic speech recognition systems. The goal of the challenge is to
investigate methods of correcting the recognition results on the basis of
previously made errors by the speech processing system. The dataset prepared
for the task is described and evaluation criteria are presented.
Related papers
- Back Transcription as a Method for Evaluating Robustness of Natural
Language Understanding Models to Speech Recognition Errors [0.4681661603096333]
In a spoken dialogue system, an NLU model is preceded by a speech recognition system that can deteriorate the performance of natural language understanding.
This paper proposes a method for investigating the impact of speech recognition errors on the performance of natural language understanding models.
arXiv Detail & Related papers (2023-10-25T13:07:07Z) - ed-cec: improving rare word recognition using asr postprocessing based
on error detection and context-aware error correction [30.486396813844195]
We present a novel ASR postprocessing method that focuses on improving the recognition of rare words through error detection and context-aware error correction.
Experimental results across five datasets demonstrate that our proposed method achieves significantly lower word error rates (WERs) than previous approaches.
arXiv Detail & Related papers (2023-10-08T11:40:30Z) - Contextual-Utterance Training for Automatic Speech Recognition [65.4571135368178]
We propose a contextual-utterance training technique which makes use of the previous and future contextual utterances.
Also, we propose a dual-mode contextual-utterance training technique for streaming automatic speech recognition (ASR) systems.
The proposed technique reduces the WER and the average last token emission latency by more than 6% and 40ms relative, respectively.
arXiv Detail & Related papers (2022-10-27T08:10:44Z) - End-to-end Speech-to-Punctuated-Text Recognition [23.44236710364419]
Punctuation marks are important for the readability of speech recognition results.
Conventional automatic speech recognition systems do not produce punctuation marks.
We propose an end-to-end model that takes speech as input and outputs punctuated texts.
arXiv Detail & Related papers (2022-07-07T08:58:01Z) - Towards End-to-end Unsupervised Speech Recognition [120.4915001021405]
We introduce wvu which does away with all audio-side pre-processing and improves accuracy through better architecture.
In addition, we introduce an auxiliary self-supervised objective that ties model predictions back to the input.
Experiments show that wvu improves unsupervised recognition results across different languages while being conceptually simpler.
arXiv Detail & Related papers (2022-04-05T21:22:38Z) - Curriculum optimization for low-resource speech recognition [4.803994937990389]
We propose an automated curriculum learning approach to optimize the sequence of training examples.
We introduce a new difficulty measure called compression ratio that can be used as a scoring function for raw audio in various noise conditions.
arXiv Detail & Related papers (2022-02-17T19:47:50Z) - Recent Progress in the CUHK Dysarthric Speech Recognition System [66.69024814159447]
Disordered speech presents a wide spectrum of challenges to current data-intensive automatic speech recognition technologies based on deep neural networks (DNNs).
This paper presents recent research efforts at the Chinese University of Hong Kong to improve the performance of disordered speech recognition systems.
arXiv Detail & Related papers (2022-01-15T13:02:40Z) - Contextualized Attention-based Knowledge Transfer for Spoken
Conversational Question Answering [63.72278693825945]
Spoken conversational question answering (SCQA) requires machines to model complex dialogue flow.
We propose CADNet, a novel contextualized attention-based distillation approach.
We conduct extensive experiments on the Spoken-CoQA dataset and demonstrate that our approach achieves remarkable performance.
arXiv Detail & Related papers (2020-10-21T15:17:18Z) - A Machine of Few Words -- Interactive Speaker Recognition with
Reinforcement Learning [35.36769027019856]
We present a new paradigm for automatic speaker recognition that we call Interactive Speaker Recognition (ISR).
In this paradigm, the recognition system aims to incrementally build a representation of the speakers by requesting personalized utterances.
We show that our method achieves excellent performance while using only small amounts of speech signal.
arXiv Detail & Related papers (2020-08-07T12:44:08Z) - Segment Aggregation for short utterances speaker verification using raw
waveforms [47.41124427552161]
We propose a method that compensates for the performance degradation of speaker verification for short utterances.
The proposed method adopts an ensemble-based design to improve the stability and accuracy of speaker verification systems.
arXiv Detail & Related papers (2020-05-07T08:57:22Z) - Deep Speaker Embeddings for Far-Field Speaker Recognition on Short
Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at achieving two goals: a) improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reducing the system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.