Innovative Bert-based Reranking Language Models for Speech Recognition
- URL: http://arxiv.org/abs/2104.04950v1
- Date: Sun, 11 Apr 2021 07:55:41 GMT
- Title: Innovative Bert-based Reranking Language Models for Speech Recognition
- Authors: Shih-Hsuan Chiu and Berlin Chen
- Abstract summary: We present a novel instantiation of BERT-based contextualized language models (LMs) for reranking the N-best hypotheses produced by automatic speech recognition (ASR).
To this end, we frame N-best hypothesis reranking with BERT as a prediction problem, which aims to predict the oracle hypothesis that has the lowest word error rate (WER) given the N-best hypotheses (denoted by PBERT).
In particular, we explore capitalizing on task-specific global topic information in an unsupervised manner to assist PBERT in N-best hypothesis reranking (denoted by TPBERT).
- Score: 15.762742686665652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: More recently, Bidirectional Encoder Representations from Transformers (BERT)
was proposed and has achieved impressive success on many natural language
processing (NLP) tasks such as question answering and language understanding,
due mainly to its effective pre-training then fine-tuning paradigm as well as
strong local contextual modeling ability. In view of the above, this paper
presents a novel instantiation of the BERT-based contextualized language models
(LMs) for use in reranking of N-best hypotheses produced by automatic speech
recognition (ASR). To this end, we frame N-best hypothesis reranking with BERT
as a prediction problem, which aims to predict the oracle hypothesis that has
the lowest word error rate (WER) given the N-best hypotheses (denoted by
PBERT). In particular, we also explore capitalizing on task-specific global
topic information in an unsupervised manner to assist PBERT in N-best
hypothesis reranking (denoted by TPBERT). Extensive experiments conducted on
the AMI benchmark corpus demonstrate the effectiveness and feasibility of our
methods in comparison to the conventional autoregressive models like the
recurrent neural network (RNN) and a recently proposed method that employed
BERT to compute pseudo-log-likelihood (PLL) scores for N-best hypothesis
reranking.
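To make the PBERT formulation concrete, here is a minimal sketch of prediction-based N-best reranking: a BERT encoder with a scalar scoring head assigns one score per hypothesis, is trained with cross-entropy against the oracle (lowest-WER) index, and reranks by argmax at inference. The model class, head, and example hypotheses are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of prediction-based N-best reranking (PBERT-style).
# Assumes the Hugging Face `transformers` library; module names are
# illustrative, not the authors' implementation.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class NBestReranker(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.score_head = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        # One [CLS] vector per hypothesis -> one scalar score per hypothesis.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.score_head(out.last_hidden_state[:, 0]).squeeze(-1)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = NBestReranker()

hypotheses = ["we will meet at nine", "we will meat at nine", "we will meet at night"]
batch = tokenizer(hypotheses, padding=True, return_tensors="pt")
scores = model(batch["input_ids"], batch["attention_mask"])

# Training would use cross-entropy against the oracle index (the hypothesis
# with the lowest WER); at inference we simply take the argmax.
best = hypotheses[scores.argmax().item()]
print(best)
```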
Related papers
- Enhancing adversarial robustness in Natural Language Inference using explanations [41.46494686136601]
We cast the spotlight on the underexplored task of Natural Language Inference (NLI).
We validate the use of natural language explanations as a model-agnostic defence strategy through extensive experimentation.
We study the correlation of widely used language generation metrics with human perception, so that they can serve as a proxy towards robust NLI models.
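As a toy illustration of the metric-vs-human correlation analysis mentioned above, one can rank-correlate automatic metric scores against human ratings; the numbers below are placeholders, not data from the paper.

```python
# Hypothetical sketch: how well does an automatic generation metric track
# human judgments? Measured via Spearman rank correlation.
from scipy.stats import spearmanr

metric_scores = [0.71, 0.42, 0.88, 0.35, 0.60]   # e.g., per-example metric values
human_scores  = [4.0,  2.5,  4.5,  2.0,  3.5]    # human ratings on the same examples

rho, pval = spearmanr(metric_scores, human_scores)
print(f"Spearman rho={rho:.3f} (p={pval:.3f})")
```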
arXiv Detail & Related papers (2024-09-11T17:09:49Z)
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
LLMs with a reasonable prompt and their generative capability can even correct tokens that are missing from the N-best list.
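A sketch of the prompting idea, under the assumption of a generic LLM client (`call_llm` is a placeholder, not part of the benchmark's code):

```python
# Illustrative sketch of LLM-based N-best error correction in the spirit of
# the HP benchmark; plug any chat/completions API into `call_llm`.
def build_prompt(nbest):
    listed = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(nbest))
    return (
        "The following are N-best hypotheses from a speech recognizer.\n"
        f"{listed}\n"
        "Output the most likely correct transcription, fixing any errors:"
    )

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

nbest = ["i saw a bear in the woods", "i saw a bare in the woods"]
# corrected = call_llm(build_prompt(nbest))
```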
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- A Semi-Bayesian Nonparametric Estimator of the Maximum Mean Discrepancy Measure: Applications in Goodness-of-Fit Testing and Generative Adversarial Networks [3.623570119514559]
We propose a semi-Bayesian nonparametric (semi-BNP) procedure for the goodness-of-fit (GOF) test.
Our method introduces a novel Bayesian estimator for the maximum mean discrepancy (MMD) measure.
We demonstrate that our proposed test outperforms frequentist MMD-based methods, achieving lower false rejection and false acceptance rates for the null hypothesis.
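For context, a sketch of the standard (frequentist) unbiased MMD^2 estimator with an RBF kernel, i.e., the quantity the paper's Bayesian estimator targets:

```python
# Unbiased MMD^2 estimator with an RBF kernel (frequentist baseline).
import numpy as np

def rbf(a, b, gamma=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2_unbiased(x, y, gamma=1.0):
    kxx, kyy, kxy = rbf(x, x, gamma), rbf(y, y, gamma), rbf(x, y, gamma)
    n, m = len(x), len(y)
    np.fill_diagonal(kxx, 0.0)   # drop i == j terms for unbiasedness
    np.fill_diagonal(kyy, 0.0)
    return kxx.sum() / (n * (n - 1)) + kyy.sum() / (m * (m - 1)) - 2 * kxy.mean()

rng = np.random.default_rng(0)
x, y = rng.normal(0.0, 1.0, (100, 2)), rng.normal(0.5, 1.0, (100, 2))
print(mmd2_unbiased(x, y))
```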
arXiv Detail & Related papers (2023-03-05T10:36:21Z)
- Self-Normalized Importance Sampling for Neural Language Modeling [97.96857871187052]
In this work, we propose self-normalized importance sampling. Compared to our previous work, the criteria considered here are self-normalized, so there is no need for a further correction step.
We show that our proposed self-normalized importance sampling is competitive in both research-oriented and production-oriented automatic speech recognition tasks.
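A minimal sketch of the self-normalization idea on a toy target/proposal pair (not the paper's LM training setup): because the weights are normalized to sum to one, unknown normalizing constants cancel and no separate correction step is needed.

```python
# Self-normalized importance sampling on a toy example.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 2.0, size=10_000)   # draws from proposal q = N(0, 2)

def log_p(x):  # unnormalized target, here N(1, 1)
    return -0.5 * (x - 1.0) ** 2

def log_q(x):  # unnormalized proposal, N(0, 2)
    return -0.5 * (x / 2.0) ** 2

w = np.exp(log_p(samples) - log_q(samples))
w /= w.sum()                                  # self-normalization
estimate = np.sum(w * samples)                # E_p[x], should be close to 1
print(estimate)
```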
arXiv Detail & Related papers (2021-11-11T16:57:53Z)
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
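A hedged sketch of what an entropy-style noise regularizer can look like: push the model toward a uniform predictive distribution on synthetic noise inputs while fitting the real task loss. The noise source and loss weight are illustrative guesses, not the paper's exact recipe.

```python
# Illustrative entropy-style regularizer on noise inputs (PyTorch).
import torch
import torch.nn.functional as F

def noise_regularized_loss(model, x, y, x_noise, lam=0.1):
    ce = F.cross_entropy(model(x), y)              # standard task loss
    log_probs = F.log_softmax(model(x_noise), dim=-1)
    # KL to uniform == maximizing predictive entropy on noise inputs.
    uniform = torch.full_like(log_probs, 1.0 / log_probs.size(-1))
    reg = F.kl_div(log_probs, uniform, reduction="batchmean")
    return ce + lam * reg
```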
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- Cross-sentence Neural Language Models for Conversational Speech Recognition [17.317583079824423]
We propose an effective cross-sentence neural LM approach that reranks the ASR N-best hypotheses of an upcoming sentence.
We also explore extracting task-specific global topical information from the cross-sentence history.
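A minimal sketch of the cross-sentence idea: score each hypothesis of the current sentence conditioned on the recognized history with an autoregressive LM. GPT-2 stands in for whichever LM is actually used; this is an illustration, not the authors' exact model.

```python
# History-conditioned N-best scoring with an autoregressive LM.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def score_with_history(history: str, hypothesis: str) -> float:
    ids = tok(history + " " + hypothesis, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)    # mean negative log-likelihood over tokens
    return -out.loss.item()          # higher is better

history = "so the next agenda item is the budget"
nbest = ["we should cut costs", "we should cut because"]
best = max(nbest, key=lambda h: score_with_history(history, h))
print(best)
```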
arXiv Detail & Related papers (2021-06-13T05:30:16Z)
- Explaining the Deep Natural Language Processing by Mining Textual Interpretable Features [3.819533618886143]
T-EBAnO provides prediction-local and class-based, model-global explanation strategies tailored to deep natural-language models.
It provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process.
arXiv Detail & Related papers (2021-06-12T06:25:09Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
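A sketch of full-text plausibility scoring via masked-LM (pseudo-log-likelihood-style) scoring, with no task-specific classification head; the masking scheme and normalization details here are assumptions, not the paper's exact procedure.

```python
# Score each candidate sentence by masking one token at a time and summing
# the masked LM's log-probabilities of the true tokens.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def plausibility(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt").input_ids
    total = 0.0
    for i in range(1, ids.size(1) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[0, i] = tok.mask_token_id
        with torch.no_grad():
            logits = mlm(masked).logits
        total += torch.log_softmax(logits[0, i], dim=-1)[ids[0, i]].item()
    return total

candidates = ["He boiled the water to make tea.", "He froze the water to make tea."]
best = max(candidates, key=plausibility)
print(best)
```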
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
- Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees [49.91477656517431]
Quantization-based solvers have been widely adopted in Federated Learning (FL).
No existing methods enjoy all the aforementioned properties.
We propose an intuitively simple yet theoretically sound method based on SIGNSGD to bridge the gap.
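For reference, a sketch of the SIGNSGD backbone with majority-vote aggregation that such methods build on; the stochastic quantization and theoretical machinery of the proposed method are omitted here.

```python
# SIGNSGD with majority-vote aggregation (illustrative backbone only).
import numpy as np

def worker_message(grad):
    return np.sign(grad)                          # 1 bit per coordinate

def server_aggregate(messages):
    return np.sign(np.sum(messages, axis=0))      # elementwise majority vote

grads = [np.random.randn(5) for _ in range(7)]    # one gradient per worker
update_direction = server_aggregate([worker_message(g) for g in grads])
# Parameter update: theta <- theta - lr * update_direction
print(update_direction)
```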
arXiv Detail & Related papers (2020-02-25T15:12:15Z)
- Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and downstream LU systems can be reduced significantly, by 14% relative, with joint models trained on small amounts of in-domain data.
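A hedged sketch of the joint setup: a shared encoder feeds a token-level correction head and an utterance-level LU (intent) head, and the two cross-entropy losses are summed. All module names and sizes are placeholders, not the paper's architecture.

```python
# Multi-task model: shared encoder, correction head + intent head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointCorrectionLU(nn.Module):
    def __init__(self, vocab_size, hidden, num_intents):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.correction_head = nn.Linear(hidden, vocab_size)   # per-token rewrite
        self.intent_head = nn.Linear(hidden, num_intents)      # utterance-level LU

    def forward(self, asr_tokens):
        h, _ = self.encoder(self.embed(asr_tokens))
        return self.correction_head(h), self.intent_head(h[:, -1])

def joint_loss(corr_logits, corr_targets, intent_logits, intent_target):
    l_corr = F.cross_entropy(corr_logits.transpose(1, 2), corr_targets)
    l_lu = F.cross_entropy(intent_logits, intent_target)
    return l_corr + l_lu                                       # equal weighting assumed
```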
arXiv Detail & Related papers (2020-01-28T22:09:25Z)