Modeling Dependent Structure for Utterances in ASR Evaluation
- URL: http://arxiv.org/abs/2209.05281v1
- Date: Wed, 7 Sep 2022 21:51:06 GMT
- Title: Modeling Dependent Structure for Utterances in ASR Evaluation
- Authors: Zhe Liu and Fuchun Peng
- Abstract summary: Bootstrap resampling has been popular for performing significance analysis on word error rate (WER) in automatic speech recognition (ASR) evaluations.
The blockwise bootstrap approach divides utterances into uncorrelated blocks and resamples these blocks instead of the original data.
We show that the resulting variance estimator for WER is consistent under mild conditions.
- Score: 16.559092192445917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The bootstrap resampling method has been popular for performing significance
analysis on word error rate (WER) in automatic speech recognition (ASR)
evaluations. To deal with the issue of dependent speech data, the blockwise
bootstrap approach has also been proposed: it divides utterances into
uncorrelated blocks and resamples these blocks instead of the original data.
However, it is always nontrivial to uncover the dependent structure among
utterances, which could lead to subjective findings in statistical testing. In
this paper, we present graphical lasso based methods to explicitly model such
dependency and estimate the independent blocks of utterances in a rigorous way.
Then the blockwise bootstrap is applied on top of the inferred blocks. We show
that the resulting variance estimator for WER is consistent under mild
conditions. We also demonstrate the validity of the proposed approach on
LibriSpeech data.
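
The blockwise bootstrap described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the block assignment is already known (for example, one block per speaker), whereas the paper infers the blocks with graphical lasso. The function name `blockwise_bootstrap_wer`, its parameters, and the toy counts are all hypothetical.

```python
import random
from collections import defaultdict


def blockwise_bootstrap_wer(errors, ref_lengths, block_ids, n_boot=1000, seed=0):
    """Return (WER, bootstrap standard error, 95% percentile interval).

    errors[i]      -- word errors (sub + del + ins) in utterance i
    ref_lengths[i] -- number of reference words in utterance i
    block_ids[i]   -- label of the block utterance i belongs to
                      (assumed given here; the paper estimates it)
    """
    rng = random.Random(seed)

    # Pool error and reference-word counts within each block.
    pooled = defaultdict(lambda: [0, 0])
    for e, n, b in zip(errors, ref_lengths, block_ids):
        pooled[b][0] += e
        pooled[b][1] += n
    blocks = list(pooled.values())

    # Point estimate: total errors over total reference words.
    wer = sum(e for e, _ in blocks) / sum(n for _, n in blocks)

    # Resample whole blocks with replacement and recompute WER each time.
    replicates = []
    for _ in range(n_boot):
        sample = rng.choices(blocks, k=len(blocks))
        replicates.append(sum(e for e, _ in sample) / sum(n for _, n in sample))

    mean = sum(replicates) / n_boot
    variance = sum((r - mean) ** 2 for r in replicates) / (n_boot - 1)
    replicates.sort()
    interval = (replicates[int(0.025 * n_boot)], replicates[int(0.975 * n_boot) - 1])
    return wer, variance ** 0.5, interval


# Toy data: six utterances grouped into three hypothetical blocks.
errors = [1, 0, 2, 1, 0, 3]
ref_lengths = [10, 8, 12, 9, 7, 11]
block_ids = ["a", "a", "b", "b", "c", "c"]
wer, std_err, ci = blockwise_bootstrap_wer(errors, ref_lengths, block_ids)
print(f"WER={wer:.4f}  SE={std_err:.4f}  95% CI=({ci[0]:.4f}, {ci[1]:.4f})")
```

Resampling blocks rather than individual utterances keeps dependent utterances together in every replicate, which is why the quality of the block assignment matters for the resulting variance estimate.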
Related papers
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that inherits from another perspective, i.e., the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- BASS: Block-wise Adaptation for Speech Summarization [47.518484305407185]
We develop a method that allows one to train summarization models on very long sequences in an incremental manner.
Speech summarization is realized as a streaming process, where hypothesis summaries are updated every block.
Experiments on the How2 dataset demonstrate that the proposed block-wise training method improves by 3 points absolute on ROUGE-L over a truncated input baseline.
arXiv Detail & Related papers (2023-07-17T03:31:36Z)
- Bring Your Own Data! Self-Supervised Evaluation for Large Language Models [52.15056231665816]
We propose a framework for self-supervised evaluation of Large Language Models (LLMs).
We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence.
We find strong correlations between self-supervised and human-supervised evaluations.
arXiv Detail & Related papers (2023-06-23T17:59:09Z)
- Zero-Shot Automatic Pronunciation Assessment [19.971348810774046]
We propose a novel zero-shot APA method based on the pre-trained acoustic model, HuBERT.
Experimental results on speechocean762 demonstrate that the proposed method achieves comparable performance to supervised regression baselines.
arXiv Detail & Related papers (2023-05-31T05:17:17Z)
- Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
- AB/BA analysis: A framework for estimating keyword spotting recall improvement while maintaining audio privacy [0.0]
KWS is designed to only collect data when the keyword is present, limiting the availability of hard samples that may contain false negatives.
We propose an evaluation technique which we call AB/BA analysis.
We show that AB/BA analysis is successful at measuring recall improvement in conjunction with the trade-off in relative false positive rate.
arXiv Detail & Related papers (2022-04-18T13:52:22Z)
- Statistical Estimation from Dependent Data [37.73584699735133]
We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors.
We model these dependencies in the language of Markov Random Fields.
We provide algorithms and statistically efficient estimation rates for this model.
arXiv Detail & Related papers (2021-07-20T21:18:06Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
- TSInsight: A local-global attribution framework for interpretability in time-series data [5.174367472975529]
We propose attaching an auto-encoder to the classifier, with a sparsity-inducing norm on its output, and fine-tuning it based on the gradients from the classifier and a reconstruction penalty.
TSInsight learns to preserve features that are important for prediction by the classifier and suppresses those that are irrelevant.
In contrast to most other attribution frameworks, TSInsight is capable of generating both instance-based and model-based explanations.
arXiv Detail & Related papers (2020-04-06T19:34:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.