Confidence-based Ensembles of End-to-End Speech Recognition Models
- URL: http://arxiv.org/abs/2306.15824v1
- Date: Tue, 27 Jun 2023 23:13:43 GMT
- Title: Confidence-based Ensembles of End-to-End Speech Recognition Models
- Authors: Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg
- Abstract summary: We show that a confidence-based ensemble of 5 monolingual models outperforms a system where model selection is performed via a dedicated language identification block.
We also demonstrate that it is possible to combine base and adapted models to achieve strong results on both original and target data.
- Score: 71.65982591023581
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The number of end-to-end speech recognition models grows every year. These
models are often adapted to new domains or languages resulting in a
proliferation of expert systems that achieve great results on target data,
while generally showing inferior performance outside of their domain of
expertise. We explore the combination of such experts via confidence-based
ensembles: ensembles of models where only the output of the most-confident
model is used. We assume that models' target data is not available except for a
small validation set. We demonstrate the effectiveness of our approach with two
applications. First, we show that a confidence-based ensemble of 5 monolingual
models outperforms a system where model selection is performed via a dedicated
language identification block. Second, we demonstrate that it is possible to
combine base and adapted models to achieve strong results on both original and
target data. We validate all our results on multiple datasets and model
architectures.
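The selection rule described in the abstract (use only the output of the most-confident model) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Hypothesis` type, the `Model` callable interface, and the stub models are hypothetical, and the sketch assumes each model already reports a confidence score that is comparable across models. Any calibration on the small validation set mentioned in the abstract is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Hypothesis:
    text: str          # transcript produced by one model
    confidence: float  # higher means more confident (assumed comparable across models)


# Hypothetical interface: a model is any callable mapping raw audio to a Hypothesis.
Model = Callable[[bytes], Hypothesis]


def ensemble_transcribe(models: Sequence[Model], audio: bytes) -> Hypothesis:
    """Run every model on the audio and keep only the most-confident output."""
    if not models:
        raise ValueError("the ensemble needs at least one model")
    hypotheses = [model(audio) for model in models]
    # max() returns the first maximal element, so ties go to the earlier model.
    return max(hypotheses, key=lambda h: h.confidence)


# Stub models standing in for real monolingual recognizers (illustration only).
def english_stub(audio: bytes) -> Hypothesis:
    return Hypothesis(text="hello world", confidence=0.9)


def spanish_stub(audio: bytes) -> Hypothesis:
    return Hypothesis(text="hola mundo", confidence=0.4)


best = ensemble_transcribe([english_stub, spanish_stub], b"")
print(best.text)
```

Every model runs on every input here, so inference cost grows linearly with the ensemble size; in exchange, no separate language identification block is needed to route the audio.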
Related papers
- Knowledge Fusion By Evolving Weights of Language Models [5.354527640064584]
This paper examines the approach of integrating multiple models into a unified model.
We propose a knowledge fusion method named Evolver, inspired by evolutionary algorithms.
arXiv Detail & Related papers (2024-06-18T02:12:34Z)
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate on the overall cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Artificial Interrogation for Attributing Language Models [0.0]
The challenge provides twelve open-sourced base versions of popular language models and twelve fine-tuned language models for text generation.
The goal of the contest is to identify which fine-tuned models originated from which base model.
We have employed four distinct approaches for measuring the resemblance between the responses generated from the models of both sets.
arXiv Detail & Related papers (2022-11-20T05:46:29Z)
- Language Models are General-Purpose Interfaces [109.45478241369655]
We propose to use language models as a general-purpose interface to various foundation models.
A collection of pretrained encoders perceives diverse modalities (such as vision and language).
We propose a semi-causal language modeling objective to jointly pretrain the interface and the modular encoders.
arXiv Detail & Related papers (2022-06-13T17:34:22Z)
- Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation [4.339613097080119]
In low-resource scenarios, artifacts of the data collection can yield data sets that are outliers, potentially making conclusions about model performance coincidental.
We compare three broad classes of models with different parameterizations, taking data from 11 languages across 6 language families.
The results demonstrate that the extent of model generalization depends on the characteristics of the data set, and does not necessarily rely heavily on the data set size.
arXiv Detail & Related papers (2022-01-05T22:19:10Z)
- Improving Label Quality by Jointly Modeling Items and Annotators [68.8204255655161]
We propose a fully Bayesian framework for learning ground truth labels from noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model.
arXiv Detail & Related papers (2021-06-20T02:15:20Z)