Towards Robust Waveform-Based Acoustic Models
- URL: http://arxiv.org/abs/2110.08634v1
- Date: Sat, 16 Oct 2021 18:21:34 GMT
- Title: Towards Robust Waveform-Based Acoustic Models
- Authors: Dino Oglic, Zoran Cvetkovic, Peter Sollich, Steve Renals, and Bin Yu
- Abstract summary: We propose an approach for learning robust acoustic models in adverse environments, characterized by a significant mismatch between training and test conditions.
Our approach is an instance of vicinal risk minimization, which aims to improve risk estimates during training by replacing the delta functions that define the empirical density over the input space with an approximation of the marginal population density in the vicinity of the training samples.
- Score: 41.82019240477273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an approach for learning robust acoustic models in adverse
environments, characterized by a significant mismatch between training and test
conditions. This problem is of paramount importance for the deployment of
speech recognition systems that need to perform well in unseen environments.
Our approach is an instance of vicinal risk minimization, which aims to improve
risk estimates during training by replacing the delta functions that define the
empirical density over the input space with an approximation of the marginal
population density in the vicinity of the training samples. More specifically,
we assume that local neighborhoods centered at training samples can be
approximated using a mixture of Gaussians, and demonstrate theoretically that
this can incorporate robust inductive bias into the learning process. We
characterize the individual mixture components implicitly via data augmentation
schemes, designed to address common sources of spurious correlations in
acoustic models. To avoid potential confounding effects on robustness due to
information loss, which has been associated with standard feature extraction
techniques (e.g., FBANK and MFCC features), we focus our evaluation on the
waveform-based setting. Our empirical results show that the proposed approach
can generalize to unseen noise conditions, with 150% relative improvement in
out-of-distribution generalization compared to training using the standard risk
minimization principle. Moreover, the results demonstrate competitive
performance relative to models learned using a training sample designed to
match the acoustic conditions characteristic of test utterances (i.e., optimal
vicinal densities).
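The core idea of vicinal risk minimization described above can be illustrated with a minimal sketch: instead of evaluating the loss at each training waveform alone (a delta function), the loss is averaged over samples drawn from a Gaussian vicinal density centered at that waveform. This is a simplified illustration, not the paper's method; in the paper the mixture components are defined implicitly via domain-specific augmentation schemes, and `noise_std` here is a purely illustrative parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

def vicinal_samples(waveform, num_draws=4, noise_std=0.01):
    """Draw samples from an isotropic Gaussian vicinal density centered
    at a training waveform (illustrative stand-in for the paper's
    augmentation-defined mixture components)."""
    return [waveform + rng.normal(0.0, noise_std, size=waveform.shape)
            for _ in range(num_draws)]

def vicinal_risk(loss_fn, model, waveform, label, num_draws=4):
    """Average the loss over vicinal draws; this replaces the
    single-point (delta-function) risk estimate for the sample."""
    draws = vicinal_samples(waveform, num_draws)
    return float(np.mean([loss_fn(model(x), label) for x in draws]))
```

In practice the draws would be generated on the fly inside the training loop, so each epoch sees a fresh approximation of the marginal population density around every training sample.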
Related papers
- Disentangled Noisy Correspondence Learning [56.06801962154915]
Cross-modal retrieval is crucial in understanding latent correspondences across modalities.
DisNCL is a novel information-theoretic framework for feature Disentanglement in Noisy Correspondence Learning.
arXiv Detail & Related papers (2024-08-10T09:49:55Z) - Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]

Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z) - May the Noise be with you: Adversarial Training without Adversarial Examples [3.4673556247932225]
We investigate the question: Can we obtain adversarially-trained models without training on adversarial examples?
Our proposed approach incorporates inherent stochasticity by embedding Gaussian noise within the layers of the NN model at training time.
Our work contributes adversarially trained networks using a completely different approach, with empirically similar robustness to adversarial training.
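The noise-injection idea summarized above can be sketched in a few lines: a small feed-forward network adds Gaussian noise to each hidden activation during training and runs deterministically at evaluation time. This is a hedged illustration of the general technique, not the cited paper's implementation; the layer sizes, `noise_std`, and the plain-NumPy MLP are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_forward(x, weights, noise_std=0.05, train=True):
    """Forward pass of a small MLP that injects Gaussian noise into
    each hidden activation at training time only (illustrative sketch
    of layer-wise noise injection; parameters are assumptions)."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(h @ W, 0.0)  # ReLU hidden layer
        if train:
            # Perturb activations so the model never sees the same
            # internal representation twice during training.
            h = h + rng.normal(0.0, noise_std, size=h.shape)
    return h @ weights[-1]  # linear output layer, no noise
```

Setting `train=False` disables the perturbation, so inference is deterministic while training optimizes over noisy internal representations.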
arXiv Detail & Related papers (2023-12-12T08:22:28Z) - Noisy-ArcMix: Additive Noisy Angular Margin Loss Combined With Mixup for Anomalous Sound Detection [5.1308092683559225]
Unsupervised anomalous sound detection (ASD) aims to identify anomalous sounds by learning the features of normal operational sounds and sensing their deviations.
Recent approaches have focused on the self-supervised task utilizing the classification of normal data, and advanced models have shown that securing representation space for anomalous data is important.
We propose a training technique aimed at ensuring intra-class compactness and increasing the angle gap between normal and abnormal samples.
arXiv Detail & Related papers (2023-10-10T07:04:36Z) - Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z) - Learning with Noisy Labels through Learnable Weighting and Centroid Similarity [5.187216033152917]
Noisy labels are prevalent in domains such as medical diagnosis and autonomous driving.
We introduce a novel method for training machine learning models in the presence of noisy labels.
Our results show that our method consistently outperforms the existing state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-16T16:43:24Z) - Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise [62.997667081978825]
In high-risk environments, deep learning models need to be able to judge their uncertainty and reject inputs when there is a significant chance of misclassification.
We conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole Slide Images.
We observe that ensembles of methods generally lead to better uncertainty estimates as well as an increased robustness towards domain shifts and label noise.
arXiv Detail & Related papers (2023-01-03T11:34:36Z) - Risk-Sensitive Reinforcement Learning with Exponential Criteria [0.0]
We provide a definition of robust reinforcement learning policies and formulate a risk-sensitive reinforcement learning problem to approximate them.
We introduce a novel online Actor-Critic algorithm based on solving a multiplicative Bellman equation using approximation updates.
The implementation, performance, and robustness properties of the proposed methods are evaluated in simulated experiments.
arXiv Detail & Related papers (2022-12-18T04:44:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information above and is not responsible for any consequences of its use.