Related papers: Wakeword Detection under Distribution Shifts

URL: http://arxiv.org/abs/2207.06423v1
Date: Wed, 13 Jul 2022 17:35:08 GMT
Title: Wakeword Detection under Distribution Shifts
Authors: Sree Hari Krishnan Parthasarathi, Lu Zeng, Christin Jose, Joseph Wang
Abstract summary: We propose a novel approach for semi-supervised learning (SSL) designed to overcome distribution shifts between training and real-world data. We develop a teacher labeling strategy based on confidences to reduce entropy on the label distribution from the teacher model.
Score: 4.128269694687
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a novel approach for semi-supervised learning (SSL) designed to overcome distribution shifts between training and real-world data arising in the keyword spotting (KWS) task. Shifts from training data distribution are a key challenge for real-world KWS tasks: when a new model is deployed on device, the gating of the accepted data undergoes a shift in distribution, making the problem of timely updates via subsequent deployments hard. Despite the shift, we assume that the marginal distributions on labels do not change. We utilize a modified teacher/student training framework, where labeled training data is augmented with unlabeled data. Note that the teacher does not have access to the new distribution either. To train effectively with a mix of human and teacher labeled data, we develop a teacher labeling strategy based on confidence heuristics to reduce entropy on the label distribution from the teacher model; the data is then sampled to match the marginal distribution on the labels. Large-scale experimental results show that a convolutional neural network (CNN) trained on far-field audio, and evaluated on far-field audio drawn from a different distribution, obtains a 14.3% relative improvement in false discovery rate (FDR) at equal false reject rate (FRR), while yielding a 5% improvement in FDR under no distribution shift. Under a more severe distribution shift from far-field to near-field audio with a smaller fully connected network (FCN), our approach achieves a 52% relative improvement in FDR at equal FRR, while yielding a 20% relative improvement in FDR on the original distribution.
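
Illustration: the abstract describes two steps, confidence-based teacher labeling followed by sampling to match the label marginal. The sketch below is one possible reading of those steps, not the authors' implementation. The function names, the thresholds (0.1 and 0.9), and the binary wakeword/non-wakeword setup are all assumptions made for illustration.

```python
import random


def confidence_filter(scores, low=0.1, high=0.9):
    """Keep only examples the teacher is confident about.

    scores: hypothetical teacher posteriors P(wakeword | audio), one per
    unlabeled example. Examples between `low` and `high` are dropped, so
    the retained pseudo-labels have lower entropy. Thresholds are assumed.
    Returns a list of (index, pseudo_label) pairs.
    """
    labeled = []
    for i, p in enumerate(scores):
        if p >= high:
            labeled.append((i, 1))
        elif p <= low:
            labeled.append((i, 0))
    return labeled


def sample_to_match_prior(labeled, positive_prior, seed=0):
    """Subsample pseudo-labeled data so positives make up ~positive_prior."""
    if not 0.0 < positive_prior < 1.0:
        raise ValueError("positive_prior must be strictly between 0 and 1")
    rng = random.Random(seed)
    pos = [x for x in labeled if x[1] == 1]
    neg = [x for x in labeled if x[1] == 0]
    if not pos or not neg:
        return []
    # Largest total size for which both classes have enough examples.
    total = min(len(pos) / positive_prior, len(neg) / (1.0 - positive_prior))
    n_pos = min(len(pos), round(total * positive_prior))
    n_neg = min(len(neg), round(total * (1.0 - positive_prior)))
    return rng.sample(pos, n_pos) + rng.sample(neg, n_neg)


# Toy usage with made-up teacher scores.
teacher_scores = [0.95, 0.5, 0.02, 0.99, 0.05, 0.08, 0.6, 0.01]
confident = confidence_filter(teacher_scores)
balanced = sample_to_match_prior(confident, positive_prior=0.5)
```

In this toy run the two ambiguous scores (0.5 and 0.6) are discarded, and the remaining six pseudo-labels are subsampled to two positives and two negatives to match the assumed prior of 0.5.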

Related papers

Technical note on Fisher Information for Robust Federated Cross-Validation [3.5808917363708743]
We propose Fisher Information for Robust fEderated validation (FIRE). FIRE outperforms importance weighting benchmarks by up to 5.1% and federated learning benchmarks by up to 5.3% on shifted validation sets.
arXiv Detail & Related papers (2025-10-04T15:30:04Z)
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees [13.936051653540144]
We give the first efficient algorithms for learning neural networks with a distribution shift. We work in the Testable Learning with Distribution Shift framework.
arXiv Detail & Related papers (2025-02-22T00:48:03Z)
Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data. This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning. We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z)
Exploring Vacant Classes in Label-Skewed Federated Learning [113.65301899666645]
This paper introduces FedVLS, a novel approach to label-skewed federated learning. It integrates vacant-class distillation and logit suppression simultaneously. Experiments validate the efficacy of FedVLS, demonstrating superior performance compared to previous state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2024-01-04T16:06:31Z)
Dr. FERMI: A Stochastic Distributionally Robust Fair Empirical Risk Minimization Framework [12.734559823650887]
In the presence of distribution shifts, fair machine learning models may behave unfairly on test data. Existing algorithms require full access to data and cannot be used when small batches are used. This paper proposes the first distributionally robust fairness framework with convergence guarantees that do not require knowledge of the causal graph.
arXiv Detail & Related papers (2023-09-20T23:25:28Z)
Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation. We then analyze the sufficient conditions to guarantee fairness for the target dataset. Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively. The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model. In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration [18.59803726676361]
Instance-dependent label noise is realistic but rather challenging, where the label-corruption process depends on instances directly. It causes a severe distribution shift between the distributions of training and test data, which impairs the generalization of trained models. In this paper, to address the distribution shift in learning with instance-dependent label noise, a dynamic distribution-calibration strategy is adopted.
arXiv Detail & Related papers (2022-10-11T03:50:52Z)
Learnable Distribution Calibration for Few-Shot Class-Incremental Learning [122.2241120474278]
Few-shot class-incremental learning (FSCIL) faces challenges of memorizing old class distributions and estimating new class distributions given few training samples. We propose a learnable distribution calibration (LDC) approach, with the aim to systematically solve these two challenges using a unified framework.
arXiv Detail & Related papers (2022-10-01T09:40:26Z)
Federated Learning with Label Distribution Skew via Logits Calibration [26.98248192651355]
In this paper, we investigate the label distribution skew in FL, where the distribution of labels varies across clients. We propose FedLC, which calibrates the logits before softmax cross-entropy according to the probability of occurrence of each class. Experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model.
arXiv Detail & Related papers (2022-09-01T02:56:39Z)
How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts [107.72786199113183]
We propose a novel fairness learning method termed CUrvature MAtching (CUMA). CUMA achieves robust fairness generalizable to unseen domains with unknown distributional shifts. We evaluate our method on three popular fairness datasets.
arXiv Detail & Related papers (2022-07-04T02:37:50Z)
Confidence May Cheat: Self-Training on Graph Neural Networks under Distribution Shift [39.73304203101909]
Self-training methods have been widely adopted on graphs by labeling high-confidence unlabeled nodes and then adding them to the training step. We propose a novel Distribution Recovered Graph Self-Training framework (DR-GST), which could recover the distribution of the original labeled dataset. Both our theoretical analysis and extensive experiments on five benchmark datasets demonstrate the effectiveness of the proposed DR-GST.
arXiv Detail & Related papers (2022-01-27T07:12:27Z)
Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts. In particular, the density ratio estimation reflects the closeness of a target (test) sample to the source (training) distribution. We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z)
Robust Federated Learning: The Case of Affine Distribution Shifts [41.27887358989414]
We develop a robust federated learning algorithm that achieves satisfactory performance against distribution shifts in users' samples. We show that an affine distribution shift indeed suffices to significantly decrease the performance of the learnt classifier in a new test user.
arXiv Detail & Related papers (2020-06-16T03:43:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.