Training Speech Recognition Models with Federated Learning: A
Quality/Cost Framework
- URL: http://arxiv.org/abs/2010.15965v2
- Date: Fri, 14 May 2021 18:49:19 GMT
- Title: Training Speech Recognition Models with Federated Learning: A
Quality/Cost Framework
- Authors: Dhruv Guliani, Francoise Beaufays, Giovanni Motta
- Abstract summary: We propose using federated learning, a decentralized on-device learning paradigm, to train speech recognition models.
By performing epochs of training on a per-user basis, federated learning must incur the cost of dealing with non-IID data distributions.
- Score: 4.125187280299247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose using federated learning, a decentralized on-device learning
paradigm, to train speech recognition models. By performing epochs of training
on a per-user basis, federated learning must incur the cost of dealing with
non-IID data distributions, which are expected to negatively affect the quality
of the trained model. We propose a framework by which the degree of
non-IID-ness can be varied, consequently illustrating a trade-off between model
quality and the computational cost of federated training, which we capture
through a novel metric. Finally, we demonstrate that hyper-parameter
optimization and appropriate use of variational noise are sufficient to
compensate for the quality impact of non-IID distributions, while decreasing
the cost.
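To make the setup concrete, below is a minimal sketch of federated averaging with per-user local epochs and Gaussian weight perturbations standing in for variational noise. This is not the authors' implementation: the toy least-squares objective, the synthetic per-user shards, and the names client_update, local_epochs, and noise_std are assumptions made purely for illustration.

```python
# Illustrative sketch only (not the paper's code): FedAvg over per-user shards,
# with Gaussian weight noise during local steps as a stand-in for variational noise.
import numpy as np

rng = np.random.default_rng(0)

def client_update(weights, user_data, local_epochs=1, lr=0.005, noise_std=0.01):
    """Run one user's local training; noise_std > 0 perturbs weights at each step."""
    w = weights.copy()
    for _ in range(local_epochs):
        for x, y in user_data:
            w_noisy = w + rng.normal(0.0, noise_std, size=w.shape)  # weight noise
            grad = 2.0 * x * (np.dot(w_noisy, x) - y)               # toy squared-error gradient
            w -= lr * grad
    return w

def federated_round(weights, users, **kwargs):
    """FedAvg: average the locally updated weights across participating users."""
    updates = [client_update(weights, data, **kwargs) for data in users]
    return np.mean(updates, axis=0)

# Non-IID, per-user shards: each user's features are drawn from a different region.
users = [[(rng.normal(loc=u, size=4), float(u)) for _ in range(20)] for u in range(5)]

w = np.zeros(4)
for _ in range(10):
    w = federated_round(w, users, local_epochs=2, noise_std=0.01)
print(w)
```

Raising local_epochs shifts more computation onto each user before averaging, which is one of the knobs behind the quality/cost trade-off the abstract describes for non-IID federated training.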
Related papers
- Towards Robust Federated Learning via Logits Calibration on Non-IID Data [49.286558007937856]
Federated learning (FL) is a privacy-preserving, distributed framework based on collaborative model training across devices in edge networks.
Recent studies have shown that FL is vulnerable to adversarial examples, leading to a significant drop in its performance.
In this work, we adopt the adversarial training (AT) framework to improve the robustness of FL models against adversarial example (AE) attacks.
arXiv Detail & Related papers (2024-03-05T09:18:29Z)
- Dependable Distributed Training of Compressed Machine Learning Models [16.403297089086042]
We propose DepL, a framework for dependable learning orchestration.
It makes high-quality, efficient decisions on (i) the data to leverage for learning, (ii) the models to use and when to switch among them, and (iii) the clusters of nodes, and the resources thereof, to exploit.
We prove that DepL has a constant competitive ratio and complexity, and show that it outperforms the state of the art by over 27%.
arXiv Detail & Related papers (2024-02-22T07:24:26Z)
- Federated Learning While Providing Model as a Service: Joint Training and Inference Optimization [30.305956110710266]
Federated learning enables models to be trained across distributed clients.
Existing work has overlooked the coexistence of model training and inference under clients' limited resources.
This paper focuses on the joint optimization of model training and inference to maximize inference performance at clients.
arXiv Detail & Related papers (2023-12-20T09:27:09Z)
- Digital Twin-Assisted Knowledge Distillation Framework for Heterogeneous Federated Learning [14.003355837801879]
A knowledge distillation (KD)-driven training framework for federated learning is proposed.
Each user can select its neural network model on demand and distill knowledge from a big teacher model using its own private dataset.
A digital twin (DT) is exploited so that the teacher model can be trained at the DT, which is located in the server and has sufficient computing resources.
arXiv Detail & Related papers (2023-03-10T15:14:24Z)
- Post-hoc Uncertainty Learning using a Dirichlet Meta-Model [28.522673618527417]
We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities.
Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties.
We demonstrate the flexibility and superior empirical performance of the proposed meta-model approach on several applications.
arXiv Detail & Related papers (2022-12-14T17:34:11Z)
- FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed with influence functions using a validation set with sensitive attributes.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z)
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- Model-Augmented Q-learning [112.86795579978802]
We propose a model-free RL (MFRL) framework that is augmented with components of model-based RL.
Specifically, we propose to estimate not only the $Q$-values but also both the transition and the reward with a shared network.
We show that the proposed scheme, called Model-augmented $Q$-learning (MQL), obtains a policy-invariant solution which is identical to the solution obtained by learning with the true reward.
arXiv Detail & Related papers (2021-02-07T17:56:50Z)
- Unsupervised neural adaptation model based on optimal transport for spoken language identification [54.96267179988487]
Due to the mismatch of statistical distributions of acoustic speech between training and testing sets, the performance of spoken language identification (SLID) could be drastically degraded.
We propose an unsupervised neural adaptation model to deal with the distribution mismatch problem for SLID.
arXiv Detail & Related papers (2020-12-24T07:37:19Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- Training Keyword Spotting Models on Non-IID Data with Federated Learning [6.784774147680782]
We show that a production-quality keyword-spotting model can be trained on-device using federated learning.
To overcome the algorithmic constraints associated with fitting on-device data, we conduct thorough empirical studies of optimization algorithms.
Given zero visibility into on-device data, we explore teacher-student training to label examples (a generic sketch of this recipe follows this list).
arXiv Detail & Related papers (2020-05-21T00:53:33Z)
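The teacher-student labeling idea referenced in the last entry can be illustrated in a few lines. The sketch below is a generic rendition under assumed linear models and synthetic features, not the cited paper's keyword-spotting pipeline: a frozen teacher assigns soft labels to unlabeled examples, and a student is trained to match those targets.

```python
# Generic teacher-student labeling sketch (illustrative assumptions throughout):
# a frozen "teacher" soft-labels unlabeled examples, a "student" fits those targets.
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Unlabeled features standing in for data the server cannot inspect directly.
x = rng.normal(size=(256, 8))

# Frozen teacher: a fixed linear classifier playing the role of a large server model.
teacher_w = rng.normal(size=(8, 3))
soft_labels = softmax(x @ teacher_w)

# Student: trained with cross-entropy against the teacher's soft labels.
student_w = np.zeros((8, 3))
for _ in range(200):
    probs = softmax(x @ student_w)
    grad = x.T @ (probs - soft_labels) / len(x)  # gradient of soft-target cross-entropy
    student_w -= 0.5 * grad

print("student/teacher agreement:", (probs.argmax(1) == soft_labels.argmax(1)).mean())
```

How the teacher and student are actually split across server and devices is specific to the cited work and is not modeled here.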
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information above and is not responsible for any consequences of its use.