Federated Distillation of Natural Language Understanding with Confident
Sinkhorns
- URL: http://arxiv.org/abs/2110.02432v1
- Date: Wed, 6 Oct 2021 00:44:00 GMT
- Title: Federated Distillation of Natural Language Understanding with Confident
Sinkhorns
- Authors: Rishabh Bhardwaj, Tushar Vaidya, Soujanya Poria
- Abstract summary: We propose an approach to learn a central (global) model from the federation of (local) models trained on user-devices.
To learn the global model, the objective is to minimize the optimal transport cost of the global model's predictions from the confident sum of soft-targets assigned by local models.
- Score: 12.681983862338619
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Enhancing the user experience is an essential task for application service
providers. For instance, two users living far apart may have different tastes
in food. A food recommender mobile application installed on an edge device
might want to learn from user feedback (reviews) to satisfy the client's needs
pertaining to distinct domains. Retrieving user data comes at the cost of
privacy, while requesting the model parameters trained on each user device becomes
space-inefficient at large scale. In this work, we propose an approach to
learn a central (global) model from the federation of (local) models which are
trained on user-devices, without disclosing the local data or model parameters
to the server. We propose a federation mechanism for problems with a natural
similarity metric between labels, which commonly appear in natural language
understanding (NLU) tasks. To learn the global model, the objective is to
minimize the optimal transport cost of the global model's predictions from the
confident sum of soft-targets assigned by local models. The confidence score of
a model (a model-weighting scheme) is defined as the L2 distance of the model's
prediction from its probability bias. The method improves the global model's
performance over the baseline on three NLU tasks with intrinsic label-space
semantics, i.e., fine-grained sentiment analysis, emotion recognition in
conversation, and natural language inference. We make our code public at
https://github.com/declare-lab/sinkhorn-loss.
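The sketch below is an illustrative, minimal reading of the objective described above, not the authors' released implementation: local soft-targets are combined with confidence weights (the L2 distance of each prediction from its probability bias), and the global model's prediction is compared to that weighted target via an entropy-regularized optimal transport (Sinkhorn) cost. The label-similarity cost matrix, the bias estimates, the regularization strength, and all numbers are assumptions for illustration only.
```python
import numpy as np

def sinkhorn_cost(p, q, C, reg=0.1, n_iters=200):
    """Entropy-regularized optimal transport cost between distributions p and q
    under ground cost matrix C, computed via Sinkhorn scaling iterations."""
    K = np.exp(-C / reg)                  # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iters):              # alternate scaling updates
        v = q / (K.T @ u)
        u = p / (K @ v)
    T = np.diag(u) @ K @ np.diag(v)       # transport plan
    return np.sum(T * C)

def confidence(pred, bias):
    """Confidence of a local model: L2 distance of its prediction from its
    probability bias (here assumed to be a label-marginal estimate)."""
    return np.linalg.norm(pred - bias)

def confident_target(local_preds, local_biases):
    """Confidence-weighted sum of local soft-targets, renormalized."""
    w = np.array([confidence(p, b) for p, b in zip(local_preds, local_biases)])
    w = w / w.sum()
    target = sum(wi * pi for wi, pi in zip(w, local_preds))
    return target / target.sum()

# Toy usage with 3 labels and 2 local models (illustrative numbers only).
C = np.array([[0.0, 0.5, 1.0],            # assumed label-similarity cost matrix
              [0.5, 0.0, 0.5],
              [1.0, 0.5, 0.0]])
local_preds  = [np.array([0.7, 0.2, 0.1]), np.array([0.2, 0.5, 0.3])]
local_biases = [np.array([0.4, 0.3, 0.3]), np.array([0.3, 0.4, 0.3])]
global_pred  = np.array([0.5, 0.3, 0.2])   # current global model output

q = confident_target(local_preds, local_biases)
loss = sinkhorn_cost(global_pred, q, C)    # quantity to minimize w.r.t. the global model
print(round(loss, 4))
```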
Related papers
- Tunable Soft Prompts are Messengers in Federated Learning [55.924749085481544]
Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources.
The lack of model privacy protection in FL becomes a challenge that cannot be neglected.
We propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
arXiv Detail & Related papers (2023-11-12T11:01:10Z) - Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, FedCSD, a class-prototype similarity distillation method in a federated framework that aligns the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z) - Federated Select: A Primitive for Communication- and Memory-Efficient
Federated Learning [4.873569522869751]
Federated learning (FL) is a framework for machine learning across heterogeneous client devices.
We propose a more general procedure in which clients "select" what values are sent to them.
This allows clients to operate on smaller, data-dependent slices.
arXiv Detail & Related papers (2022-08-19T16:26:03Z) - Federated Split GANs [12.007429155505767]
We propose an alternative approach that trains ML models on users' devices themselves.
We focus on GANs (generative adversarial networks) and leverage their inherent privacy-preserving attribute.
Our system preserves data privacy, keeps training time short, and yields the same accuracy as model training on unconstrained devices.
arXiv Detail & Related papers (2022-07-04T23:53:47Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - FedBE: Making Bayesian Model Ensemble Applicable to Federated Learning [23.726336635748783]
Federated learning aims to collaboratively train a strong global model by accessing users' locally trained models but not their own data.
A crucial step is therefore to aggregate local models into a global model, which has been shown to be challenging when users have non-i.i.d. data.
We propose a novel aggregation algorithm named FedBE, which takes a Bayesian inference perspective by sampling higher-quality global models.
arXiv Detail & Related papers (2020-09-04T01:18:25Z) - Information-Theoretic Bounds on the Generalization Error and Privacy
Leakage in Federated Learning [96.38757904624208]
Machine learning algorithms on mobile networks can be classified into three different categories.
The main objective of this work is to provide an information-theoretic framework for all of the aforementioned learning paradigms.
arXiv Detail & Related papers (2020-05-05T21:23:45Z) - Parameter Space Factorization for Zero-Shot Learning across Tasks and
Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields comparable or better results than state-of-the-art, zero-shot cross-lingual transfer methods.
arXiv Detail & Related papers (2020-01-30T16:58:56Z) - Think Locally, Act Globally: Federated Learning with Local and Global
Representations [92.68484710504666]
Federated learning is a method of training models on private data distributed over multiple devices.
We propose a new federated learning algorithm that jointly learns compact local representations on each device.
We also evaluate on the task of personalized mood prediction from real-world mobile data where privacy is key.
arXiv Detail & Related papers (2020-01-06T12:40:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.