Optimal Model Averaging: Towards Personalized Collaborative Learning
- URL: http://arxiv.org/abs/2110.12946v1
- Date: Mon, 25 Oct 2021 13:33:20 GMT
- Title: Optimal Model Averaging: Towards Personalized Collaborative Learning
- Authors: Felix Grimberg (1), Mary-Anne Hartley (1), Sai P. Karimireddy (1),
Martin Jaggi (1) ((1) EPFL)
- Abstract summary: In federated learning, differences in the data or objectives between the participating nodes motivate approaches to train a personalized machine learning model for each node.
One such approach is weighted averaging between a locally trained model and the global model.
We find that there is always some positive amount of model averaging that reduces the expected squared error compared to the local model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In federated learning, differences in the data or objectives between the
participating nodes motivate approaches to train a personalized machine
learning model for each node. One such approach is weighted averaging between a
locally trained model and the global model. In this theoretical work, we study
weighted model averaging for arbitrary scalar mean estimation problems under
minimal assumptions on the distributions. In a variant of the bias-variance
trade-off, we find that there is always some positive amount of model averaging
that reduces the expected squared error compared to the local model, provided
only that the local model has a non-zero variance. Further, we quantify the
(possibly negative) benefit of weighted model averaging as a function of the
weight used and the optimal weight. Taken together, this work formalizes an
approach to quantify the value of personalization in collaborative learning and
provides a framework for future research to test the findings in multivariate
parameter estimation and under a range of assumptions.
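The abstract's central claim can be made concrete with a small sketch. Under simplifying assumptions that are ours, not the paper's general setting (the local estimator is unbiased with variance var_local, the global estimator has bias b and variance var_global, and the two are independent), the weighted average (1 - alpha) * local + alpha * global has expected squared error MSE(alpha) = (1 - alpha)^2 * var_local + alpha^2 * (var_global + b^2). Its derivative at alpha = 0 is -2 * var_local, so any sufficiently small positive weight improves on the local model whenever var_local > 0, matching the abstract's statement; under these assumptions the minimizer is alpha* = var_local / (var_local + var_global + b^2).
```python
import numpy as np

# Illustrative sketch only (simplified assumptions, not the paper's notation):
# local estimate is unbiased with variance var_local; the global estimate has
# bias b and variance var_global; the two are independent.
# Weighted average: theta_hat(alpha) = (1 - alpha) * local + alpha * global
# MSE(alpha) = (1 - alpha)^2 * var_local + alpha^2 * (var_global + b^2)

def mse(alpha, var_local, var_global, bias):
    return (1 - alpha) ** 2 * var_local + alpha ** 2 * (var_global + bias ** 2)

def optimal_alpha(var_local, var_global, bias):
    # Minimizer of the quadratic above, valid only under these assumptions.
    return var_local / (var_local + var_global + bias ** 2)

if __name__ == "__main__":
    var_local, var_global, bias = 1.0, 0.2, 0.5   # toy numbers
    a_star = optimal_alpha(var_local, var_global, bias)
    print(f"alpha*          = {a_star:.3f}")
    print(f"MSE(local only) = {mse(0.0, var_local, var_global, bias):.3f}")
    print(f"MSE(alpha*)     = {mse(a_star, var_local, var_global, bias):.3f}")

    # Monte Carlo check of the closed form with Gaussian noise.
    rng = np.random.default_rng(0)
    theta = 0.0
    local = rng.normal(theta, np.sqrt(var_local), size=200_000)
    glob = rng.normal(theta + bias, np.sqrt(var_global), size=200_000)
    est = (1 - a_star) * local + a_star * glob
    print(f"MC MSE(alpha*)  = {np.mean((est - theta) ** 2):.3f}")
```
With the toy numbers above, the script reports alpha* ~ 0.69 and MSE(alpha*) ~ 0.31 versus 1.0 for the local model alone, and the Monte Carlo estimate agrees with the closed form.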
Related papers
- WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average [21.029085451757368]
Weight averaging methods aim to balance the generalization of ensembling with the inference speed of a single model.
We introduce WASH, a novel distributed method for training model ensembles for weight averaging that achieves state-of-the-art image classification accuracy.
arXiv Detail & Related papers (2024-05-27T09:02:57Z)
- Vanishing Variance Problem in Fully Decentralized Neural-Network Systems [0.8212195887472242]
Federated learning and gossip learning are emerging methodologies designed to mitigate data privacy concerns.
Our research introduces a variance-corrected model averaging algorithm.
Our simulation results demonstrate that our approach enables gossip learning to achieve convergence efficiency comparable to that of federated learning.
arXiv Detail & Related papers (2024-04-06T12:49:20Z)
- Aggregation Weighting of Federated Learning via Generalization Bound Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.
arXiv Detail & Related papers (2023-11-10T08:50:28Z)
- Distributed Personalized Empirical Risk Minimization [19.087524494290676]
This paper advocates a new paradigm, Personalized Empirical Risk Minimization (PERM), to facilitate learning from heterogeneous data sources.
We propose a distributed algorithm that replaces the standard model averaging with model shuffling to simultaneously optimize PERM objectives for all devices.
arXiv Detail & Related papers (2023-10-26T20:07:33Z)
- Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
- Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants.
Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
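For context on the entry "Aggregation Weighting of Federated Learning via Generalization Bound Estimation" above: the sample-proportion weighting it proposes to replace is the standard FedAvg-style rule w_k = n_k / sum_j n_j. The sketch below shows only that baseline rule; the paper's generalization-bound weights are not reproduced here, but swapping `weights` for any other normalized score vector illustrates the substitution.
```python
import numpy as np

# Minimal sketch of sample-proportion (FedAvg-style) aggregation.
# client_params: list of 1-D parameter vectors, one per client (toy data below).
# client_sizes:  number of local samples held by each client.
def aggregate(client_params, client_sizes):
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()              # w_k = n_k / sum_j n_j
    stacked = np.stack(client_params)     # shape: (num_clients, num_params)
    return weights @ stacked              # weighted average of parameters

# The related paper replaces `weights` above with normalized per-client
# generalization-bound estimates instead of sample proportions.
params = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.0, 1.0])]
sizes = [100, 50, 50]
print(aggregate(params, sizes))           # -> [1.25, 1.25]
```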
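Similarly, the entry "Ensemble Modeling for Multimodal Visual Action Recognition" trains per-modality models with a variant of focal loss for long-tailed data. As background only, this is the standard focal loss FL(p_t) = -(1 - p_t)^gamma * log(p_t); the tailored variant used in that paper is not reproduced here.
```python
import numpy as np

def focal_loss(probs, target, gamma=2.0):
    """Standard focal loss for a single example.

    probs:  1-D array of predicted class probabilities (sums to 1).
    target: index of the true class.
    gamma:  focusing parameter; gamma = 0 recovers plain cross-entropy.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

# Toy example: the loss of a confident correct prediction is down-weighted far
# more strongly than that of an uncertain one, which focuses training on hard
# (often tail-class) examples.
print(focal_loss(np.array([0.9, 0.05, 0.05]), target=0))   # ~0.001
print(focal_loss(np.array([0.3, 0.4, 0.3]), target=0))     # ~0.59
```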