Distributionally Robust Federated Averaging
- URL: http://arxiv.org/abs/2102.12660v1
- Date: Thu, 25 Feb 2021 03:32:09 GMT
- Title: Distributionally Robust Federated Averaging
- Authors: Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi
- Abstract summary: We present communication-efficient distributed algorithms for distributionally robust federated learning via periodic averaging with adaptive sampling.
We give corroborating experimental evidence for our theoretical results in federated learning settings.
- Score: 19.875176871167966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study communication-efficient distributed algorithms for
distributionally robust federated learning via periodic averaging with adaptive
sampling. In contrast to standard empirical risk minimization, due to the
minimax structure of the underlying optimization problem, a key difficulty
arises from the fact that the global parameter that controls the mixture of
local losses can only be updated infrequently on the global stage. To
compensate for this, we propose a Distributionally Robust Federated Averaging
(DRFA) algorithm that employs a novel snapshotting scheme to approximate the
accumulation of historical gradients of the mixing parameter. We analyze the
convergence rate of DRFA in both convex-linear and nonconvex-linear settings.
We also generalize the proposed idea to objectives with regularization on the
mixture parameter and propose a proximal variant, dubbed DRFA-Prox, with
provable convergence rates. We also analyze an alternative optimization method
for regularized cases in strongly-convex-strongly-concave and non-convex (under
PL condition)-strongly-concave settings. To the best of our knowledge, this
paper is the first to solve distributionally robust federated learning with
reduced communication, and to analyze the efficiency of local descent methods
on distributed minimax problems. We give corroborating experimental evidence
for our theoretical results in federated learning settings.
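The mechanics described in the abstract can be illustrated with a minimal Python sketch. It assumes the usual agnostic objective min_w max_{lambda in simplex} sum_i lambda_i f_i(w), toy quadratic client losses f_i(w) = 0.5 * ||w - c_i||^2, and exact local gradients. The function names, step sizes, and the exact form of the mixing-parameter update are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = k[u - (css - 1.0) / k > 0][-1]
    theta = (css[rho - 1] - 1.0) / rho
    return np.maximum(v - theta, 0.0)


def drfa_sketch(centers, rounds=200, tau=5, m=2, eta=0.1, gamma=0.05, seed=0):
    """Simplified DRFA-style loop on toy losses f_i(w) = 0.5 * ||w - c_i||^2.

    centers: (n_clients, dim) array of per-client optima c_i (hypothetical toy data).
    tau: number of local steps between communication rounds.
    m: number of clients sampled per round (must not exceed n_clients).
    """
    rng = np.random.default_rng(seed)
    n, d = centers.shape
    w = np.zeros(d)
    lam = np.full(n, 1.0 / n)  # mixing parameter over clients

    for _ in range(rounds):
        # Adaptive sampling: clients are drawn according to the current mixture.
        sampled = rng.choice(n, size=m, replace=True, p=lam / lam.sum())
        # Snapshot index chosen uniformly among the tau local steps.
        t_snap = rng.integers(1, tau + 1)

        finals, snaps = [], []
        for i in sampled:
            w_i = w.copy()
            for t in range(1, tau + 1):
                w_i -= eta * (w_i - centers[i])  # local gradient step
                if t == t_snap:
                    snap = w_i.copy()
            finals.append(w_i)
            snaps.append(snap)

        # Periodic averaging of the local models and of their snapshots.
        w = np.mean(finals, axis=0)
        w_snap = np.mean(snaps, axis=0)

        # Infrequent mixture update: losses at the snapshot stand in for the
        # gradients that would have been accumulated over the tau local steps.
        probe = rng.choice(n, size=m, replace=False)
        v = np.zeros(n)
        v[probe] = (n / m) * 0.5 * np.sum((w_snap - centers[probe]) ** 2, axis=1)
        lam = project_simplex(lam + tau * gamma * v)

    return w, lam


if __name__ == "__main__":
    toy_centers = np.array([[0.0, 0.0], [4.0, 0.0]])
    w, lam = drfa_sketch(toy_centers)
    print("model:", w, "mixture:", lam)
```

The single snapshot per round is what keeps communication low in this sketch: the mixture is updated once every tau local steps, with the step scaled by tau to approximate the missing per-step updates.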
Related papers
- PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback [106.63518036538163]
We present a novel unified bilevel optimization-based framework, PARL, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning.
Our framework addresses these concerns by explicitly parameterizing the distribution of the upper alignment objective (reward design) by the lower optimal variable.
Our empirical results substantiate that the proposed PARL can address the alignment concerns in RL by showing significant improvements.
arXiv Detail & Related papers (2023-08-03T18:03:44Z)
- Distributed Distributionally Robust Optimization with Non-Convex Objectives [24.64654924173679]
An asynchronous distributed algorithm named Asynchronous Single-looP alternatIve gRadient projEction is proposed.
A new uncertainty set, i.e., the constrained D-norm uncertainty set, is developed to leverage the prior distribution and flexibly control the degree of robustness.
Empirical studies on real-world datasets demonstrate that the proposed method can not only achieve fast convergence, but also remain robust against data heterogeneity as well as malicious attacks.
arXiv Detail & Related papers (2022-10-14T07:39:13Z)
- Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization (DRO).
arXiv Detail & Related papers (2022-10-08T08:26:22Z)
- Depersonalized Federated Learning: Tackling Statistical Heterogeneity by Alternating Stochastic Gradient Descent [6.394263208820851]
Federated learning (FL) enables devices to train a common machine learning (ML) model for intelligent inference without data sharing.
Raw data held by the various participants are always non-identically distributed.
We propose a new FL method that tackles this statistical heterogeneity through depersonalization based on alternating stochastic gradient descent.
arXiv Detail & Related papers (2022-10-07T10:30:39Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time.
We show that the communication complexity is better than that of non-local methods when the heterogeneity of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z)
- Distributionally Robust Fair Principal Components via Geodesic Descents [16.440434996206623]
In consequential domains such as college admission, healthcare and credit approval, it is imperative to take into account emerging criteria such as the fairness and the robustness of the learned projection.
We propose a distributionally robust optimization problem for principal component analysis which internalizes a fairness criterion in the objective function.
Our experimental results on real-world datasets show the merits of our proposed method over state-of-the-art baselines.
arXiv Detail & Related papers (2022-02-07T11:08:13Z)
- Distributed and Stochastic Optimization Methods with Gradient Compression and Local Steps [0.0]
We propose theoretical frameworks for the analysis of distributed methods with error compensation and local updates.
We develop more than 20 new optimization methods, including the first linearly converging error-compensated and first distributed Local-SGD methods.
arXiv Detail & Related papers (2021-12-20T16:12:54Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory [11.840747467007963]
We study problem-dependent rates that scale near-optimally with the variance, the effective loss, or the gradient norms evaluated at the "best hypothesis."
We introduce a principled framework dubbed "uniform localized convergence."
We show that our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches.
arXiv Detail & Related papers (2020-11-12T04:07:29Z)
- Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment [52.02794488304448]
We propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows.
We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.
arXiv Detail & Related papers (2020-03-26T22:10:04Z)