Distributed Optimization over Block-Cyclic Data
- URL: http://arxiv.org/abs/2002.07454v1
- Date: Tue, 18 Feb 2020 09:47:15 GMT
- Title: Distributed Optimization over Block-Cyclic Data
- Authors: Yucheng Ding, Chaoyue Niu, Yikai Yan, Zhenzhe Zheng, Fan Wu, Guihai
Chen, Shaojie Tang, Rongfei Jia
- Abstract summary: We consider practical data characteristics underlying federated learning, where unbalanced and non-i.i.d. data from clients have a block-cyclic structure.
We propose two new distributed optimization algorithms called multi-model parallel SGD (MM-PSGD) and multi-chain parallel SGD (MC-PSGD).
Our algorithms significantly outperform the conventional federated averaging algorithm in terms of test accuracy, and also remain robust to variations in critical parameters.
- Score: 48.317899174302305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider practical data characteristics underlying federated learning,
where unbalanced and non-i.i.d. data from clients have a block-cyclic
structure: each cycle contains several blocks, and each client's training data
follow block-specific and non-i.i.d. distributions. Such a data structure would
introduce client and block biases during the collaborative training: the single
global model would be biased towards the client or block specific data. To
overcome the biases, we propose two new distributed optimization algorithms
called multi-model parallel SGD (MM-PSGD) and multi-chain parallel SGD
(MC-PSGD) with a convergence rate of $O(1/\sqrt{NT})$, achieving a linear
speedup with respect to the total number of clients. In particular, MM-PSGD
adopts the block-mixed training strategy, while MC-PSGD further adds the
block-separate training strategy. Both algorithms create a specific predictor
for each block by averaging and comparing the historical global models
generated in this block from different cycles. We extensively evaluate our
algorithms over the CIFAR-10 dataset. Evaluation results demonstrate that our
algorithms significantly outperform the conventional federated averaging
algorithm in terms of test accuracy, and also remain robust to variations in
critical parameters.
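To make the block-mixed strategy concrete, below is a minimal sketch of an MM-PSGD-style training loop, assuming model parameters are plain numpy vectors. The `client_data` and `local_sgd` interfaces are hypothetical stand-ins, and the per-block predictor is kept as a simple running average of the global models generated in that block, rather than the paper's full averaging-and-comparison procedure.

```python
import numpy as np

def average(weights):
    """Coordinate-wise average of a list of parameter vectors."""
    return np.mean(np.stack(weights), axis=0)

def mm_psgd(init_weights, client_data, local_sgd,
            num_cycles, blocks_per_cycle, rounds_per_block):
    """Sketch of MM-PSGD-style training over block-cyclic data.

    Hypothetical interfaces, for illustration only:
      client_data(cycle, block) -> list of per-client datasets for that block
      local_sgd(weights, data)  -> weights after local SGD steps on `data`
    """
    w = np.asarray(init_weights, dtype=float).copy()
    # Per-block running averages of the global models generated in that
    # block across cycles; these become the block-specific predictors.
    predictors = [np.zeros_like(w) for _ in range(blocks_per_cycle)]
    counts = [0] * blocks_per_cycle

    for cycle in range(num_cycles):
        for block in range(blocks_per_cycle):
            for _ in range(rounds_per_block):
                # Block-mixed training: every client runs local SGD on the
                # current block's non-i.i.d. data, then the server averages
                # the resulting models (a FedAvg-style synchronization).
                w = average([local_sgd(w.copy(), d)
                             for d in client_data(cycle, block)])
                # Fold this global model into the block's historical average.
                counts[block] += 1
                predictors[block] += (w - predictors[block]) / counts[block]

    # One predictor per block, each averaged over its own training history.
    return predictors
```

MC-PSGD would additionally maintain a separate chain per block trained only on that block's data and keep, for each block, the better of the block-mixed and block-separate predictors; that comparison step is omitted from this sketch.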
Related papers
- MOLA: Enhancing Industrial Process Monitoring Using Multi-Block Orthogonal Long Short-Term Memory Autoencoder [3.7028696448588487]
We introduce MOLA: a Multi-block Orthogonal Long short-term memory Autoencoder paradigm, to conduct accurate, reliable fault detection of industrial processes.
We propose a multi-block monitoring structure, which categorizes the process variables into multiple blocks by leveraging expert process knowledge.
We demonstrate the efficiency and effectiveness of our MOLA framework by applying it to the Tennessee Eastman Process.
arXiv Detail & Related papers (2024-10-10T00:49:43Z)
- Distributionally Robust Clustered Federated Learning: A Case Study in Healthcare [9.433126190164224]
We introduce a novel algorithm, which we term Cross-silo Robust Clustered Federated Learning (CS-RCFL).
We construct ambiguity sets around each client's empirical distribution that capture possible distribution shifts in the local data.
We then propose a model-agnostic integer fractional program to determine the optimal distributionally robust clustering of clients into coalitions.
arXiv Detail & Related papers (2024-10-09T16:25:01Z)
- A Federated Distributionally Robust Support Vector Machine with Mixture of Wasserstein Balls Ambiguity Set for Distributed Fault Diagnosis [3.662364375995991]
We study the problem of training a distributionally robust (DR) support vector machine (SVM) in a federated fashion over a network comprised of a central server and $G$ clients without sharing data.
We propose two distributed optimization algorithms for training the global FDR-SVM.
arXiv Detail & Related papers (2024-10-04T19:21:45Z)
- Distributed Collapsed Gibbs Sampler for Dirichlet Process Mixture Models in Federated Learning [0.22499166814992444]
This paper proposes DisCGS, a new distributed Markov Chain Monte Carlo (MCMC) inference method for Dirichlet Process Mixture Models (DPMMs) using sufficient statistics.
Our approach uses the collapsed Gibbs sampler and is specifically designed to work on distributed data across independent and heterogeneous machines.
For instance, with a dataset of 100K data points, the centralized algorithm requires approximately 12 hours to complete 100 iterations while our approach achieves the same number of iterations in just 3 minutes.
arXiv Detail & Related papers (2023-12-18T13:16:18Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
The second strategy leverages a novel Re-Ranking technique, which has a lower time-complexity upper bound and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$ (see the sketch below).
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
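The $O(n^2) \to O(kn)$ memory claim above follows a common pattern: instead of materializing the full pairwise-distance matrix, keep only each point's $k$ nearest neighbors. A generic sketch of that idea follows (an assumed illustration, not the paper's actual re-ranking code):

```python
import numpy as np

def topk_neighbors(features, k):
    """Keep only each point's k nearest neighbors: O(kn) memory
    instead of the O(n^2) full pairwise-distance matrix."""
    n = len(features)
    neighbors = np.empty((n, k), dtype=np.int64)
    distances = np.empty((n, k), dtype=float)
    for i in range(n):
        # Compute one distance row at a time to avoid O(n^2) storage.
        d = np.linalg.norm(features - features[i], axis=1)
        idx = np.argpartition(d, k + 1)[:k + 1]  # k+1 smallest, unordered
        idx = idx[idx != i][:k]                  # drop the point itself
        order = np.argsort(d[idx])               # sort the k survivors
        neighbors[i], distances[i] = idx[order], d[idx][order]
    return neighbors, distances
```

Only the $(n, k)$ neighbor and distance arrays persist, so memory stays $O(kn)$; each full distance row is discarded after use.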
- FedHB: Hierarchical Bayesian Federated Learning [11.936836827864095]
We propose a novel hierarchical Bayesian approach to Federated Learning (FL).
Our model reasonably describes the generative process of clients' local data via hierarchical Bayesian modeling.
We show that our block-coordinate FL algorithm converges to an optimum of the objective at the rate of $O(1/\sqrt{t})$.
arXiv Detail & Related papers (2023-05-08T18:21:41Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose FedGMM, a novel approach to Personalized Federated Learning (PFL) that utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a server.
We propose a novel FedC algorithm based on differential privacy techniques, referred to as DP-Fed, in which partial client participation and multiple local update steps are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation.
Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS).
In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z)
- Robust Trimmed k-means [70.88503833248159]
We propose Robust Trimmed k-means (RTKM) that simultaneously identifies outliers and clusters points.
We show RTKM performs competitively with other methods on single membership data with outliers and multi-membership data without outliers.
arXiv Detail & Related papers (2021-08-16T15:49:40Z)
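For context on the entry above, the classic trimmed k-means scheme that RTKM builds on alternates assignment, trimming of the farthest points as outliers, and centroid updates on the remaining points. The sketch below shows that generic scheme, not RTKM's actual single-/multi-membership updates:

```python
import numpy as np

def trimmed_kmeans(X, k, trim_frac=0.1, iters=50, seed=0):
    """Classic trimmed k-means: cluster while discarding the
    trim_frac farthest points as outliers in every iteration."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    keep = int(round((1 - trim_frac) * len(X)))
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        nearest = d[np.arange(len(X)), labels]
        # Trim: the farthest points are treated as outliers this round.
        inliers = np.argsort(nearest)[:keep]
        # Update each center from its inlier members only.
        for j in range(k):
            members = inliers[labels[inliers] == j]
            if len(members) > 0:
                centers[j] = X[members].mean(axis=0)
    outliers = np.setdiff1d(np.arange(len(X)), inliers)
    return labels, centers, outliers
```

RTKM's contribution per the abstract is identifying outliers and clustering simultaneously, including for multi-membership data; the fixed trimming fraction here is only the simplest stand-in for that behavior.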
This list is automatically generated from the titles and abstracts of the papers on this site.