Convergence Analysis of Sequential Split Learning on Heterogeneous Data
- URL: http://arxiv.org/abs/2302.01633v3
- Date: Sat, 23 Dec 2023 03:45:12 GMT
- Title: Convergence Analysis of Sequential Split Learning on Heterogeneous Data
- Authors: Yipeng Li and Xinchen Lyu
- Abstract summary: Split Learning (SL) and Federated Learning (FL) are two popular paradigms of distributed machine learning.
We derive convergence guarantees of Sequential SL (SSL) for strongly/general/non-convex objectives on heterogeneous data.
The guarantees suggest that SSL outperforms Federated Averaging (FedAvg) on heterogeneous data; we validate this counterintuitive result empirically on extremely heterogeneous data.
- Score: 6.937859054591121
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) and Split Learning (SL) are two popular paradigms of
distributed machine learning. By offloading the computation-intensive portions
to the server, SL is promising for deep model training on resource-constrained
devices, yet it still lacks a rigorous convergence analysis. In this paper, we
derive the convergence guarantees of Sequential SL (SSL, the vanilla case of SL
that conducts the model training in sequence) for strongly/general/non-convex
objectives on heterogeneous data. Notably, the derived guarantees suggest that
SSL is better than Federated Averaging (FedAvg, the most popular algorithm in
FL) on heterogeneous data. We validate the counterintuitive analysis result
empirically on extremely heterogeneous data.
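As a concrete illustration of the procedure analyzed above, the following is a minimal sketch of Sequential Split Learning: the model is split into a client-side part and a server-side part, and clients are visited one after another rather than trained in parallel and averaged as in FedAvg. The layer sizes, optimizer settings, and toy data are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal sketch of Sequential Split Learning (SSL); all hyperparameters and data
# below are assumptions for illustration only.
import torch
import torch.nn as nn

client_net = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # client-side (bottom) layers
server_net = nn.Sequential(nn.Linear(64, 10))               # server-side (top) layers
opt_client = torch.optim.SGD(client_net.parameters(), lr=0.1)
opt_server = torch.optim.SGD(server_net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Heterogeneous client datasets (random placeholders for local data).
client_data = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(3)]

for _round in range(2):                      # training rounds
    for x, y in client_data:                 # clients train *sequentially*
        opt_client.zero_grad()
        opt_server.zero_grad()
        smashed = client_net(x)              # client forward pass up to the cut layer
        loss = loss_fn(server_net(smashed), y)   # server completes the forward pass
        loss.backward()                      # gradients flow back across the cut layer
        opt_server.step()
        opt_client.step()
        # The updated client-side model is then handed to the next client, whereas
        # FedAvg would train clients in parallel and average their models.
```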
Related papers
- Double Machine Learning for Adaptive Causal Representation in High-Dimensional Data [14.25379577156518]
Support points sample splitting (SPSS) is employed for efficient double machine learning (DML) in causal inference.
The support points are selected as optimal representative points of the full raw data and then used for sample splitting, rather than drawing a random subsample.
They offer the best representation of the full dataset, whereas traditional random data splitting is unlikely to preserve the unit structural information of the underlying distribution.
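For background on the construction referenced above, support points are commonly defined as the point set minimizing the energy distance to the empirical distribution of the full data; the sketch below states that standard definition with assumed notation, and the paper's exact construction may differ.

```latex
% Support points x_1,...,x_n for a dataset y_1,...,y_N (a sketch; notation assumed):
% the point set minimizing the energy distance to the empirical distribution.
\{x_i\}_{i=1}^{n} \;=\; \arg\min_{x_1,\dots,x_n}\;
\frac{2}{nN}\sum_{i=1}^{n}\sum_{j=1}^{N}\lVert x_i - y_j\rVert_2
\;-\;\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}\lVert x_i - x_j\rVert_2 .
```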
arXiv Detail & Related papers (2024-11-22T01:54:53Z)
- On Pretraining Data Diversity for Self-Supervised Learning [57.91495006862553]
We explore the impact of training with more diverse datasets on the performance of self-supervised learning (SSL) under a fixed computational budget.
Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal.
arXiv Detail & Related papers (2024-03-20T17:59:58Z)
- Understanding Representation Learnability of Nonlinear Self-Supervised Learning [13.965135660149212]
Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks.
Our paper is the first to accurately analyze the learning results of the nonlinear SSL model.
arXiv Detail & Related papers (2024-01-06T13:23:26Z)
- Can semi-supervised learning use all the data effectively? A lower bound perspective [58.71657561857055]
We show that semi-supervised learning algorithms can leverage unlabeled data to improve over the labeled sample complexity of supervised learning algorithms.
Our work suggests that, while proving performance gains for SSL algorithms is possible, it requires careful tracking of constants.
arXiv Detail & Related papers (2023-11-30T13:48:50Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both theoretical and experimental perspectives.
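The summary above stays high-level; as a generic illustration (not the aggregation operator proposed in that paper), the sketch below weights client models by the amount of local computation they completed before averaging.

```python
# Generic sketch of weighted aggregation under computational heterogeneity: clients
# that ran different numbers of local steps are re-weighted before averaging.
# The weighting rule is an illustrative assumption.
from typing import Dict, List
import torch

def aggregate(states: List[Dict[str, torch.Tensor]],
              local_steps: List[int]) -> Dict[str, torch.Tensor]:
    total = float(sum(local_steps))
    weights = [s / total for s in local_steps]          # more local work -> larger weight
    agg = {name: torch.zeros_like(param) for name, param in states[0].items()}
    for w, state in zip(weights, states):
        for name, param in state.items():
            agg[name] += w * param
    return agg

# Example: three clients with unequal compute budgets.
clients = [{"w": torch.randn(4, 4)} for _ in range(3)]
global_state = aggregate(clients, local_steps=[1, 5, 10])
```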
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Federated Latent Class Regression for Hierarchical Data [5.110894308882439]
Federated Learning (FL) allows a number of agents to participate in training a global machine learning model without disclosing locally stored data.
We propose a novel probabilistic model, Hierarchical Latent Class Regression (HLCR), and its extension to Federated Learning, FEDHLCR.
Our inference algorithm, derived from Bayesian theory, provides strong convergence guarantees and good robustness to overfitting. Experimental results show that FEDHLCR offers fast convergence even on non-IID datasets.
arXiv Detail & Related papers (2022-06-22T00:33:04Z)
- Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem.
We propose an innovative inconsistency-based virtual adversarial algorithm to further investigate SSL-AL's potential superiority.
Two real-world case studies illustrate the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv Detail & Related papers (2022-06-07T13:28:43Z)
- Server-Side Local Gradient Averaging and Learning Rate Acceleration for Scalable Split Learning [82.06357027523262]
Federated learning (FL) and split learning (SL) are two spearheads, each with its own pros and cons, suited to many user clients and to large models, respectively.
In this work, we first identify the fundamental bottlenecks of SL, and thereby propose a scalable SL framework, coined SGLR.
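The exact rules of the SGLR framework are not given in the summary above; the sketch below only illustrates the general idea of averaging server-side gradients across clients in a single update with a scaled ("accelerated") learning rate, with the scaling factor and shapes as assumptions rather than SGLR's specification.

```python
# Hedged sketch of server-side local gradient averaging in parallel split learning;
# all specific choices below are assumptions for illustration.
import torch
import torch.nn as nn

num_clients, base_lr = 4, 0.05
server_net = nn.Linear(64, 10)
opt = torch.optim.SGD(server_net.parameters(), lr=base_lr * num_clients)  # assumed scaling
loss_fn = nn.CrossEntropyLoss()

# Toy smashed activations and labels, one batch per client.
smashed = [torch.randn(8, 64) for _ in range(num_clients)]
labels = [torch.randint(0, 10, (8,)) for _ in range(num_clients)]

opt.zero_grad()
for acts, y in zip(smashed, labels):
    # Dividing by num_clients makes the accumulated gradient an average over clients.
    (loss_fn(server_net(acts), y) / num_clients).backward()
opt.step()
```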
arXiv Detail & Related papers (2021-12-11T08:33:25Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Semi-Supervised Empirical Risk Minimization: Using unlabeled data to improve prediction [4.860671253873579]
We present a general methodology for using unlabeled data to design semi-supervised learning (SSL) variants of the Empirical Risk Minimization (ERM) learning process.
We analyze the effectiveness of our SSL approach in improving prediction performance.
arXiv Detail & Related papers (2020-09-01T17:55:51Z) - Semi-supervised learning objectives as log-likelihoods in a generative
model of data curation [32.45282187405337]
We formulate SSL objectives as a log-likelihood in a generative model of data curation.
We give a proof-of-principle for Bayesian SSL on toy data.
arXiv Detail & Related papers (2020-08-13T13:50:27Z)
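To make the log-likelihood framing above concrete, here is a sketch under the common modeling assumption that an unlabeled point survives curation only if S hypothetical annotators, each drawing a label from the model's predictive distribution, all agree; the notation is assumed rather than taken from the paper.

```latex
% Probability that an unlabeled point x survives curation when S annotators,
% each drawing y ~ p_\theta(y | x), must all agree (assumed model):
P(\text{keep}\mid x) \;=\; \sum_{y} p_\theta(y\mid x)^{S},
\qquad
\log P(\text{keep}\mid x) \;=\; \log \sum_{y} p_\theta(y\mid x)^{S},
% which rewards low-entropy predictions on unlabeled data, so entropy-minimization-style
% SSL objectives appear as log-likelihood terms in the generative model of curation.
```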