Makespan Minimization in Split Learning: From Theory to Practice
- URL: http://arxiv.org/abs/2602.06693v1
- Date: Fri, 06 Feb 2026 13:22:44 GMT
- Title: Makespan Minimization in Split Learning: From Theory to Practice
- Authors: Robert Ganian, Fionn Mc Inerney, Dimitra Tsigkari,
- Abstract summary: Split learning is a solution for distributed machine learning with heterogeneous IoT devices.<n>We show how to minimize the training time by jointly devising the client-helper assignment and the schedule of tasks at the helpers.
- Score: 19.128619079473413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Split learning recently emerged as a solution for distributed machine learning with heterogeneous IoT devices, where clients can offload part of their training to computationally-powerful helpers. The core challenge in split learning is to minimize the training time by jointly devising the client-helper assignment and the schedule of tasks at the helpers. We first study the model where each helper has a memory cardinality constraint on how many clients it may be assigned, which represents the case of homogeneous tasks. Through complexity theory, we rule out exact polynomial-time algorithms and approximation schemes even for highly restricted instances of this problem. We complement these negative results with a non-trivial polynomial-time 5-approximation algorithm. Building on this, we then focus on the more general heterogeneous task setting considered by Tirana et al. [INFOCOM 2024], where helpers have memory capacity constraints and clients have variable memory costs. In this case, we prove that, unless P=NP, the problem cannot admit a polynomial-time approximation algorithm for any approximation factor. However, by adapting our aforementioned 5-approximation algorithm, we develop a novel heuristic for the heterogeneous task setting and show that it outperforms heuristics from prior works through extensive experiments.
Related papers
- Learning Intersections of Two Margin Halfspaces under Factorizable Distributions [56.51474048985742]
Learning intersections of halfspaces is a central problem in Computational Learning Theory.<n>Even for just two halfspaces, it remains a major open question whether learning is possible in time.<n>We introduce a novel algorithm that provably circumvents the CSQ hardness barrier.
arXiv Detail & Related papers (2025-11-13T00:28:24Z) - Single-loop Algorithms for Stochastic Non-convex Optimization with Weakly-Convex Constraints [49.76332265680669]
This paper examines a crucial subset of problems where both the objective and constraint functions are weakly convex.<n>Existing methods often face limitations, including slow convergence rates or reliance on double-loop designs.<n>We introduce a novel single-loop penalty-based algorithm to overcome these challenges.
arXiv Detail & Related papers (2025-04-21T17:15:48Z) - Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data [23.661713049508375]
We propose an algorithm that learns over a submanifold in the setting of a client.
We show that our proposed algorithm converges sub-ly to a neighborhood of a first-order optimal solution by using a novel analysis.
arXiv Detail & Related papers (2024-06-12T17:53:28Z) - DNCs Require More Planning Steps [7.837209773889032]
We investigate the effect of computational time and memory on generalization of implicit algorithmic solvers.
We show how the planning budget can drastically change the behavior of the learned algorithm.
arXiv Detail & Related papers (2024-06-04T10:31:03Z) - Limited Memory Online Gradient Descent for Kernelized Pairwise Learning
with Dynamic Averaging [18.843097436906618]
We introduce a lightweight OGD algorithm that does not require the independence of examples and generalizes to kernel pairwise learning.
Our algorithm builds the gradient based on a random example and a moving average representing the past data, which results in a sub-linear regret bound with a complexity of $O(T)$.
Several experiments with real-world datasets show that the complexity technique outperforms kernel and linear gradient in offline and online scenarios.
arXiv Detail & Related papers (2024-02-02T05:21:50Z) - Workflow Optimization for Parallel Split Learning [12.554265727169742]
Split learning (SL) has been proposed as a way to enable resource-constrained devices to train neural networks (NNs) and participate in federated learning (FL)
In parallel SL, multiple helpers can process model parts of one or more clients, thus, considerably reducing the maximum training time over all clients (makespan)
We propose a solution method based on the decomposition of the problem by leveraging its inherent symmetry, and a second one that is fully scalable.
arXiv Detail & Related papers (2024-02-01T14:16:10Z) - Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization in which data collected across different domains help improving the learning performance at each other task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.