A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond
- URL: http://arxiv.org/abs/2501.16117v1
- Date: Mon, 27 Jan 2025 15:07:02 GMT
- Title: A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond
- Authors: Yipeng Li, Xinchen Lyu, Zhenyu Liu,
- Abstract summary: We aim to provide a unified convergence analysis for permutation-based Gradient Descent (SGD)
We categorize existing permutation-based SGD algorithms into four categories: Arbitrary Permutations, Independent Permutations, One Permutation and Dependent Permutations.
We develop a unified framework for permutation-based SGD with arbitrary permutations of examples, incorporating all the aforementioned representative algorithms.
- Score: 7.325166704015515
- License:
- Abstract: We aim to provide a unified convergence analysis for permutation-based Stochastic Gradient Descent (SGD), where data examples are permuted before each epoch. By examining the relations among permutations, we categorize existing permutation-based SGD algorithms into four categories: Arbitrary Permutations, Independent Permutations (including Random Reshuffling), One Permutation (including Incremental Gradient, Shuffle One and Nice Permutation) and Dependent Permutations (including GraBs Lu et al., 2022; Cooper et al., 2023). Existing unified analyses failed to encompass the Dependent Permutations category due to the inter-epoch dependencies in its permutations. In this work, we propose a general assumption that captures the inter-epoch permutation dependencies. Using the general assumption, we develop a unified framework for permutation-based SGD with arbitrary permutations of examples, incorporating all the aforementioned representative algorithms. Furthermore, we adapt our framework on example ordering in SGD for client ordering in Federated Learning (FL). Specifically, we develop a unified framework for regularized-participation FL with arbitrary permutations of clients.
Related papers
- Random sampling of permutations through quantum circuits [0.0]
We introduce classical algorithms for random sampling of permutations, drawing inspiration from the Steinhaus-Johnson-Trotter algorithm.
We develop a quantum analogue of these classical algorithms using a quantum circuit model for random sampling of permutations for $n$-qubit systems.
arXiv Detail & Related papers (2024-09-04T18:19:30Z) - Permutation Superposition Oracles for Quantum Query Lower Bounds [14.957648729581502]
We show how to use the resulting oracle simulation to bound the success probability of an algorithm for any predicate on input-output pairs.
We show that the one-round sponge construction is unconditionally preimage resistant in the random permutation model.
arXiv Detail & Related papers (2024-07-12T19:27:13Z) - Compositional Generalization without Trees using Multiset Tagging and
Latent Permutations [121.37328648951993]
We phrase semantic parsing as a two-step process: we first tag each input token with a multiset of output tokens.
Then we arrange the tokens into an output sequence using a new way of parameterizing and predicting permutations.
Our model outperforms pretrained seq2seq models and prior work on realistic semantic parsing tasks.
arXiv Detail & Related papers (2023-05-26T14:09:35Z) - HiPerformer: Hierarchically Permutation-Equivariant Transformer for Time
Series Forecasting [56.95572957863576]
We propose a hierarchically permutation-equivariant model that considers both the relationship among components in the same group and the relationship among groups.
The experiments conducted on real-world data demonstrate that the proposed method outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2023-05-14T05:11:52Z) - Repeatable Random Permutation Set [9.327920030279586]
We propose random permutation set ($rm R2PS$) which takes the repetition of items into consideration.
Based on these properties, a decision support system application is simulated to show the effectiveness of $rm R2PS$.
arXiv Detail & Related papers (2022-11-03T09:44:43Z) - On Fitness Landscape Analysis of Permutation Problems: From Distance
Metrics to Mutation Operator Selection [0.0]
We explore the theory and expand upon the practice of fitness landscape analysis for optimization problems over the space of permutations.
We begin with a survey of the available distance metrics for permutations, and then use principal component analysis to classify these metrics.
Our implementations of the permutation metrics, permutation mutation operators, and associated evolutionary algorithm, are available in a pair of open source Java libraries.
arXiv Detail & Related papers (2022-08-23T20:46:49Z) - Probabilistic Permutation Graph Search: Black-Box Optimization for
Fairness in Ranking [53.94413894017409]
We present a novel way of representing permutation distributions, based on the notion of permutation graphs.
Similar to PL, our distribution representation, called PPG, can be used for black-box optimization of fairness.
arXiv Detail & Related papers (2022-04-28T20:38:34Z) - Sparse Quadratic Optimisation over the Stiefel Manifold with Application
to Permutation Synchronisation [71.27989298860481]
We address the non- optimisation problem of finding a matrix on the Stiefel manifold that maximises a quadratic objective function.
We propose a simple yet effective sparsity-promoting algorithm for finding the dominant eigenspace matrix.
arXiv Detail & Related papers (2021-09-30T19:17:35Z) - Batch Bayesian Optimization on Permutations using Acquisition Weighted
Kernels [86.11176756341114]
We introduce LAW, a new efficient batch acquisition method based on the determinantal point process.
We provide a regret analysis for our method to gain insight in its theoretical properties.
We evaluate the method on several standard problems involving permutations such as quadratic assignment.
arXiv Detail & Related papers (2021-02-26T10:15:57Z) - Regularizing Towards Permutation Invariance in Recurrent Models [26.36835670113303]
We show that RNNs can be regularized towards permutation invariance, and that this can result in compact models.
Existing solutions mostly suggest restricting the learning problem to hypothesis classes which are permutation invariant by design.
We show that our method outperforms other permutation invariant approaches on synthetic and real world datasets.
arXiv Detail & Related papers (2020-10-25T07:46:51Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the populationimation barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.