Accelerated Doubly Stochastic Gradient Algorithm for Large-scale
Empirical Risk Minimization
- URL: http://arxiv.org/abs/2304.11665v1
- Date: Sun, 23 Apr 2023 14:21:29 GMT
- Title: Accelerated Doubly Stochastic Gradient Algorithm for Large-scale
Empirical Risk Minimization
- Authors: Zebang Shen, Hui Qian, Tongzhou Mu, Chao Zhang
- Abstract summary: We propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large-scale empirical risk minimization problems for learning tasks.
While enjoying a provably superior convergence rate, in each iteration the algorithm only accesses a mini-batch of samples and updates a small block of variable coordinates.
- Score: 23.271890743506514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, algorithms with fast convergence, small memory footprints, and low
per-iteration complexity are particularly favorable for artificial intelligence
applications. In this paper, we propose a doubly stochastic algorithm with a
novel accelerating multi-momentum technique to solve the large-scale empirical
risk minimization problem for learning tasks. While enjoying a provably
superior convergence rate, in each iteration the algorithm only accesses a
mini-batch of samples and meanwhile updates a small block of variable
coordinates, which substantially reduces the amount of memory reference when
both a massive sample size and ultra-high dimensionality are involved.
Empirical studies on huge-scale datasets are conducted to illustrate the
efficiency of our method in practice.
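The paper's full method (including its multi-momentum acceleration) is not given in the abstract; the following is a minimal sketch of the core doubly stochastic idea only, applied to a least-squares ERM objective. The function and variable names are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def doubly_stochastic_step(w, X, y, batch, block, lr):
    """One doubly stochastic update for least-squares ERM:
    the gradient is estimated on a mini-batch of samples (`batch`)
    and applied only to a small block of coordinates (`block`)."""
    resid = X[batch] @ w - y[batch]                    # residuals on the mini-batch only
    grad_block = X[batch][:, block].T @ resid / len(batch)
    w = w.copy()
    w[block] -= lr * grad_block                        # touch just |block| coordinates
    return w

# Toy run: minimize (1/2n) ||Xw - y||^2 with random batches and blocks.
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true
w = np.zeros(d)
for _ in range(2000):
    batch = rng.choice(n, size=20, replace=False)
    block = rng.choice(d, size=5, replace=False)
    w = doubly_stochastic_step(w, X, y, batch, block, lr=0.05)
```

Each iteration reads only 20 rows of the data and writes only 5 coordinates of the iterate, which is the memory-reference saving the abstract describes for the massive-sample, high-dimensional regime.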
Related papers
- Robust Clustering on High-Dimensional Data with Stochastic Quantization [0.0]
This paper addresses the limitations of conventional vector quantization algorithms.
It investigates Stochastic Quantization (SQ) as an alternative for high-dimensional computation.
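The summary does not specify the SQ update itself; as a hedged illustration, here is a one-sample, k-means-style center update in the same online-quantization spirit. The `sq_update` name and its learning rate are assumptions, not the paper's method.

```python
import numpy as np

def sq_update(centers, x, lr):
    """One stochastic quantization-style step: move the center nearest
    to the sample `x` a fraction `lr` of the way toward it."""
    j = int(np.argmin(np.linalg.norm(centers - x, axis=1)))
    centers = centers.copy()
    centers[j] += lr * (x - centers[j])
    return centers

# Example: the center at the origin is closest to the sample, so only it moves.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
new = sq_update(centers, np.array([1.0, 1.0]), lr=0.5)
```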
arXiv Detail & Related papers (2024-09-03T17:13:55Z) - Randomized Dimension Reduction with Statistical Guarantees [0.27195102129095]
This thesis explores some of such algorithms for fast execution and efficient data utilization.
We focus on learning algorithms with various incorporations of data augmentation that provably improve generalization and distributional robustness.
Specifically, Chapter 4 presents a sample complexity analysis for data augmentation consistency regularization.
arXiv Detail & Related papers (2023-10-03T02:01:39Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
The second strategy leverages a novel Re-Ranking technique, which has a lower time upper-bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n.
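The re-ranking method itself is not detailed in the summary, but the O(n^2) to O(kn) memory claim comes from storing only each point's k nearest neighbors rather than the full pairwise-distance matrix. A minimal sketch of that storage pattern (names here are illustrative):

```python
import numpy as np

def knn_sparse_distances(feats, k):
    """Store only each point's k nearest neighbors: O(k n) memory
    instead of the O(n^2) full pairwise-distance matrix."""
    n = len(feats)
    idx = np.empty((n, k), dtype=int)
    dist = np.empty((n, k))
    for i in range(n):
        d = np.linalg.norm(feats - feats[i], axis=1)
        d[i] = np.inf                           # exclude the point itself
        nn = np.argpartition(d, k)[:k]          # k smallest distances, unordered
        order = np.argsort(d[nn])
        idx[i], dist[i] = nn[order], d[nn][order]
    return idx, dist

# Example: three 1-D points; each keeps only its single nearest neighbor.
feats = np.array([[0.0], [1.0], [10.0]])
idx, dist = knn_sparse_distances(feats, k=1)
```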
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - Learning Large Scale Sparse Models [6.428186644949941]
We consider learning sparse models in large scale settings, where the number of samples and the feature dimension can grow as large as millions or billions.
We propose to learn sparse models such as the Lasso in an online manner, where in each iteration only one randomly chosen sample is revealed to update the sparse gradient.
Thereby, the memory cost is independent of the sample size, and gradient evaluation for one sample is efficient.
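The per-sample updates and sample-size-independent memory described above match the shape of proximal SGD with soft-thresholding; the sketch below assumes that pattern with a squared loss (the names and loss choice are illustrative, not necessarily the paper's exact scheme).

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, the Lasso penalty."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def online_lasso_step(w, x, y, lr, lam):
    """One proximal-SGD step on a single sample (x, y): squared-loss
    gradient, then soft-thresholding. Memory is O(d), independent of
    how many samples the stream contains."""
    grad = (x @ w - y) * x
    return soft_threshold(w - lr * grad, lr * lam)

# Stream of single samples drawn from a sparse ground truth.
rng = np.random.default_rng(1)
d = 20
w_true = np.zeros(d)
w_true[:3] = [2.0, -3.0, 1.5]
w = np.zeros(d)
for _ in range(5000):
    x = rng.standard_normal(d)
    w = online_lasso_step(w, x, x @ w_true, lr=0.01, lam=0.1)
```

The soft-thresholding step is what keeps the iterate sparse: coordinates whose magnitude falls below lr * lam are set exactly to zero after every update.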
arXiv Detail & Related papers (2023-01-26T06:29:49Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance-reduction techniques.
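The CGM update itself is not given in the summary; for intuition, here is a generic conditional-gradient (Frank-Wolfe) step on an l1-norm ball, where the paper's method would replace the exact gradient with the one-sample SAG estimator. The constraint set and step-size rule below are assumptions for illustration.

```python
import numpy as np

def cgm_step(w, grad, t, radius=1.0):
    """One conditional-gradient step on the l1 ball of the given radius:
    the linear minimization oracle returns a signed vertex, and the new
    iterate is a convex combination with the classic step 2 / (t + 2)."""
    i = int(np.argmax(np.abs(grad)))
    v = np.zeros_like(w)
    v[i] = -radius * np.sign(grad[i])      # vertex minimizing <grad, v>
    gamma = 2.0 / (t + 2.0)
    return (1.0 - gamma) * w + gamma * v

# The first step from the origin moves fully to a vertex (gamma = 1).
out = cgm_step(np.zeros(2), np.array([1.0, -2.0]), t=0)
```

Because every iterate is a convex combination of feasible points, the method stays inside the constraint set without any projection step.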
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a stochastic hard thresholding algorithm for AUC maximization in distributed classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
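The summary does not give the operator, but hard thresholding itself is standard: keep the s largest-magnitude coordinates and zero out the rest. A minimal sketch (the AUC objective the paper pairs it with is omitted here):

```python
import numpy as np

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero the rest:
    the projection onto the set of s-sparse vectors."""
    out = np.zeros_like(v)
    keep = np.argpartition(np.abs(v), -s)[-s:]
    out[keep] = v[keep]
    return out

# Keeps -5.0 and 3.0, zeroes the two smaller-magnitude entries.
sparse = hard_threshold(np.array([3.0, -5.0, 1.0, 2.0]), s=2)
```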
arXiv Detail & Related papers (2020-11-04T16:49:29Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - SONIA: A Symmetric Blockwise Truncated Optimization Algorithm [2.9923891863939938]
This work presents a new algorithm for empirical risk minimization.
The algorithm bridges the gap between first- and second-order search methods by computing a second-order search-type update in one subspace, coupled with a scaled steepest descent step in the orthogonal complement.
arXiv Detail & Related papers (2020-06-06T19:28:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.