Related papers: Closing the Computational-Query Depth Gap in Parallel Stochastic Convex Optimization

Closing the Computational-Query Depth Gap in Parallel Stochastic Convex Optimization

URL: http://arxiv.org/abs/2406.07373v1
Date: Tue, 11 Jun 2024 15:41:48 GMT
Title: Closing the Computational-Query Depth Gap in Parallel Stochastic Convex Optimization
Authors: Arun Jambulapati, Aaron Sidford, Kevin Tian,
Abstract summary: We develop a new parallel algorithm for minimizing Lipschitz, convex functions with a subgradient oracle. Our result closes a gap between the best-known query depth and the best-known computational depth of parallel algorithms.
Score: 26.36906884097317
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We develop a new parallel algorithm for minimizing Lipschitz, convex functions with a stochastic subgradient oracle. The total number of queries made and the query depth, i.e., the number of parallel rounds of queries, match the prior state-of-the-art, [CJJLLST23], while improving upon the computational depth by a polynomial factor for sufficiently small accuracy. When combined with previous state-of-the-art methods our result closes a gap between the best-known query depth and the best-known computational depth of parallel algorithms. Our method starts with a ball acceleration framework of previous parallel methods, i.e., [CJJJLST20, ACJJS21], which reduce the problem to minimizing a regularized Gaussian convolution of the function constrained to Euclidean balls. By developing and leveraging new stability properties of the Hessian of this induced function, we depart from prior parallel algorithms and reduce these ball-constrained optimization problems to stochastic unconstrained quadratic minimization problems. Although we are unable to prove concentration of the asymmetric matrices that we use to approximate this Hessian, we nevertheless develop an efficient parallel method for solving these quadratics. Interestingly, our algorithms can be improved using fast matrix multiplication and use nearly-linear work if the matrix multiplication exponent is 2.

Related papers

Inertial Quadratic Majorization Minimization with Application to Kernel Regularized Learning [1.0282274843007797]
We introduce the Quadratic Majorization Minimization with Extrapolation (QMME) framework and establish its sequential convergence properties.<n>To demonstrate practical advantages, we apply QMME to large-scale kernel regularized learning problems.
arXiv Detail & Related papers (2025-07-06T05:17:28Z)
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties. We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
Fast Projected Newton-like Method for Precision Matrix Estimation under Total Positivity [15.023842222803058]
Current algorithms are designed using the block coordinate descent method or the proximal point algorithm. We propose a novel algorithm based on the two-metric projection method, incorporating a carefully designed search direction and variable partitioning scheme. Experimental results on synthetic and real-world datasets demonstrate that our proposed algorithm provides a significant improvement in computational efficiency compared to the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-03T14:39:10Z)
Continuation Newton methods with deflation techniques for global optimization problems [3.705839280172101]
A global minimum point of an optimization problem is of interest in engineering. In this article, we consider a new memetic algorithm for this nonlinear largescale problem. According to our numerical experiments, new algorithm works well for unconstrained unconstrained problems.
arXiv Detail & Related papers (2021-07-29T09:53:49Z)
Hybrid Trilinear and Bilinear Programming for Aligning Partially Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms which can align partially overlapping point sets are invariant to the corresponding corresponding RPM algorithm. We first show that the objective is a cubic bound function. We then utilize the convex envelopes of trilinear and bilinear monomial transformations to derive its lower bound. We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z)
Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-regular regularizers with convergence guarantees. Our results demonstrate high gain compared to the full problem solver for both block-coordinates or a gradient solver.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function. We consider a randomized approximation of the projected gradient descent algorithm. Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems. We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms. Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.