Oblivious Stochastic Composite Optimization
- URL: http://arxiv.org/abs/2306.17470v2
- Date: Tue, 07 Oct 2025 13:02:14 GMT
- Title: Oblivious Stochastic Composite Optimization
- Authors: Clément Lezane, Alexandre d'Aspremont,
- Abstract summary: We show that our algorithms converge without any prior knowledge of the parameters of the problem. All three algorithms work without prior knowledge of the diameter of the feasible set, the Lipschitz constant, or the smoothness of the objective function. We extend our framework to relative scale and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
- Score: 47.48197617884748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In stochastic convex optimization problems, most existing adaptive methods rely on prior knowledge of the diameter bound $D$ when the smoothness or the Lipschitz constant is unknown. This often significantly affects performance, as only a rough approximation of $D$ is usually known in practice. Here, we bypass this limitation by combining mirror descent with dual averaging techniques, and we show that, under an oblivious step-size regime, our algorithms converge without any prior knowledge of the parameters of the problem. We introduce three oblivious stochastic algorithms to address different settings. The first algorithm is designed for objectives in relative scale, the second is an accelerated version tailored to smooth objectives, and the last is for relatively-smooth objectives. All three algorithms work without prior knowledge of the diameter of the feasible set, the Lipschitz constant, or the smoothness of the objective function. We use these results to revisit the problem of solving large-scale semidefinite programs using randomized first-order methods and stochastic smoothing. We extend our framework to relative scale and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
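The core mechanism, oblivious (parameter-free) step sizes, can be illustrated with a minimal stochastic dual averaging loop whose schedule $\gamma_t \propto 1/\sqrt{t}$ uses neither the diameter $D$ nor any Lipschitz or smoothness constant. The sketch below is our illustration of this general idea over a Euclidean ball, not the authors' exact algorithm; the function name `dual_averaging` and the quadratic test objective are ours.

```python
import numpy as np

def dual_averaging(grad_oracle, x0, radius, T=1000):
    """Stochastic dual averaging with the oblivious schedule gamma_t = 1/sqrt(t).

    A minimal sketch of the idea: no Lipschitz constant, smoothness
    parameter, or diameter estimate appears in the step-size schedule.
    """
    x, g_sum, avg = x0.copy(), np.zeros_like(x0), x0.copy()
    for t in range(1, T + 1):
        g_sum += grad_oracle(x)               # accumulate stochastic gradients
        z = -g_sum / np.sqrt(t)               # oblivious weighting: no D, no L
        nrm = np.linalg.norm(z)               # Euclidean projection onto the ball
        x = z if nrm <= radius else radius * z / nrm
        avg += (x - avg) / t                  # running average of the iterates
    return avg

# Toy usage: noisy gradients of f(x) = ||x - x_star||^2 / 2 on the unit ball
rng = np.random.default_rng(0)
x_star = np.array([0.5, -0.3])
oracle = lambda x: (x - x_star) + 0.1 * rng.standard_normal(2)
print(dual_averaging(oracle, np.zeros(2), radius=1.0))   # approaches x_star
```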
Related papers
- Stochastic Smoothed Primal-Dual Algorithms for Nonconvex Optimization with Linear Inequality Constraints [12.624604051853657]
We propose smoothed primal-dual algorithms for solving nonconvex optimization problems with linear inequality constraints. Our algorithms are single-loop and require only one stochastic gradient sample at each iteration. Unlike existing methods, our algorithms are free of subproblem solves, large batch sizes, or increasing penalty parameters, and use dual variable updates to ensure feasibility.
arXiv Detail & Related papers (2025-04-10T09:59:43Z) - Solving quadratic binary optimization problems using quantum SDP methods: Non-asymptotic running time analysis [1.9081120388919084]
Quantum computers can solve semidefinite programs (SDPs) using resources that scale better than state-of-the-art classical methods. We present an analysis of the non-asymptotic resource requirements of a quantum SDP solver.
arXiv Detail & Related papers (2025-02-21T12:54:05Z) - Pathwise optimization for bridge-type estimators and its applications [49.1574468325115]
Pathwise methods allow one to efficiently compute the full solution path of penalized estimators. We apply these algorithms to the penalized estimation of processes observed at discrete times.
arXiv Detail & Related papers (2024-12-05T10:38:29Z) - Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance-reduced zeroth-order (ZO) gradient estimators for composite optimization problems.
We improve the state-of-the-art function complexities from $\mathcal{O}\left(\min\left\{\frac{d n^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right\}\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
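For background, zeroth-order methods of this kind are built on function-difference gradient estimators. Below is a standard two-point estimator (a textbook construction, not the paper's specific variance-reduced estimator); the smoothing radius `mu` and the test function are illustrative choices.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, rng=None):
    """Two-point zeroth-order gradient estimator.

    g = (f(x + mu*u) - f(x)) / mu * u with u ~ N(0, I) is unbiased for
    the gradient of the mu-Gaussian-smoothed version of f.
    """
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

# Toy usage on f(x) = ||x||^2, whose true gradient at x is 2x
x = np.array([1.0, -2.0])
est = np.mean([zo_gradient(lambda v: v @ v, x) for _ in range(5000)], axis=0)
print(est)  # close to [2, -4]
```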
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - Fast Minimization of Expected Logarithmic Loss via Stochastic Dual
Averaging [8.990961435218544]
We propose a first-order algorithm named $B$-sample dual averaging with the logarithmic barrier.
For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time.
When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time.
arXiv Detail & Related papers (2023-11-05T03:33:44Z) - An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization [37.300102993926046]
We study the complexity of producing approximately stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only zeroth-order evaluations.
Our analysis is based on a simple yet powerful lemma regarding the Goldstein-subdifferential set, which allows utilizing recent advancements in nonsmooth nonconvex optimization.
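As a rough illustration of the Goldstein-subdifferential idea: a point is approximately $(\delta,\epsilon)$-stationary when some convex combination of (sub)gradients taken within a $\delta$-ball around it has norm at most $\epsilon$. The sketch below simply averages sampled subgradients to estimate such a witness; this is our simplification (real algorithms pick the combination adaptively, and the paper works with zeroth-order estimates rather than a subgradient oracle).

```python
import numpy as np

def goldstein_gap(subgrad, x, delta=0.1, n_samples=64, rng=None):
    """Estimate closeness to (delta, eps)-Goldstein stationarity at x.

    Averages subgradients at uniformly sampled points of the delta-ball
    around x; a small norm of this average witnesses a small-norm element
    of the Goldstein subdifferential.
    """
    rng = rng or np.random.default_rng()
    d = x.size
    u = rng.standard_normal((n_samples, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)        # uniform directions
    r = rng.uniform(size=(n_samples, 1)) ** (1.0 / d)    # radii for uniform ball
    pts = x + delta * r * u
    g = np.mean([subgrad(p) for p in pts], axis=0)
    return np.linalg.norm(g)

# Toy usage on the nonsmooth f(x) = ||x||_1 with a subgradient oracle
subgrad = lambda p: np.sign(p)
print(goldstein_gap(subgrad, np.zeros(3)))  # small: 0 is stationary
```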
arXiv Detail & Related papers (2023-07-10T11:56:04Z) - A One-Sample Decentralized Proximal Algorithm for Non-Convex Stochastic
Composite Optimization [10.762749887051546]
We propose two single-time-scale algorithms: Prox-DASA and Prox-DASA-GT.
Unlike prior work, our algorithms achieve comparable complexity without requiring large batch sizes, more complex per-iteration operations, or stronger assumptions.
arXiv Detail & Related papers (2023-02-20T05:16:18Z) - Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems [11.15373699918747]
We develop two new algorithms to approximate a solution of nonlinear equations in large-scale settings.
We apply our methods to a class of large-scale finite-sum inclusions, which covers prominent applications in machine learning, statistical learning, and network optimization.
arXiv Detail & Related papers (2023-01-08T21:46:27Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
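The object ReSQue estimates, the gradient of $f$ convolved with a Gaussian density $\gamma_\rho$, satisfies $\nabla (f * \gamma_\rho)(x) = \mathbb{E}_{z \sim \mathcal{N}(0, I)}[\nabla f(x + \rho z)]$ for smooth $f$. The sketch below implements only this baseline randomized-smoothing estimator; the ReSQue reweighting that enables the parallel and private speedups is not reproduced here.

```python
import numpy as np

def smoothed_grad(grad, x, rho=0.1, n_samples=32, rng=None):
    """Unbiased estimate of the gradient of f convolved with N(0, rho^2 I).

    Baseline randomized-smoothing estimator; ReSQue instead reweights
    queries around a shared reference point to enable parallelism.
    """
    rng = rng or np.random.default_rng()
    z = rng.standard_normal((n_samples,) + x.shape)
    return np.mean([grad(x + rho * zi) for zi in z], axis=0)

# Toy usage: for f(x) = ||x||^2 the smoothed gradient at x is still 2x
x = np.array([1.0, 0.5])
print(smoothed_grad(lambda v: 2 * v, x, n_samples=2000))
```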
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Extra-Newton: A First Approach to Noise-Adaptive Accelerated
Second-Order Methods [57.050204432302195]
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions.
Our algorithm achieves $O(\sigma/\sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$, and improves its convergence to $O(1/T^3)$ with deterministic oracles.
arXiv Detail & Related papers (2022-11-03T14:12:51Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We give a proof showing that DEAREST requires at most $\mathcal{O}(mn + \sqrt{mn}\,L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_2(W)})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Optimal Gradient Sliding and its Application to Distributed Optimization
Under Similarity [121.83085611327654]
We study structured convex optimization problems with additive objective $r := p + q$, where $r$ is $\mu$-strongly convex, in the distributed setting under a similarity assumption.
We propose a method that solves this problem with favorable complexity in terms of both master-to-agents communication and agents' local gradient calls.
The communication complexity of the proposed method is much sharper than the $\mathcal{O}(\sqrt{L_q/\mu})$ bound of existing methods.
arXiv Detail & Related papers (2022-05-30T14:28:02Z) - Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax
Optimization [47.27237492375659]
We study the bilinearly coupled minimax problem $\min_x \max_y f(x) + y^\top A x - h(y)$, where $f$ and $h$ are both strongly convex smooth functions.
No known first-order algorithms have hitherto achieved the lower complexity bound of $\Omega\left(\left(\sqrt{\frac{L_x}{\mu_x}} + \frac{\|A\|}{\sqrt{\mu_x \mu_y}}\right) \log\frac{1}{\varepsilon}\right)$ for this class of problems.
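To make the problem template concrete, here is a plain gradient descent-ascent loop on $\min_x \max_y f(x) + y^\top A x - h(y)$ with quadratic $f$ and $h$; the paper's lifted primal-dual method is more sophisticated, and the step size below is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 4
A = rng.standard_normal((m, n))
mu_x, mu_y = 1.0, 1.0                        # strong convexity parameters

# f(x) = mu_x/2 ||x||^2 and h(y) = mu_y/2 ||y||^2: the saddle point is (0, 0)
x, y = rng.standard_normal(n), rng.standard_normal(m)
eta = 0.05                                   # illustrative step size
for _ in range(2000):
    gx = mu_x * x + A.T @ y                  # grad_x of f(x) + y^T A x
    gy = A @ x - mu_y * y                    # grad_y of y^T A x - h(y)
    x, y = x - eta * gx, y + eta * gy        # descent in x, ascent in y
print(np.linalg.norm(x), np.linalg.norm(y))  # both near 0
```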
arXiv Detail & Related papers (2022-01-19T05:56:19Z) - STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic nonconvex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
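The STORM recursion that STORM+ builds on maintains a variance-reduced gradient estimate from a single sample per step: $d_t = \nabla f(x_t; \xi_t) + (1 - a_t)\,(d_{t-1} - \nabla f(x_{t-1}; \xi_t))$, where the same sample $\xi_t$ is evaluated at both iterates. The sketch below uses a fixed $a$ and learning rate on a toy quadratic; STORM+'s contribution is precisely setting both adaptively, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
x_star = np.array([1.0, -1.0])
def stoch_grad(x, noise):
    # gradient of one sample of f(x) = E ||x - x_star - noise||^2 / 2
    return x - x_star - noise

# STORM recursion: one stochastic sample per iteration, evaluated at
# both the current and the previous iterate.
x_prev = x = np.zeros(2)
d = np.zeros(2)
a, lr = 0.1, 0.05                       # fixed here; STORM+ adapts both
for t in range(2000):
    noise = 0.3 * rng.standard_normal(2)                # shared sample xi_t
    d = stoch_grad(x, noise) + (1 - a) * (d - stoch_grad(x_prev, noise))
    x_prev, x = x, x - lr * d
print(x)  # close to x_star
```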
arXiv Detail & Related papers (2021-11-01T15:43:36Z) - Asynchronous Stochastic Optimization Robust to Arbitrary Delays [54.61797739710608]
We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale stochastic gradient from step $t - d_t$, for an arbitrary delay $d_t$.
Our experiments demonstrate the efficacy and robustness of our algorithm in cases where the delay distribution is skewed or heavy-tailed.
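A minimal simulation of this delay model: at step $t$ the update uses a stochastic gradient evaluated at the iterate from step $t - d_t$. The delay distribution, step size, and test objective below are arbitrary illustrative choices.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
x_star = np.array([2.0, -1.0])
grad = lambda x: (x - x_star) + 0.1 * rng.standard_normal(2)

x, lr = np.zeros(2), 0.05
history = deque([x.copy()], maxlen=50)       # past iterates, bounded memory
for t in range(3000):
    d_t = rng.integers(0, min(len(history), 10))   # arbitrary delay d_t
    stale = history[-1 - d_t]                      # iterate from step t - d_t
    x = x - lr * grad(stale)                       # update with stale gradient
    history.append(x.copy())
print(x)  # still converges near x_star despite the delays
```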
arXiv Detail & Related papers (2021-06-22T15:50:45Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for nonconvex stochastic optimization that finds an $\epsilon$-critical point using the optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
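The principle of correcting momentum with second-order information can be sketched as follows: when the iterate moves from $x_{t-1}$ to $x_t$, the stale momentum buffer is transported by adding a Hessian-vector product $H\,(x_t - x_{t-1})$ so that it keeps tracking the gradient at the new point. The update rule below is our hedged reconstruction of that principle on a deterministic quadratic, not a line-by-line reproduction of the paper's algorithm.

```python
import numpy as np

# Deterministic quadratic for clarity; the paper's setting is stochastic.
x_star = np.array([1.0, 2.0])
grad = lambda x: x - x_star              # gradient of ||x - x_star||^2 / 2
hvp = lambda x, v: v                     # Hessian is the identity here

x_prev = x = np.zeros(2)
m = np.zeros(2)
beta, lr = 0.9, 0.05
for t in range(2000):
    # Transport the stale momentum to the new iterate via H (x_t - x_{t-1})
    # before mixing in the fresh gradient: this is the "correction".
    m = beta * (m + hvp(x, x - x_prev)) + (1 - beta) * grad(x)
    x_prev, x = x, x - lr * m
print(x)  # close to x_star
```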
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving private convex optimization algorithms, both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
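For background, the standard template for private convex optimization is noisy gradient descent: clip each per-sample gradient and add Gaussian noise calibrated to the clipping norm. The sketch below shows that generic template (names and parameters are ours); the paper's contribution, reaching the optimal excess loss with only $O(\min\{n, n^2/d\})$ gradient computations, is not what this naive full-batch loop achieves.

```python
import numpy as np

def noisy_gd(per_sample_grads, x0, lr=0.1, clip=1.0, sigma=1.0, steps=200, rng=None):
    """Noisy gradient descent template for private convex optimization.

    per_sample_grads(x) returns all n per-sample gradients at x (n x d).
    Each is clipped to norm `clip`, then Gaussian noise of scale sigma*clip
    bounds the update's sensitivity to any single sample.
    """
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(steps):
        g = per_sample_grads(x)                                # shape (n, d)
        norms = np.maximum(np.linalg.norm(g, axis=1, keepdims=True), clip)
        g = g * (clip / norms)                                 # per-sample clipping
        noise = sigma * clip * rng.standard_normal(x.shape)
        x = x - lr * (g.mean(axis=0) + noise / len(g))
    return x

# Toy usage: per-sample gradients of f_i(x) = ||x - a_i||^2 / 2
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 3))
print(noisy_gd(lambda x: x - A, np.zeros(3)))   # near A.mean(axis=0)
```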
arXiv Detail & Related papers (2020-05-10T19:52:03Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of a matroid $M$ such that $\det\left(\sum_{i \in S} v_i v_i^\top\right)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)