The Sample Complexity of Parameter-Free Stochastic Convex Optimization
- URL: http://arxiv.org/abs/2506.11336v1
- Date: Thu, 12 Jun 2025 22:14:49 GMT
- Title: The Sample Complexity of Parameter-Free Stochastic Convex Optimization
- Authors: Jared Lawrence, Ari Kalinsky, Hannah Bradfield, Yair Carmon, Oliver Hinder,
- Abstract summary: We study the sample complexity of convex optimization when problem parameters, e.g., the distance to optimality, are unknown.<n>First, we develop a reliable model selection method that avoids overfitting the validation set.<n>Second, we develop a regularization-based method that is specialized to the case that only the distance to optimality is unknown.
- Score: 18.050176726484974
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the sample complexity of stochastic convex optimization when problem parameters, e.g., the distance to optimality, are unknown. We pursue two strategies. First, we develop a reliable model selection method that avoids overfitting the validation set. This method allows us to generically tune the learning rate of stochastic optimization methods to match the optimal known-parameter sample complexity up to $\log\log$ factors. Second, we develop a regularization-based method that is specialized to the case that only the distance to optimality is unknown. This method provides perfect adaptability to unknown distance to optimality, demonstrating a separation between the sample and computational complexity of parameter-free stochastic convex optimization. Combining these two methods allows us to simultaneously adapt to multiple problem structures. Experiments performing few-shot learning on CIFAR-10 by fine-tuning CLIP models and prompt engineering Gemini to count shapes indicate that our reliable model selection method can help mitigate overfitting to small validation sets.
Related papers
- Proximal Oracles for Optimization and Sampling [18.77973093341588]
We consider convex optimization with non-smooth objective function and log-concave sampling with non-smooth potential.<n>To overcome the challenges caused by non-smoothness, our algorithms employ two powerful proximal frameworks in optimization and sampling.
arXiv Detail & Related papers (2024-04-02T18:52:28Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Minibatch and Momentum Model-based Methods for Stochastic Non-smooth
Non-convex Optimization [3.4809730725241597]
We make two important extensions to model-based methods.
First, we propose a new minibatch which takes a set of samples to approximate the model function in each iteration.
Second, by the success of momentum techniques we propose a new convex-based model.
arXiv Detail & Related papers (2021-06-06T05:31:57Z) - Slowly Varying Regression under Sparsity [5.22980614912553]
We present the framework of slowly hyper regression under sparsity, allowing regression models to exhibit slow and sparse variations.
We suggest a procedure that reformulates as a binary convex algorithm.
We show that the resulting model outperforms competing formulations in comparable times across various datasets.
arXiv Detail & Related papers (2021-02-22T04:51:44Z) - Adaptive Sequential SAA for Solving Two-stage Stochastic Linear Programs [1.6181085766811525]
We present adaptive sequential SAA (sample average approximation) algorithms to solve large-scale two-stage linear programs.
The proposed algorithm can be stopped in finite-time to return a solution endowed with a probabilistic guarantee on quality.
arXiv Detail & Related papers (2020-12-07T14:58:16Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Learning to Guide Random Search [111.71167792453473]
We consider derivative-free optimization of a high-dimensional function that lies on a latent low-dimensional manifold.
We develop an online learning approach that learns this manifold while performing the optimization.
We empirically evaluate the method on continuous optimization benchmarks and high-dimensional continuous control problems.
arXiv Detail & Related papers (2020-04-25T19:21:14Z) - MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment [77.38594866794429]
convex mixed-integer programming formulation for non-rigid shape matching.
We propose a novel shape deformation model based on an efficient low-dimensional discrete model.
arXiv Detail & Related papers (2020-02-28T09:54:06Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z) - Robust Learning Rate Selection for Stochastic Optimization via Splitting
Diagnostic [5.395127324484869]
SplitSGD is a new dynamic learning schedule for optimization.
The method decreases the learning rate for better adaptation to the local geometry of the objective function.
It essentially does not incur additional computational cost than standard SGD.
arXiv Detail & Related papers (2019-10-18T19:38:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.