Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems
- URL: http://arxiv.org/abs/2211.15943v2
- Date: Mon, 29 Jan 2024 02:02:20 GMT
- Authors: Yuchen Fang, Sen Na, Michael W. Mahoney, Mladen Kolar
- Abstract summary: We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
- Score: 62.83783246648714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a trust-region stochastic sequential quadratic programming
algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic
objectives and deterministic equality constraints. We consider a fully
stochastic setting, where at each step a single sample is generated to estimate
the objective gradient. The algorithm adaptively selects the trust-region
radius and, compared to the existing line-search StoSQP schemes, allows us to
utilize indefinite Hessian matrices (i.e., Hessians without modification) in
SQP subproblems. As a trust-region method for constrained optimization, our
algorithm must address an infeasibility issue -- the linearized equality
constraints and trust-region constraints may lead to infeasible SQP
subproblems. In this regard, we propose an adaptive relaxation technique to
compute the trial step, consisting of a normal step and a tangential step. To
control the lengths of these two steps while ensuring a scale-invariant
property, we adaptively decompose the trust-region radius into two segments,
based on the proportions of the rescaled feasibility and optimality residuals
to the rescaled full KKT residual. The normal step has a closed form, while the
tangential step is obtained by solving a trust-region subproblem, to which a
solution ensuring the Cauchy reduction is sufficient for our study. We
establish a global almost sure convergence guarantee for TR-StoSQP, and
illustrate its empirical performance on both a subset of problems in the CUTEst
test set and constrained logistic regression problems using data from the
LIBSVM collection.
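The trial-step construction described in the abstract can be sketched in code. The following is a hypothetical NumPy illustration of the radius-decomposition idea only, not the authors' exact TR-StoSQP update: the trust-region radius is split between a normal and a tangential step in proportion to the rescaled feasibility and optimality residuals, the normal step has a closed form, and a Cauchy step suffices for the tangential subproblem even when the Hessian is indefinite.

```python
import numpy as np

def _null_space(J, tol=1e-10):
    """Orthonormal basis for the null space of J via SVD (NumPy only)."""
    _, s, Vt = np.linalg.svd(J)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

def tr_trial_step(g, H, J, c, delta, eps=1e-12):
    """One trial step s = v + u under trust-region radius delta (sketch).

    g : sampled objective gradient estimate
    H : Hessian approximation (may be indefinite; used unmodified)
    J : Jacobian of the equality constraints c(x) = 0
    c : current constraint values
    """
    # Least-squares multipliers and the two KKT residuals.
    lam = np.linalg.lstsq(J.T, -g, rcond=None)[0]
    opt_res = np.linalg.norm(g + J.T @ lam)   # optimality residual
    feas_res = np.linalg.norm(c)              # feasibility residual
    total = opt_res + feas_res + eps

    # Decompose the radius in proportion to the residuals.
    delta_n = delta * feas_res / total        # normal-step budget
    delta_t = delta * opt_res / total         # tangential-step budget

    # Normal step: closed-form minimum-norm step toward the linearized
    # constraints, truncated to its budget.
    v = -np.linalg.pinv(J) @ c
    nv = np.linalg.norm(v)
    if nv > delta_n:
        v = v * (delta_n / nv)

    # Tangential step: a Cauchy (steepest-descent) step for the reduced
    # trust-region subproblem in the null space of J; a Cauchy reduction
    # is all the convergence theory requires.
    Z = _null_space(J)
    if Z.size == 0:
        return v
    gr = Z.T @ (g + H @ v)                    # reduced gradient
    gn = np.linalg.norm(gr)
    if gn < eps:
        return v
    curv = gr @ (Z.T @ H @ Z) @ gr
    # Positive curvature: Cauchy stepsize capped at the boundary.
    # Nonpositive curvature (indefinite H): step to the boundary.
    alpha = min(gn**2 / curv, delta_t / gn) if curv > eps else delta_t / gn
    return v - alpha * (Z @ gr)
```

Because the normal step lies in the row space of J and the tangential step in its null space, the two pieces are orthogonal, so their combined length stays within the original radius delta and the linearized constraint violation never increases.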
Related papers
- Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces [47.907236421762626]
This work studies discrete-time discounted Markov decision processes with continuous state and action spaces.
We first consider the case in which we have access to the entire expert policy and characterize the set of solutions to the inverse problem.
arXiv Detail & Related papers (2024-05-24T12:53:07Z)
- FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse Optimisation on Measures [1.9950682531209156]
This paper presents a novel algorithm that leverages Gradient Descent strategies in conjunction with Random Features to augment the scalability of Conic Particle Gradient Descent (CPGD)
We provide rigorous proofs demonstrating the following key findings: (i) The total variation norms of the solution measures along the descent trajectory remain bounded, ensuring stability and preventing undesirable divergence; (ii) We establish a global convergence guarantee with a convergence rate of $\mathcal{O}(\log(K)/\sqrt{K})$ over $K$, showcasing the efficiency and effectiveness of our algorithm; (iii) Additionally, we analyze and establish
arXiv Detail & Related papers (2023-12-10T20:41:43Z)
- High-probability sample complexities for policy evaluation with linear function approximation [88.87036653258977]
We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms.
We establish the first sample complexity bound with high-probability convergence guarantee that attains the optimal dependence on the tolerance level.
arXiv Detail & Related papers (2023-05-30T12:58:39Z)
- Inequality Constrained Stochastic Nonlinear Optimization via Active-Set Sequential Quadratic Programming [17.9230793188835]
We study nonlinear optimization problems with a stochastic objective and deterministic equality and inequality constraints.
We propose an active-set stochastic sequential quadratic programming algorithm, using a differentiable exact augmented Lagrangian as the merit function.
The algorithm adaptively selects the parameters of the augmented Lagrangian and performs line search to decide the stepsize.
arXiv Detail & Related papers (2021-09-23T17:12:17Z)
- Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random-order online optimization setting of Garber et al. [2020], where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order.
We show that our algorithms achieve the optimal bounds of Garber et al. [2020] and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z)
- Near-Optimal High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [63.304196997102494]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds whose dependence on the confidence level is either negative-power or logarithmic.
We propose novel stepsize rules for two gradient methods with clipping.
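The clipping technique these methods build on can be illustrated with a minimal sketch. This is a generic clipped-gradient step, not the paper's actual stepsize rules: the stochastic gradient is rescaled so its norm never exceeds a threshold, which tames heavy-tailed noise.

```python
import numpy as np

def clipped_sgd_step(x, grad, lr, clip_level):
    """One SGD step with norm clipping (generic sketch): rescale the
    gradient so its Euclidean norm is at most clip_level, then take a
    standard gradient step with learning rate lr."""
    gnorm = np.linalg.norm(grad)
    if gnorm > clip_level:
        grad = grad * (clip_level / gnorm)
    return x - lr * grad
```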
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- An Adaptive Stochastic Sequential Quadratic Programming with Differentiable Exact Augmented Lagrangians [17.9230793188835]
We consider the problem of solving nonlinear optimization programs with a stochastic objective and deterministic equality constraints.
We propose a sequential quadratic programming (SQP) algorithm that uses a differentiable exact augmented Lagrangian as the merit function.
The proposed algorithm is the first stochastic SQP that allows a line search procedure.
arXiv Detail & Related papers (2021-02-10T08:40:55Z)
- Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.