Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems
- URL: http://arxiv.org/abs/2211.15943v2
- Date: Mon, 29 Jan 2024 02:02:20 GMT
- Title: Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems
- Authors: Yuchen Fang, Sen Na, Michael W. Mahoney, Mladen Kolar
- Abstract summary: We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
- Score: 62.83783246648714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a trust-region stochastic sequential quadratic programming
algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic
objectives and deterministic equality constraints. We consider a fully
stochastic setting, where at each step a single sample is generated to estimate
the objective gradient. The algorithm adaptively selects the trust-region
radius and, compared to the existing line-search StoSQP schemes, allows us to
utilize indefinite Hessian matrices (i.e., Hessians without modification) in
SQP subproblems. As a trust-region method for constrained optimization, our
algorithm must address an infeasibility issue -- the linearized equality
constraints and trust-region constraints may lead to infeasible SQP
subproblems. In this regard, we propose an adaptive relaxation technique to
compute the trial step, consisting of a normal step and a tangential step. To
control the lengths of these two steps while ensuring a scale-invariant
property, we adaptively decompose the trust-region radius into two segments,
based on the proportions of the rescaled feasibility and optimality residuals
to the rescaled full KKT residual. The normal step has a closed form, while the
tangential step is obtained by solving a trust-region subproblem, for which a
solution achieving the Cauchy reduction suffices for our analysis. We
establish a global almost sure convergence guarantee for TR-StoSQP, and
illustrate its empirical performance on both a subset of problems in the CUTEst
test set and constrained logistic regression problems using data from the
LIBSVM collection.
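To make the trial-step construction concrete, the sketch below is a minimal NumPy/SciPy rendering under simplifying assumptions: a full-row-rank constraint Jacobian, an unrescaled residual split, and a plain Cauchy-point tangential step. The helper name `trial_step` and the particular residual proportions are illustrative, not the paper's exact scheme.

```python
import numpy as np
from scipy.linalg import null_space

def trial_step(g_hat, H, c, J, delta):
    """One TR-StoSQP-style trial step: a minimal sketch, not the authors'
    exact rules. Assumes the constraint Jacobian J has full row rank.

    g_hat : single-sample estimate of the objective gradient
    H     : Lagrangian Hessian approximation (may be indefinite)
    c, J  : constraint values c(x) and Jacobian J(x)
    delta : current trust-region radius
    """
    # Feasibility and optimality residuals (a plausible, unrescaled choice;
    # the paper rescales both before forming the proportions).
    lam = np.linalg.lstsq(J.T, -g_hat, rcond=None)[0]  # least-squares multipliers
    feas = np.linalg.norm(c)
    opt = np.linalg.norm(g_hat + J.T @ lam)
    kkt = feas + opt + 1e-12

    # Adaptive decomposition of the radius into normal/tangential segments,
    # proportional to the two residuals' shares of the full KKT residual.
    delta_n = delta * feas / kkt
    delta_t = delta * opt / kkt

    # Normal step (closed form): least-norm step toward the linearized
    # constraints J v + c = 0, truncated to its trust-region segment.
    v = -J.T @ np.linalg.solve(J @ J.T, c)
    vn = np.linalg.norm(v)
    if vn > delta_n and vn > 0:
        v *= delta_n / vn

    # Tangential step: Cauchy point of the reduced trust-region subproblem
    #   min_u (g_hat + H v)^T Z u + 0.5 u^T (Z^T H Z) u,  s.t. ||u|| <= delta_t,
    # where Z spans null(J). Any solution achieving the Cauchy reduction
    # suffices for the convergence analysis, so no exact TR solve is needed.
    Z = null_space(J)
    g_red = Z.T @ (g_hat + H @ v)
    H_red = Z.T @ H @ Z
    gn = np.linalg.norm(g_red)
    t = delta_t / (gn + 1e-12)                 # step to the TR boundary
    gHg = g_red @ H_red @ g_red
    if gHg > 0:
        t = min(t, gn**2 / gHg)                # interior minimizer along -g_red
    return v + Z @ (-t * g_red)
```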
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing a quadratic approximation of the objective subject to a relaxed linearization of the constraints within a trust region.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix (see the sketch below).
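For intuition, a bare-bones version of such an eigen step might look like the following; `eigen_step` and its sign and scaling conventions are assumptions for illustration, not the paper's construction.

```python
import numpy as np
from scipy.linalg import null_space

def eigen_step(H, J, g, delta):
    """Hypothetical eigen step: a direction of most negative curvature of the
    reduced Hessian Z^T H Z, scaled to the trust-region radius delta."""
    Z = null_space(J)                      # orthonormal basis for null(J)
    w, V = np.linalg.eigh(Z.T @ H @ Z)     # eigenvalues in ascending order
    if w[0] >= 0:
        return np.zeros(H.shape[0])        # no negative curvature to exploit
    d = Z @ V[:, 0]                        # unit-norm negative-curvature direction
    if g @ d > 0:                          # pick the sign that decreases the model
        d = -d
    return delta * d
```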
arXiv Detail & Related papers (2024-09-24T04:39:47Z) - FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse
optimisation on Measures [1.9950682531209156]
This paper presents a novel algorithm that leverages Stochastic Gradient Descent strategies in conjunction with Random Features to augment the scalability of Conic Particle Gradient Descent (CPGD).
We provide rigorous proofs demonstrating the following key findings: (i) the total variation norms of the solution measures along the descent trajectory remain bounded, ensuring stability and preventing undesirable divergence; (ii) we establish a global convergence guarantee with a convergence rate of $\mathcal{O}(\log(K)/\sqrt{K})$ over $K$ iterations, showcasing the efficiency and effectiveness of our algorithm; (iii) additionally, we analyze and establish
arXiv Detail & Related papers (2023-12-10T20:41:43Z) - High-probability sample complexities for policy evaluation with linear function approximation [88.87036653258977]
We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms.
We establish the first sample complexity bound with high-probability convergence guarantee that attains the optimal dependence on the tolerance level.
arXiv Detail & Related papers (2023-05-30T12:58:39Z) - Inequality Constrained Stochastic Nonlinear Optimization via Active-Set
Sequential Quadratic Programming [17.9230793188835]
We study nonlinear optimization problems with objective and deterministic equality and inequality constraints.
We propose an active-set sequential quadratic programming algorithm, using a differentiable exact augmented Lagrangian as the merit function.
The algorithm adaptively selects the parameters of the augmented Lagrangian and performs a line search to decide the stepsize; a rough sketch of these mechanics follows.
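As a rough picture of the merit function and stepsize mechanics, here is a minimal sketch; the names `merit` and `backtracking` and the fixed penalty `mu` are illustrative simplifications, since the paper adapts these parameters and uses a richer differentiable exact augmented Lagrangian.

```python
import numpy as np

def merit(x, lam, f, c, mu):
    """Simplified augmented Lagrangian merit value. The paper's merit function
    is a *differentiable exact* augmented Lagrangian with extra dual terms and
    adaptively selected penalty parameters; this sketch keeps only the core."""
    cx = c(x)
    return f(x) + lam @ cx + 0.5 * mu * (cx @ cx)

def backtracking(x, lam, dx, f, c, mu, dphi, alpha=1.0, rho=0.5, eta=1e-4):
    """Armijo backtracking on the merit function along the SQP direction dx.
    dphi is the (negative) directional derivative of the merit at x along dx."""
    phi0 = merit(x, lam, f, c, mu)
    while merit(x + alpha * dx, lam, f, c, mu) > phi0 + eta * alpha * dphi:
        alpha *= rho
        if alpha < 1e-12:                  # safeguard against stalling
            break
    return alpha
```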
arXiv Detail & Related papers (2021-09-23T17:12:17Z) - Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random-order online optimization setting of Garber et al. (2020), where the loss functions may be chosen by an adversary but are then presented in a uniformly random order.
We show that the algorithms of Garber et al. (2020) achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with a negative-power dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
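A minimal sketch of the clipping mechanism these rules build on is shown below; the constant `stepsize` and `clip_level` are illustrative, since the paper's schedules are method-specific.

```python
import numpy as np

def clipped_step(x, g_hat, stepsize, clip_level):
    """One stochastic gradient step with norm clipping: the stochastic
    gradient is rescaled whenever its norm exceeds clip_level, the mechanism
    behind high-probability bounds under heavy-tailed noise."""
    g_norm = np.linalg.norm(g_hat)
    if g_norm > clip_level:
        g_hat = g_hat * (clip_level / g_norm)
    return x - stepsize * g_hat
```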
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - An Adaptive Stochastic Sequential Quadratic Programming with
Differentiable Exact Augmented Lagrangians [17.9230793188835]
We consider the problem of solving nonlinear optimization programs with a stochastic objective and deterministic equality constraints.
We propose a stochastic sequential quadratic programming (SQP) algorithm that uses a differentiable exact augmented Lagrangian as the merit function.
The proposed algorithm is the first stochastic SQP method to incorporate a line search procedure.
arXiv Detail & Related papers (2021-02-10T08:40:55Z) - Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of the training set size and the number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z)