An Accelerated Block Proximal Framework with Adaptive Momentum for
Nonconvex and Nonsmooth Optimization
- URL: http://arxiv.org/abs/2308.12126v2
- Date: Thu, 24 Aug 2023 14:16:35 GMT
- Title: An Accelerated Block Proximal Framework with Adaptive Momentum for
Nonconvex and Nonsmooth Optimization
- Authors: Weifeng Yang and Wenwen Min
- Abstract summary: We propose an accelerated block proximal linear framework with adaptive momentum (ABPL$+$) for nonsmooth and nonsmooth optimization.
We analyze the potential causes of the extrapolation step failing in some algorithms, and resolve this issue by enhancing the comparison process.
We extend our algorithm to any scenario involving updating the gradient step and the linear extrapolation step.
- Score: 2.323238724742687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an accelerated block proximal linear framework with adaptive
momentum (ABPL$^+$) for nonconvex and nonsmooth optimization. We analyze the
potential causes of the extrapolation step failing in some algorithms, and
resolve this issue by enhancing the comparison process that evaluates the
trade-off between the proximal gradient step and the linear extrapolation step
in our algorithm. Furthermore, we extends our algorithm to any scenario
involving updating block variables with positive integers, allowing each cycle
to randomly shuffle the update order of the variable blocks. Additionally,
under mild assumptions, we prove that ABPL$^+$ can monotonically decrease the
function value without strictly restricting the extrapolation parameters and
step size, demonstrates the viability and effectiveness of updating these
blocks in a random order, and we also more obviously and intuitively
demonstrate that the derivative set of the sequence generated by our algorithm
is a critical point set. Moreover, we demonstrate the global convergence as
well as the linear and sublinear convergence rates of our algorithm by
utilizing the Kurdyka-Lojasiewicz (K{\L}) condition. To enhance the
effectiveness and flexibility of our algorithm, we also expand the study to the
imprecise version of our algorithm and construct an adaptive extrapolation
parameter strategy, which improving its overall performance. We apply our
algorithm to multiple non-negative matrix factorization with the $\ell_0$ norm,
nonnegative tensor decomposition with the $\ell_0$ norm, and perform extensive
numerical experiments to validate its effectiveness and efficiency.
Related papers
- Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization [0.0]
We propose a novel algorithm that extends the methods of ball smoothing and Gaussian smoothing for noisy derivative-free optimization.
The algorithm dynamically adapts the shape of the smoothing kernel to approximate the Hessian of the objective function around a local optimum.
arXiv Detail & Related papers (2024-05-02T21:04:20Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Cyclic Block Coordinate Descent With Variance Reduction for Composite
Nonconvex Optimization [26.218670461973705]
We propose methods for solving problems coordinate non-asymptotic gradient norm guarantees.
Our results demonstrate the efficacy of the proposed cyclic-reduced scheme in training deep neural nets.
arXiv Detail & Related papers (2022-12-09T19:17:39Z) - An Algebraically Converging Stochastic Gradient Descent Algorithm for
Global Optimization [14.336473214524663]
A key component in the algorithm is the randomness based on the value of the objective function.
We prove the convergence of the algorithm with an algebra and tuning in the parameter space.
We present several numerical examples to demonstrate the efficiency and robustness of the algorithm.
arXiv Detail & Related papers (2022-04-12T16:27:49Z) - Inequality Constrained Stochastic Nonlinear Optimization via Active-Set
Sequential Quadratic Programming [17.9230793188835]
We study nonlinear optimization problems with objective and deterministic equality and inequality constraints.
We propose an active-set sequentialAdaptive programming algorithm, using a differentiable exact augmented Lagrangian as the merit function.
The algorithm adaptively selects the parameters of augmented Lagrangian and performs line search to decide the stepsize.
arXiv Detail & Related papers (2021-09-23T17:12:17Z) - Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the citetgarber 2020online, where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order.
We show that citetgarber 2020online algorithms achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.