Fixed-Point Automatic Differentiation of Forward--Backward Splitting
Algorithms for Partly Smooth Functions
- URL: http://arxiv.org/abs/2208.03107v1
- Date: Fri, 5 Aug 2022 11:27:55 GMT
- Title: Fixed-Point Automatic Differentiation of Forward--Backward Splitting
Algorithms for Partly Smooth Functions
- Authors: Sheheryar Mehmood and Peter Ochs
- Abstract summary: We show that under partial smoothness, Automatic Differentiation converges to the derivative of the solution mapping.
We numerically illustrate the convergence and convergence rates of AD and FPAD on Lasso and Group Lasso problems.
- Score: 8.680676599607123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A large class of non-smooth practical optimization problems can be written as
minimization of a sum of smooth and partly smooth functions. We consider such
structured problems which also depend on a parameter vector and study the
problem of differentiating their solution mapping with respect to the parameter,
which has far-reaching applications in sensitivity analysis and parameter
learning for optimization problems. We show that under partial smoothness and other
mild assumptions, Automatic Differentiation (AD) of the sequence generated by
proximal splitting algorithms converges to the derivative of the solution
mapping. For a variant of automatic differentiation, which we call Fixed-Point
Automatic Differentiation (FPAD), we remedy the memory overhead problem of
Reverse Mode AD and moreover provide faster convergence theoretically. We
numerically illustrate the convergence and convergence rates of AD and FPAD on
Lasso and Group Lasso problems and demonstrate the working of FPAD on a
prototypical practical image denoising problem by learning the regularization
term.
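To make the setting concrete, below is a minimal NumPy sketch (not the authors' code) of the two ideas described above for the Lasso min_x 0.5*||Ax - b||^2 + theta*||x||_1: forward-mode AD propagated through the ISTA iterates, and a fixed-point variant that first solves the problem and then iterates only the derivative recursion at the (approximate) solution, in the spirit of FPAD. The function names (soft_threshold, ista_forward_ad, fixed_point_ad), the step-size choice, and the iteration counts are illustrative assumptions; the paper's FPAD is additionally developed for reverse mode, whereas only a forward-mode recursion is shown here.

import numpy as np

def soft_threshold(z, lam):
    # Proximal operator of lam * ||.||_1 (the partly smooth term of the Lasso).
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def ista_forward_ad(A, b, theta, n_iter=500):
    # Forward-mode AD through ISTA: propagate dx/dtheta alongside the iterate x.
    n = A.shape[1]
    alpha = 1.0 / np.linalg.norm(A, 2) ** 2      # step size 1/L with L = ||A||_2^2
    x, dx = np.zeros(n), np.zeros(n)             # iterate and its derivative w.r.t. theta
    for _ in range(n_iter):
        z = x - alpha * A.T @ (A @ x - b)        # gradient step on the smooth part
        dz = dx - alpha * A.T @ (A @ dx)         # its derivative (b does not depend on theta)
        active = np.abs(z) > alpha * theta       # support selected by the prox
        x = soft_threshold(z, alpha * theta)
        # Chain rule through the prox: identity on the active set, zero elsewhere;
        # the threshold alpha*theta contributes -alpha*sign(z) on the active set.
        dx = np.where(active, dz - alpha * np.sign(z), 0.0)
    return x, dx

def fixed_point_ad(A, b, theta, n_iter=500, n_deriv=200):
    # Fixed-point-style AD: run plain ISTA first, then iterate only the
    # derivative recursion at the (approximate) fixed point.
    n = A.shape[1]
    alpha = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(n)
    for _ in range(n_iter):                      # plain ISTA, no derivative bookkeeping
        x = soft_threshold(x - alpha * A.T @ (A @ x - b), alpha * theta)
    z = x - alpha * A.T @ (A @ x - b)
    active = np.abs(z) > alpha * theta           # support identified at the solution
    dx = np.zeros(n)
    for _ in range(n_deriv):                     # derivative recursion at the fixed point
        dz = dx - alpha * A.T @ (A @ dx)
        dx = np.where(active, dz - alpha * np.sign(z), 0.0)
    return x, dx

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))
    b = rng.standard_normal(40)
    x_ad, dx_ad = ista_forward_ad(A, b, theta=1.0)
    x_fp, dx_fp = fixed_point_ad(A, b, theta=1.0)
    # Finite-difference check of dx/dtheta on this instance.
    eps = 1e-5
    x_p, _ = ista_forward_ad(A, b, theta=1.0 + eps, n_iter=2000)
    x_m, _ = ista_forward_ad(A, b, theta=1.0 - eps, n_iter=2000)
    fd = (x_p - x_m) / (2 * eps)
    print(np.max(np.abs(dx_ad - fd)), np.max(np.abs(dx_fp - fd)))

Note that fixed_point_ad keeps nothing from the trajectory: once the support is identified (the role played by partial smoothness), only the derivative recursion at the fixed point is iterated, which mirrors how FPAD removes the memory overhead of Reverse Mode AD.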
Related papers
- A Robust Algorithm for Non-IID Machine Learning Problems with Convergence Analysis [2.4462606119036456]
We propose an improved numerical algorithm for solving minimax problems based on nonsmooth optimization, quadratic programming and an iterative process. Such an algorithm can be widely applied in various fields such as robust optimization, imbalanced learning, etc.
arXiv Detail & Related papers (2025-07-01T14:41:59Z) - The Distributionally Robust Optimization Model of Sparse Principal Component Analysis [7.695578200868269]
We consider sparse principal component analysis (PCA) under a setting where the underlying probability distribution of the random parameter is uncertain.
This problem is formulated as a distributionally robust optimization (DRO) model based on a constructive approach to capturing uncertainty.
We prove that the inner problem admits a closed-form solution, reformulating the original DRO model into an equivalent minimization problem on the Stiefel manifold.
arXiv Detail & Related papers (2025-03-04T11:00:08Z) - A Globally Convergent Algorithm for Neural Network Parameter
Optimization Based on Difference-of-Convex Functions [29.58728073957055]
We propose an algorithm for optimizing the parameters of hidden-layer networks.
Specifically, we derive a blockwise difference-of-convex (DC) decomposition of the objective function.
arXiv Detail & Related papers (2024-01-15T19:53:35Z) - Moreau Envelope ADMM for Decentralized Weakly Convex Optimization [55.2289666758254]
This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization.
The results of our numerical experiments indicate that our method is faster and more robust than widely-used approaches.
arXiv Detail & Related papers (2023-08-31T14:16:30Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable components separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms.
With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem.
We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
arXiv Detail & Related papers (2022-11-21T18:31:43Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax
Optimization [24.784754071913255]
Adaptive methods have shown their ability to adjust the stepsizes on the fly in a parameter-agnostic manner.
Current convergence analyses of gradient ascent for non-concave minimax problems require careful tuning of parameters.
arXiv Detail & Related papers (2022-10-31T17:05:36Z) - Continuation Path with Linear Convergence Rate [18.405645120971496]
Path-following algorithms are frequently used in composite optimization problems where a series of subproblems are solved sequentially.
We present a primal-dual analysis of the path-following algorithm and determine how accurately each subproblem should be solved to guarantee a linear convergence rate on a target problem.
Considering optimization with a sparsity-inducing penalty, we analyze the change of the active sets with respect to the regularization parameter.
arXiv Detail & Related papers (2021-12-09T18:42:13Z) - Two-Stage Stochastic Optimization via Primal-Dual Decomposition and Deep
Unrolling [86.85697555068168]
Two-stage stochastic optimization plays a critical role in various engineering and scientific applications.
Efficient algorithms are still lacking, especially when the long-term and short-term variables are coupled in the constraints.
We show that PDD-SSCA can achieve superior performance over existing solutions.
arXiv Detail & Related papers (2021-05-05T03:36:00Z) - Efficient Consensus Model based on Proximal Gradient Method applied to
Convolutional Sparse Problems [2.335152769484957]
We derive and detail a theoretical analysis of an efficient consensus algorithm based on the proximal gradient (PG) approach.
The proposed algorithm is also applied to another particular convolutional problem for the anomaly detection task.
arXiv Detail & Related papers (2020-11-19T20:52:48Z) - A Multi-Agent Primal-Dual Strategy for Composite Optimization over
Distributed Features [52.856801164425086]
We study multi-agent sharing optimization problems with the objective function being the sum of smooth local functions plus a convex (possibly non-smooth) coupling function.
arXiv Detail & Related papers (2020-06-15T19:40:24Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.