Automatic differentiation of nonsmooth iterative algorithms
- URL: http://arxiv.org/abs/2206.00457v1
- Date: Tue, 31 May 2022 07:58:37 GMT
- Title: Automatic differentiation of nonsmooth iterative algorithms
- Authors: Jérôme Bolte (TSE), Edouard Pauwels (IRIT), Samuel Vaiter (JAD)
- Abstract summary: We study nonsmooth piggyback automatic differentiation (AD) under appropriate nonexpansivity conditions.
We show that the attractor set of nonsmooth piggyback iterations is a set-valued fixed point which remains in the conservative framework.
- Our results are illustrated on parametric convex optimization problems with the forward-backward, Douglas-Rachford and Alternating Direction Method of Multipliers (ADMM) algorithms, as well as the Heavy-Ball method.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiation along algorithms, i.e., piggyback propagation of derivatives,
is now routinely used to differentiate iterative solvers in differentiable
programming. Asymptotics is well understood for many smooth problems but the
nondifferentiable case is hardly considered. Is there a limiting object for
nonsmooth piggyback automatic differentiation (AD)? Does it have any
variational meaning and can it be used effectively in machine learning? Is
there a connection with classical derivatives? All these questions are addressed
under appropriate nonexpansivity conditions in the framework of conservative
derivatives, which has proved useful in understanding nonsmooth AD. For
nonsmooth piggyback iterations, we characterize the attractor set as a
set-valued fixed point which remains within the conservative framework. This has
several consequences, in particular the almost everywhere convergence of
classical derivatives. Our results are illustrated on
parametric convex optimization problems with the forward-backward,
Douglas-Rachford and Alternating Direction Method of Multipliers (ADMM)
algorithms, as well as the Heavy-Ball method.
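As a rough illustration of what piggyback AD computes, here is a minimal sketch in JAX for a parametric lasso problem solved by the forward-backward iteration: the derivative of the iterate with respect to the regularization parameter is propagated jointly with the iterate itself. The problem instance (A, b, theta), the step-size choice and the use of jax.jvp are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch (not from the paper) of nonsmooth piggyback AD:
# forward-backward iterations for the parametric lasso problem
#     min_x 0.5 * ||A x - b||^2 + theta * ||x||_1,
# with the derivative of the iterate w.r.t. theta propagated alongside it.
import jax
import jax.numpy as jnp

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 (nonsmooth but piecewise linear).
    return jnp.sign(z) * jnp.maximum(jnp.abs(z) - tau, 0.0)

def fb_step(x, theta, A, b, step):
    # One forward-backward iteration: gradient step on the quadratic part,
    # proximal step on theta * ||.||_1.
    return soft_threshold(x - step * (A.T @ (A @ x - b)), step * theta)

def piggyback(theta, A, b, step, n_iter=500):
    # Propagate the iterate x_k and its derivative dx_k/dtheta jointly;
    # jax.jvp applies (an element of) the generalized Jacobian of the
    # iteration map to the current derivative at each step.
    x = jnp.zeros(A.shape[1])
    dx = jnp.zeros_like(x)  # derivative of x_k with respect to theta
    one = jnp.array(1.0, dtype=theta.dtype)
    for _ in range(n_iter):
        x, dx = jax.jvp(lambda x_, t_: fb_step(x_, t_, A, b, step),
                        (x, theta), (dx, one))
    return x, dx

A = jax.random.normal(jax.random.PRNGKey(0), (20, 50))
b = jax.random.normal(jax.random.PRNGKey(1), (20,))
step = 1.0 / jnp.linalg.norm(A, ord=2) ** 2  # step <= 1/L keeps the iteration nonexpansive
theta = jnp.array(0.1, dtype=A.dtype)
x_star, dx_dtheta = piggyback(theta, A, b, step)
```

This joint recursion over (x_k, dx_k) is the kind of iteration the abstract refers to: under its nonexpansivity and conservativity assumptions, such pairs are attracted to a set-valued fixed point, and the propagated derivatives converge to classical derivatives for almost every parameter value.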
Related papers
- Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates [34.81849268839475]
We study the problem of efficiently computing the derivative of the fixed-point of a parametric nondifferentiable contraction map.
We analyze two popular approaches: iterative differentiation (ITD) and approximate implicit differentiation (AID); a short sketch contrasting the two is given after this list.
We establish rates for the convergence of NSID (nonsmooth implicit differentiation), encompassing the best available rates in the smooth setting.
arXiv Detail & Related papers (2024-03-18T11:37:53Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Alternating Differentiation for Optimization Layers [133.2668019610731]
We develop a new framework, named Alternating Differentiation (Alt-Diff), that differentiates optimization problems.
We show that Alt-Diff substantially decreases the dimensions of the Jacobian matrix especially for optimization with large-scale constraints.
We also propose to truncate Alt-Diff to further accelerate the computational speed.
arXiv Detail & Related papers (2022-10-03T11:32:13Z)
- Fixed-Point Automatic Differentiation of Forward--Backward Splitting Algorithms for Partly Smooth Functions [4.389150156866014]
Implicit Differentiation (ID) and Automatic Differentiation (AD) are applied to the fixed-point iterations of proximal splitting algorithms.
We show that AD of the sequence generated by these algorithms converges to the derivative of the solution mapping.
For a variant of automatic differentiation, which we call Fixed-Point Automatic Differentiation (FPAD), we remedy the memory overhead problem of reverse-mode AD.
arXiv Detail & Related papers (2022-08-05T11:27:55Z)
- Gradient flows and randomised thresholding: sparse inversion and
classification [0.0]
Sparse inversion and classification problems are ubiquitous in modern data science and imaging.
In classification, we consider, e.g., the sum of a data fidelity term and a non-smooth Ginzburg--Landau energy.
Standard (sub)gradient descent methods have been shown to be inefficient when approaching such problems.
arXiv Detail & Related papers (2022-03-22T09:21:14Z)
- Nonsmooth Implicit Differentiation for Machine Learning and Optimization [0.0]
In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus.
Our result applies to most practical problems (i.e., definable problems) provided that a nonsmooth form of the classical invertibility condition is fulfilled.
This approach allows for formal subdifferentiation: for instance, replacing derivatives by Clarke Jacobians in the usual differentiation formulas is fully justified.
arXiv Detail & Related papers (2021-06-08T13:59:47Z)
- Efficient and Modular Implicit Differentiation [68.74748174316989]
We propose a unified, efficient and modular approach for implicit differentiation of optimization problems.
We show that seemingly simple principles allow one to recover many recently proposed implicit differentiation methods and to easily create new ones.
arXiv Detail & Related papers (2021-05-31T17:45:58Z)
- Implicit differentiation for fast hyperparameter selection in non-smooth
convex learning [87.60600646105696]
We study first-order methods when the inner optimization problem is convex but non-smooth.
We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yield sequences of Jacobians converging toward the exact Jacobian.
arXiv Detail & Related papers (2021-05-04T17:31:28Z)
- On Correctness of Automatic Differentiation for Non-Differentiable
Functions [14.222887950206658]
We study whether autodiff systems are correct in any formal sense when applied to non-differentiable functions.
We propose a new type of derivatives, called intensional derivatives, and prove that these derivatives always exist and coincide with standard derivatives for almost all inputs.
In this way, we formally establish the correctness of autodiff systems applied to non-differentiable functions.
arXiv Detail & Related papers (2020-06-12T01:57:13Z)
- Implicit differentiation of Lasso-type models for hyperparameter
optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions.
arXiv Detail & Related papers (2020-02-20T18:43:42Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
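Several of the entries above (ITD vs. AID, Alt-Diff, modular implicit differentiation, hyperparameter selection via implicit differentiation) revolve around the same dichotomy: either differentiate through the unrolled iteration, or differentiate the fixed-point equation x* = T(x*, theta) directly. The sketch below contrasts the two options on the same illustrative lasso setup used earlier; the function names, the dense Jacobians and the direct linear solve are simplifying assumptions for readability, not the scheme of any single listed paper.

```python
# A minimal sketch (same assumed lasso instance as above) of two routes
# to dx*/dtheta: ITD differentiates the unrolled iteration, while AID
# differentiates the fixed-point equation x* = T(x*, theta) and solves
# a linear system at the (approximate) fixed point.
import jax
import jax.numpy as jnp

def soft_threshold(z, tau):
    return jnp.sign(z) * jnp.maximum(jnp.abs(z) - tau, 0.0)

def fb_step(x, theta, A, b, step):
    # One forward-backward iteration for 0.5*||Ax - b||^2 + theta*||x||_1.
    return soft_threshold(x - step * (A.T @ (A @ x - b)), step * theta)

def itd_derivative(theta, A, b, step, n_iter=500):
    # ITD: forward-mode AD through the whole unrolled loop.
    def solve(t):
        x = jnp.zeros(A.shape[1])
        for _ in range(n_iter):
            x = fb_step(x, t, A, b, step)
        return x
    return jax.jacfwd(solve)(theta)

def aid_derivative(x_star, theta, A, b, step):
    # AID: implicit differentiation at the fixed point, solving
    # (I - d_x T) dx*/dtheta = d_theta T.
    T = lambda x_, t_: fb_step(x_, t_, A, b, step)
    dT_dx = jax.jacobian(T, argnums=0)(x_star, theta)
    dT_dtheta = jax.jacobian(T, argnums=1)(x_star, theta)
    return jnp.linalg.solve(jnp.eye(x_star.shape[0]) - dT_dx, dT_dtheta)
```

Here aid_derivative expects an (approximately) converged iterate x_star, for instance the output of the piggyback sketch above. When the iteration map is a contraction the two estimates typically agree in the limit; the implicit route avoids storing or re-tracing the whole trajectory but requires (I - d_x T) to be well conditioned at the solution, which is where the nonsmooth analysis of the listed papers comes in.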