Fixed Point Networks: Implicit Depth Models with Jacobian-Free Backprop
- URL: http://arxiv.org/abs/2103.12803v1
- Date: Tue, 23 Mar 2021 19:20:33 GMT
- Title: Fixed Point Networks: Implicit Depth Models with Jacobian-Free Backprop
- Authors: Samy Wu Fung, Howard Heaton, Qiuwei Li, Daniel McKenzie, Stanley
Osher, Wotao Yin
- Abstract summary: A growing trend in deep learning replaces fixed depth models by approximations of the limit as network depth approaches infinity.
In particular, backpropagation through implicit depth models requires solving a Jacobian-based equation arising from the implicit function theorem.
We propose fixed point networks (FPNs) that guarantee convergence of forward propagation to a unique limit defined by network weights and input data.
- Score: 21.00060644438722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A growing trend in deep learning replaces fixed depth models by
approximations of the limit as network depth approaches infinity. This approach
uses a portion of network weights to prescribe behavior by defining a limit
condition. This makes network depth implicit, varying based on the provided
data and an error tolerance. Moreover, existing implicit models can be
implemented and trained with fixed memory costs in exchange for additional
computational costs. In particular, backpropagation through implicit depth
models requires solving a Jacobian-based equation arising from the implicit
function theorem. We propose fixed point networks (FPNs), a simple setup for
implicit depth learning that guarantees convergence of forward propagation to a
unique limit defined by network weights and input data. Our key contribution is
to provide a new Jacobian-free backpropagation (JFB) scheme that circumvents
the need to solve Jacobian-based equations while maintaining fixed memory
costs. This makes FPNs much cheaper to train and easy to implement. Our
numerical examples yield state-of-the-art classification results for implicit
depth models and outperform corresponding explicit models.
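The abstract describes two pieces that are easy to illustrate: a forward pass that iterates a layer $T$ to a fixed point $z^* = T(z^*, x)$ without storing intermediate activations, and Jacobian-free backpropagation (JFB), which avoids solving the implicit-function-theorem system $\left(I - \frac{\partial T}{\partial z}\right)^{\top} v = \left(\frac{\partial \ell}{\partial z^*}\right)^{\top}$ by backpropagating through only one final application of $T$ at the detached fixed point. Below is a minimal PyTorch sketch of that idea; the class name, tolerance, iteration cap, and toy update map are illustrative choices, not the authors' released code.

```python
import torch
import torch.nn as nn

class JFBFixedPointLayer(nn.Module):
    """Sketch of an implicit-depth layer trained with Jacobian-free backprop:
    forward propagation iterates z <- T(z, x) to a fixed point, and gradients
    flow through a single extra application of T at that fixed point."""

    def __init__(self, T, tol=1e-4, max_iter=100):
        super().__init__()
        self.T = T                # update map z -> T(z, x); assumed contractive in z
        self.tol = tol            # stopping tolerance for the fixed-point iteration
        self.max_iter = max_iter  # cap on forward iterations

    def forward(self, x, z0):
        # Forward pass: iterate to an approximate fixed point with no graph,
        # so memory cost stays fixed regardless of how many iterations run.
        z = z0
        with torch.no_grad():
            for _ in range(self.max_iter):
                z_next = self.T(z, x)
                if torch.norm(z_next - z) < self.tol:
                    z = z_next
                    break
                z = z_next
        # JFB backward pass: attach gradients through one application of T at
        # the detached fixed point, skipping the Jacobian-based linear solve.
        return self.T(z.detach(), x)

# Toy usage. The 0.5 * tanh(...) update is only meant to keep the map well
# behaved here; FPNs enforce contractivity of T by construction.
dim = 8
lin_z, lin_x = nn.Linear(dim, dim), nn.Linear(dim, dim)
T = lambda z, x: 0.5 * torch.tanh(lin_z(z) + lin_x(x))
layer = JFBFixedPointLayer(T)
x = torch.randn(4, dim)
z_star = layer(x, torch.zeros(4, dim))
z_star.pow(2).mean().backward()  # gradients reach lin_z and lin_x via one call to T
```

The trade-off named in the abstract shows up directly: the forward loop costs extra compute, but because it runs under `torch.no_grad()` the memory footprint does not grow with the number of iterations, and the backward pass is a single ordinary backprop step.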
Related papers
- Adaptive Multilevel Neural Networks for Parametric PDEs with Error Estimation [0.0]
A neural network architecture is presented to solve high-dimensional parameter-dependent partial differential equations (pPDEs).
It is constructed to map parameters of the model data to corresponding finite element solutions.
It outputs a coarse grid solution and a series of corrections as produced in an adaptive finite element method (AFEM).
arXiv Detail & Related papers (2024-03-19T11:34:40Z)
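As a rough illustration of the coarse-solution-plus-corrections structure described in the entry above, the sketch below maps a parameter vector to one output per grid level; every name, layer size, and the number of levels is a placeholder rather than anything taken from the paper.

```python
import torch
import torch.nn as nn

class ParamToMultilevelSolution(nn.Module):
    """Hypothetical sketch: map pPDE parameters to a coarse-grid solution plus
    additive corrections on finer grids, mimicking an AFEM-style hierarchy."""

    def __init__(self, n_params, dofs_per_level):
        super().__init__()
        # First head predicts the coarse solution; the rest predict corrections.
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(n_params, 64), nn.ReLU(), nn.Linear(64, d))
            for d in dofs_per_level
        )

    def forward(self, params):
        # Per-level outputs; combining them into one fine-grid function would
        # use the finite element prolongation operators, omitted here.
        return [head(params) for head in self.heads]

# Example: 3 PDE parameters, a 25-DoF coarse grid, and two refinement levels.
model = ParamToMultilevelSolution(3, [25, 81, 289])
levels = model(torch.randn(16, 3))
print([tuple(u.shape) for u in levels])  # [(16, 25), (16, 81), (16, 289)]
```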
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
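The ISS-P summary above describes shrinking unimportant weights by a small amount proportional to their magnitude instead of zeroing them outright. A loose sketch of one such shrinkage step is below; the percentile threshold and shrink factor are illustrative guesses, not the paper's schedule.

```python
import torch

def soft_shrink_step(weight: torch.Tensor, sparsity: float, shrink: float = 0.1) -> torch.Tensor:
    """Shrink the smallest-magnitude entries toward zero by a fraction of their
    own magnitude, rather than hard-pruning them."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    unimportant = weight.abs() <= threshold
    # Magnitude-proportional soft shrinkage of the "unimportant" entries.
    return torch.where(unimportant, weight * (1.0 - shrink), weight)

# Example: gradually shrink the smallest 50% of entries over several iterations.
w = torch.randn(64, 64)
for _ in range(10):
    w = soft_shrink_step(w, sparsity=0.5)
```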
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Gradient Descent Optimizes Infinite-Depth ReLU Implicit Networks with Linear Widths [25.237054775800164]
This paper studies the convergence of gradient flow and gradient descent for nonlinear ReLU-activated implicit networks.
We prove that both GF and GD converge to a global minimum at a linear rate if the width $m$ of the implicit network is linear in the sample size.
arXiv Detail & Related papers (2022-05-16T06:07:56Z)
- Deep Capsule Encoder-Decoder Network for Surrogate Modeling and Uncertainty Quantification [0.0]
The proposed framework is developed by adapting the Capsule Network (CapsNet) architecture into an image-to-image regression encoder-decoder network.
The obtained results from performance evaluation indicate that the proposed approach is accurate, efficient, and robust.
arXiv Detail & Related papers (2022-01-19T17:45:01Z)
- Improving Robustness and Uncertainty Modelling in Neural Ordinary Differential Equations [0.2538209532048866]
We propose a novel approach to model uncertainty in NODE by considering a distribution over the end-time $T$ of the ODE solver.
We also propose adaptive latent time NODE (ALT-NODE), which allows each data point to have a distinct posterior distribution over end-times.
We demonstrate the effectiveness of the proposed approaches in modelling uncertainty and robustness through experiments on synthetic and several real-world image classification datasets.
arXiv Detail & Related papers (2021-12-23T16:56:10Z)
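The NODE entry above hinges on treating the solver end-time $T$ as random. A toy sketch of that idea with a reparameterized log-normal end-time and a fixed-step Euler integrator follows; the dynamics network, step count, and distribution are stand-ins, not the ALT-NODE implementation.

```python
import torch
import torch.nn as nn

class RandomEndTimeODE(nn.Module):
    """Toy NODE-style block whose integration end-time T is sampled from a
    learned log-normal distribution, giving a distribution over outputs."""

    def __init__(self, dim, steps=20):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.log_T_mean = nn.Parameter(torch.zeros(1))
        self.log_T_logstd = nn.Parameter(torch.zeros(1))
        self.steps = steps

    def forward(self, z):
        # Reparameterized sample of the positive end-time T.
        eps = torch.randn_like(self.log_T_mean)
        T = torch.exp(self.log_T_mean + eps * torch.exp(self.log_T_logstd))
        dt = T / self.steps
        # Explicit Euler integration of dz/dt = f(z) from t = 0 to t = T.
        for _ in range(self.steps):
            z = z + dt * self.f(z)
        return z

# Repeated forward passes integrate to different sampled end-times, so the
# spread of the outputs reflects the model's uncertainty.
block = RandomEndTimeODE(dim=8)
outputs = torch.stack([block(torch.ones(1, 8)) for _ in range(5)])
```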
- Robustness Certificates for Implicit Neural Networks: A Mixed Monotone Contractive Approach [60.67748036747221]
Implicit neural networks offer competitive performance and reduced memory consumption.
However, they can remain brittle with respect to adversarial input perturbations.
This paper proposes a theoretical and computational framework for robustness verification of implicit neural networks.
arXiv Detail & Related papers (2021-12-10T03:08:55Z)
- Stabilizing Equilibrium Models by Jacobian Regularization [151.78151873928027]
Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.
We propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models.
We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains.
arXiv Detail & Related papers (2021-06-28T00:14:11Z)
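The Jacobian regularization entry above penalizes the Jacobian of the fixed-point update $T$ to stabilize training. A common cheap way to do this is a Hutchinson-style estimate of $\|\partial T/\partial z\|_F^2$ from a single vector-Jacobian product; the helper below assumes that estimator and is not taken from the paper's code.

```python
import torch

def jacobian_frobenius_penalty(T_out: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """One-sample Hutchinson estimate of ||dT/dz||_F^2 (averaged over the batch),
    where T_out = T(z, x) was computed with z requiring gradients."""
    v = torch.randn_like(T_out)
    (vjp,) = torch.autograd.grad(T_out, z, grad_outputs=v, create_graph=True)
    # For v with identity covariance, E[ ||v^T J||^2 ] = ||J||_F^2.
    return vjp.pow(2).sum() / T_out.shape[0]

# Typical use inside a training step (T, x, task_loss are placeholders):
#   z = z_star.detach().requires_grad_(True)
#   out = T(z, x)
#   loss = task_loss(out) + 0.1 * jacobian_frobenius_penalty(out, z)
```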
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
However, they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)