Eigendecomposition-Free Training of Deep Networks for Linear
Least-Square Problems
- URL: http://arxiv.org/abs/2004.07931v1
- Date: Wed, 15 Apr 2020 04:29:34 GMT
- Title: Eigendecomposition-Free Training of Deep Networks for Linear
Least-Square Problems
- Authors: Zheng Dang, Kwang Moo Yi, Yinlin Hu, Fei Wang, Pascal Fua and Mathieu
Salzmann
- Abstract summary: We introduce an eigendecomposition-free approach to training a deep network.
We show that our approach is much more robust than explicit differentiation of the eigendecomposition.
Our method has better convergence properties and yields state-of-the-art results.
- Score: 107.3868459697569
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many classical Computer Vision problems, such as essential matrix computation
and pose estimation from 3D to 2D correspondences, can be tackled by solving a
linear least-square problem, which can be done by finding the eigenvector
corresponding to the smallest, or zero, eigenvalue of a matrix representing a
linear system. Incorporating this in deep learning frameworks would allow us to
explicitly encode known notions of geometry, instead of having the network
implicitly learn them from data. However, performing eigendecomposition within
a network requires the ability to differentiate this operation. While
theoretically doable, this introduces numerical instability in the optimization
process in practice. In this paper, we introduce an eigendecomposition-free
approach to training a deep network whose loss depends on the eigenvector
corresponding to a zero eigenvalue of a matrix predicted by the network. We
demonstrate that our approach is much more robust than explicit differentiation
of the eigendecomposition using two general tasks, outlier rejection and
denoising, with several practical examples including wide-baseline stereo, the
perspective-n-point problem, and ellipse fitting. Empirically, our method has
better convergence properties and yields state-of-the-art results.
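For concreteness, the following is a minimal NumPy sketch (not the authors' implementation) of the two ideas in the abstract: the classical route, which recovers the solution of a homogeneous least-square problem Ax = 0 as the eigenvector of M = A^T A with the smallest eigenvalue, and the eigendecomposition-free alternative, in which the loss is the scalar e*^T M e* for a known target null-space vector e*. That scalar is zero exactly when e* is an eigenvector of M with zero eigenvalue, and its gradient with respect to M is simply the outer product e* e*^T, so no eigendecomposition is needed. In the paper, M is predicted by a network and the full loss also includes terms that rule out degenerate solutions, which this sketch omits.

```python
# Minimal sketch, not the authors' code: a homogeneous least-square problem
# A x = 0, solved classically via eigendecomposition and, alternatively,
# scored with the eigendecomposition-free loss e*^T M e* from the abstract.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: build a system whose null-space vector x_true is known.
x_true = rng.normal(size=5)
x_true /= np.linalg.norm(x_true)
B = rng.normal(size=(100, 5))
A = B - B @ np.outer(x_true, x_true)   # every row of A is orthogonal to x_true

# Classical route: eigenvector of M = A^T A with the smallest eigenvalue
# (equivalently, the last right singular vector of A).
M = A.T @ A
eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
x_eig = eigvecs[:, 0]

# Eigendecomposition-free loss: zero iff e_star lies in the null space of M;
# its gradient with respect to M is the outer product e_star e_star^T.
def eigfree_loss(M, e_star):
    return float(e_star @ M @ e_star)

print("alignment with x_true:", abs(x_eig @ x_true))        # ~1.0
print("eig-free loss at x_true:", eigfree_loss(M, x_true))  # ~0.0
```

During training, M would instead be produced by the network from (possibly noisy) correspondences, and minimizing the loss drives the predicted matrix toward one whose zero-eigenvalue eigenvector is the desired geometric solution, with no eigendecomposition appearing in the backward pass.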
Related papers
- A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features [54.83898311047626]
We consider neural networks with piecewise linear activations and depth ranging from two to an arbitrary but finite number of layers.
We first show that two-layer networks with piecewise linear activations are Lasso models using a discrete dictionary of ramp depths.
arXiv Detail & Related papers (2024-03-02T00:33:45Z)
- Hessian Eigenvectors and Principal Component Analysis of Neural Network Weight Matrices [0.0]
This study delves into the intricate dynamics of trained deep neural networks and their relationships with network parameters.
We unveil a correlation between Hessian eigenvectors and network weights.
This relationship, hinging on the magnitude of eigenvalues, allows us to discern parameter directions within the network.
arXiv Detail & Related papers (2023-11-01T11:38:31Z)
- Neural Networks Based on Power Method and Inverse Power Method for Solving Linear Eigenvalue Problems [4.3209899858935366]
We propose two kinds of neural networks, inspired by the power method and the inverse power method, to solve linear eigenvalue problems; a minimal sketch of these two classical iterations appears after this list.
The eigenfunction of the eigenvalue problem is learned by the neural network.
We show that accurate eigenvalue and eigenfunction approximations can be obtained by our methods.
arXiv Detail & Related papers (2022-09-22T16:22:11Z)
- Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as a structure prior and reveal the underlying signal interdependencies.
Deep unrolling and deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
- Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z)
- Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs [39.799125462526234]
We develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization.
We numerically validate our theoretical results via experiments involving both synthetic and real datasets.
arXiv Detail & Related papers (2021-10-11T18:00:30Z)
- Relative gradient optimization of the Jacobian term in unsupervised deep learning [9.385902422987677]
Learning expressive probabilistic models correctly describing the data is a ubiquitous problem in machine learning.
Deep density models have been widely used for this task, but their maximum likelihood based training requires estimating the log-determinant of the Jacobian.
We propose a new approach for exact training of such neural networks.
arXiv Detail & Related papers (2020-06-26T16:41:08Z)
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider bipartite graphs and formalize their representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves a linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
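As referenced in the power-method entry above, below is a minimal NumPy sketch of the classical power iteration and inverse (shifted) power iteration for a symmetric matrix; it illustrates only the standard algorithms that inspired that paper, not its neural-network parameterization of the eigenfunction.

```python
# Minimal sketch of classical power iteration and inverse power iteration,
# assuming a symmetric matrix M; these are the standard iterations the
# related paper builds on, not its neural-network formulation.
import numpy as np

def power_iteration(M, iters=200, seed=1):
    """Eigenvector of M with the largest-magnitude eigenvalue."""
    v = np.random.default_rng(seed).normal(size=M.shape[0])
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)
    return v, float(v @ M @ v)          # Rayleigh quotient as eigenvalue estimate

def inverse_power_iteration(M, shift=0.0, iters=200, seed=2):
    """Eigenvector of M whose eigenvalue is closest to `shift`."""
    n = M.shape[0]
    v = np.random.default_rng(seed).normal(size=n)
    for _ in range(iters):
        v = np.linalg.solve(M - shift * np.eye(n), v)   # one linear solve per step
        v /= np.linalg.norm(v)
    return v, float(v @ M @ v)

# Hypothetical symmetric test matrix.
S = np.random.default_rng(0).normal(size=(6, 6))
M = S @ S.T
_, lam_max = power_iteration(M)
_, lam_min = inverse_power_iteration(M, shift=0.0)
print(lam_max, lam_min)   # compare with np.linalg.eigvalsh(M)
```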
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.