Input Convex Gradient Networks
- URL: http://arxiv.org/abs/2111.12187v1
- Date: Tue, 23 Nov 2021 22:51:25 GMT
- Title: Input Convex Gradient Networks
- Authors: Jack Richter-Powell, Jonathan Lorraine, Brandon Amos
- Abstract summary: We study how to model convex gradients by integrating a Jacobian-vector product parameterized by a neural network.
We empirically demonstrate that a single layer ICGN can fit a toy example better than a single layer ICNN.
- Score: 7.747759814657507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The gradients of convex functions are expressive models of non-trivial vector
fields. For example, Brenier's theorem yields that the optimal transport map
between any two measures on Euclidean space under the squared distance is
realized as a convex gradient, which is a key insight used in recent generative
flow models. In this paper, we study how to model convex gradients by
integrating a Jacobian-vector product parameterized by a neural network, which
we call the Input Convex Gradient Network (ICGN). We theoretically study ICGNs
and compare them to taking the gradient of an Input-Convex Neural Network
(ICNN), empirically demonstrating that a single layer ICGN can fit a toy
example better than a single layer ICNN. Lastly, we explore extensions to
deeper networks and connections to constructions from Riemannian geometry.
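To make the comparison above concrete, here is a minimal, self-contained sketch (not the authors' code) of the two objects being contrasted: the gradient of a single-layer input-convex network obtained by automatic differentiation, and a vector field of the form F(x) = Wᵀσ(Wx + b), whose Jacobian Wᵀ diag(σ'(Wx + b)) W is symmetric positive semi-definite whenever σ is nondecreasing, so F is itself the gradient of a convex function and exposes the Jacobian-vector products the abstract refers to. The layer sizes, softplus activation, and random parameters are illustrative assumptions, and the single-layer field below is only one example of such a construction, not necessarily the paper's exact ICGN parameterization.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
d, h = 2, 16                      # input and hidden widths (illustrative)
W = jax.random.normal(key, (h, d)) * 0.3
b = jnp.zeros(h)

def icnn(x):
    """Single-layer input-convex scalar function: a sum of convex,
    nondecreasing functions of affine maps of x, hence convex in x."""
    return jnp.sum(jax.nn.softplus(W @ x + b))

def convex_gradient_field(x):
    """Vector field F(x) = W^T softplus(Wx + b). Its Jacobian is
    W^T diag(sigmoid(Wx + b)) W, which is symmetric PSD, so F is the
    gradient of some convex potential even though that potential is
    never written down explicitly."""
    return W.T @ jax.nn.softplus(W @ x + b)

x = jnp.array([0.5, -1.0])
v = jnp.array([1.0, 0.0])

g_icnn = jax.grad(icnn)(x)                              # gradient of the ICNN
F_x, JF_v = jax.jvp(convex_gradient_field, (x,), (v,))  # field value and its JVP

print(g_icnn)
print(F_x, JF_v)
```

Evaluating jax.jvp on the field gives the Jacobian-vector product directly, without ever materializing the underlying convex potential.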
Related papers
- Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport [26.47265060394168]
We show that the gradient flow for infinitely deep and wide ResNets converges when the initialization lies within a distance r of a minimizer.
This is done by relying on the theory of gradient flows in metric spaces.
arXiv Detail & Related papers (2024-03-19T16:34:31Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are used in a wide range of applications.
In this paper, we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex objective can be characterized as the global optimum of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- Identification of vortex in unstructured mesh with graph neural networks [0.0]
We present a Graph Neural Network (GNN) based model with U-Net architecture to identify the vortex in CFD results on unstructured meshes.
A vortex auto-labeling method is proposed to label vortex regions in 2D CFD meshes.
arXiv Detail & Related papers (2023-11-11T12:10:16Z)
- Approximation Results for Gradient Descent trained Neural Networks [0.0]
The networks are fully connected, of constant depth and increasing width.
The continuous kernel error norm implies an approximation under the natural smoothness assumption required for smooth functions.
arXiv Detail & Related papers (2023-09-09T18:47:55Z)
- On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks [3.7072693116122752]
Invertible neural networks (INNs) represent an important class of deep neural network architectures.
We provide an analysis of the capacity of a class of coupling-based INNs to approximate bi-Lipschitz continuous mappings on a compact domain.
We develop an approach for approximating bi-Lipschitz maps on infinite-dimensional spaces that simultaneously approximate the forward and inverse maps.
arXiv Detail & Related papers (2023-08-18T08:01:45Z)
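For reference, the coupling-based INNs analyzed in the entry above compose layers like the generic additive coupling block sketched below (in the NICE/RealNVP style): it is exactly invertible, and because the shift network is Lipschitz, the forward and inverse maps are both Lipschitz. This is a generic illustration rather than the paper's construction; the split sizes and the small tanh network are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, hidden = 2, 2, 8                 # split sizes and hidden width (illustrative)
W1 = rng.normal(size=(hidden, d1)) * 0.5
W2 = rng.normal(size=(d2, hidden)) * 0.5

def shift(x1):
    """Small Lipschitz network producing the additive shift t(x1);
    it does not need to be invertible itself."""
    return W2 @ np.tanh(W1 @ x1)

def coupling_forward(x):
    x1, x2 = x[:d1], x[d1:]
    return np.concatenate([x1, x2 + shift(x1)])   # (x1, x2) -> (x1, x2 + t(x1))

def coupling_inverse(y):
    y1, y2 = y[:d1], y[d1:]
    return np.concatenate([y1, y2 - shift(y1)])   # exact inverse, no solve needed

x = rng.normal(size=d1 + d2)
y = coupling_forward(x)
print(np.allclose(coupling_inverse(y), x))        # True up to float error
```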
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
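One way to get a feel for the linear-rate claim in the previous entry is to discretize gradient flow with small Euler steps on a tiny regression problem whose input dimension exceeds the number of training points. The sketch below is only an illustrative experiment under assumed widths, step size, and synthetic data; it is not the paper's setting or code.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
n, d, hidden = 8, 16, 32            # n training points with input dimension d >= n
kx, ky, k1, k2 = jax.random.split(key, 4)
X = jax.random.normal(kx, (n, d))
y = jax.random.normal(ky, (n,))
params = (jax.random.normal(k1, (hidden, d)) / jnp.sqrt(d),
          jax.random.normal(k2, (hidden,)) / jnp.sqrt(hidden))

def loss(params):
    W, a = params
    preds = jnp.tanh(X @ W.T) @ a   # small two-layer network
    return 0.5 * jnp.mean((preds - y) ** 2)

grad_fn = jax.jit(jax.grad(loss))
dt = 1e-2                           # small Euler step approximating gradient flow
p = params
for step in range(20001):
    p = tuple(w - dt * g for w, g in zip(p, grad_fn(p)))
    if step % 5000 == 0:
        print(step, float(loss(p)))  # the loss should shrink roughly geometrically
```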
- A Differential Geometry Perspective on Orthogonal Recurrent Models [56.09491978954866]
We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
arXiv Detail & Related papers (2021-02-18T19:39:22Z)
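A common way to realize the orthogonal recurrent models discussed in the previous entry is to parameterize the recurrent matrix as the matrix exponential of a skew-symmetric matrix, which is always orthogonal. The sketch below uses that generic construction; the sizes, activation, and input sequence are illustrative assumptions, not the paper's model.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
hdim, xdim = 4, 3                        # hidden and input sizes (illustrative)
A = rng.normal(size=(hdim, hdim))
S = A - A.T                              # skew-symmetric parameter matrix
U = expm(S)                              # expm of a skew-symmetric matrix is orthogonal
V = rng.normal(size=(hdim, xdim)) * 0.1

def rnn_step(h, x):
    """One recurrent step: the orthogonal map U preserves norms before the nonlinearity."""
    return np.tanh(U @ h + V @ x)

print(np.allclose(U.T @ U, np.eye(hdim)))   # orthogonality check

h = np.zeros(hdim)
for x in rng.normal(size=(5, xdim)):        # unroll over a short input sequence
    h = rnn_step(h, x)
print(h)
```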
- Convex Optimization with an Interpolation-based Projection and its Application to Deep Learning [36.19092177858517]
We investigate whether an inexact, but cheaper projection can drive a descent algorithm to an optimum.
Specifically, we propose an interpolation-based projection that is computationally cheap and easy to compute given a convex, domain-defining function.
arXiv Detail & Related papers (2020-11-13T16:52:50Z)
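To illustrate the idea of a cheap, inexact projection onto a set defined by a convex function g (a sketch of the general idea, not necessarily the paper's exact algorithm): an infeasible point can be pulled back to the boundary of {x : g(x) <= 0} by bisecting along the segment joining it to a known strictly feasible interior point. The unit-ball constraint and the tolerance below are illustrative assumptions.

```python
import numpy as np

def g(x):
    """Convex domain-defining function: the feasible set {g <= 0} is the unit ball."""
    return np.dot(x, x) - 1.0

def interpolation_projection(x, x_interior, tol=1e-8):
    """Approximately project x onto {g <= 0} by bisecting on the segment from a
    strictly feasible interior point to x; cheaper than an exact Euclidean projection."""
    if g(x) <= 0:
        return x                                   # already feasible
    lo, hi = 0.0, 1.0                              # interpolation weight toward x
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        point = (1 - mid) * x_interior + mid * x
        if g(point) <= 0:
            lo = mid                               # still feasible: move toward x
        else:
            hi = mid                               # infeasible: move back toward the interior
    return (1 - lo) * x_interior + lo * x

x_out = np.array([2.0, 1.0])                       # infeasible point
x_in = np.zeros(2)                                 # strictly feasible interior point
proj = interpolation_projection(x_out, x_in)
print(proj, g(proj))                               # lands (approximately) on the boundary
```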
- Binarized Graph Neural Network [65.20589262811677]
We develop a binarized graph neural network to learn the binary representations of the nodes with binary network parameters.
Our proposed method can be seamlessly integrated into the existing GNN-based embedding approaches.
Experiments indicate that the proposed binarized graph neural network, namely BGN, is orders of magnitude more efficient in terms of both time and space.
arXiv Detail & Related papers (2020-04-19T09:43:14Z)
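For intuition about what binarization buys in a GNN, the sketch below runs one message-passing layer in which node features and weights are constrained to +/-1, so aggregation reduces to integer arithmetic. It is a generic illustration rather than the BGN architecture from the entry above; the toy graph and feature sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 4, 8, 6                 # nodes and feature sizes (illustrative)

# Toy undirected graph as an adjacency matrix with self-loops.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=np.int64)

def binarize(Z):
    """Map to {+1, -1}, sending ties to +1."""
    return np.where(Z >= 0, 1, -1)

H = binarize(rng.normal(size=(n, d_in)))      # binarized node features
W = binarize(rng.normal(size=(d_in, d_out)))  # binarized layer weights

def binarized_gnn_layer(A, H, W):
    """Aggregate neighbor features and re-binarize: sign(A @ H @ W).
    With +/-1 features and weights the products are pure integer operations,
    which is where the time and space savings come from."""
    return binarize(A @ H @ W)

print(binarized_gnn_layer(A, H, W))
```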
- Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training [99.42912552638168]
Communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications.
In this work, we deduce the optimal condition of both the binary and multi-level gradient quantization for any gradient distribution.
Based on the optimal condition, we develop two novel quantization schemes: biased BinGrad and unbiased ORQ for binary and multi-level gradient quantization respectively.
arXiv Detail & Related papers (2020-02-25T18:28:39Z)
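As background for the quantization schemes in the previous entry, here is a generic unbiased multi-level stochastic gradient quantizer (in the spirit of QSGD), not the proposed BinGrad or ORQ schemes; the number of levels and the test gradient are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(grad, levels=4):
    """Unbiased multi-level stochastic quantization of a gradient vector.

    Each coordinate is scaled into [0, 1] by the gradient norm and rounded
    stochastically to one of `levels` uniform levels so that the expectation
    equals the original coordinate; only the norm, signs, and small integers
    would need to be communicated."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    scaled = np.abs(grad) / norm * levels
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part (keeps it unbiased).
    rounded = lower + (rng.random(grad.shape) < (scaled - lower))
    return np.sign(grad) * norm * rounded / levels

g = rng.normal(size=6)
samples = np.stack([quantize(g) for _ in range(20000)])
print(g)
print(samples.mean(axis=0))   # close to g, illustrating unbiasedness
```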
This list is automatically generated from the titles and abstracts of the papers in this site.