Input Convex Gradient Networks
- URL: http://arxiv.org/abs/2111.12187v1
- Date: Tue, 23 Nov 2021 22:51:25 GMT
- Title: Input Convex Gradient Networks
- Authors: Jack Richter-Powell, Jonathan Lorraine, Brandon Amos
- Abstract summary: We study how to model convex gradients by integrating a Jacobian-vector product parameterized by a neural network.
We empirically demonstrate that a single layer ICGN can fit a toy example better than a single layer ICNN.
- Score: 7.747759814657507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The gradients of convex functions are expressive models of non-trivial vector
fields. For example, Brenier's theorem yields that the optimal transport map
between any two measures on Euclidean space under the squared distance is
realized as a convex gradient, which is a key insight used in recent generative
flow models. In this paper, we study how to model convex gradients by
integrating a Jacobian-vector product parameterized by a neural network, which
we call the Input Convex Gradient Network (ICGN). We theoretically study ICGNs
and compare them to taking the gradient of an Input-Convex Neural Network
(ICNN), empirically demonstrating that a single layer ICGN can fit a toy
example better than a single layer ICNN. Lastly, we explore extensions to
deeper networks and connections to constructions from Riemannian geometry.
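To make the comparison above concrete, here is a minimal, self-contained sketch (not the authors' code) of the two objects being contrasted: the gradient of a single-layer input-convex network obtained by automatic differentiation, and a vector field of the form F(x) = Wᵀσ(Wx + b), whose Jacobian Wᵀ diag(σ'(Wx + b)) W is symmetric positive semi-definite whenever σ is nondecreasing, so F is itself the gradient of a convex function and exposes the Jacobian-vector products the abstract refers to. The layer sizes, softplus activation, and random parameters are illustrative assumptions, and the single-layer field below is only one example of such a construction, not necessarily the paper's exact ICGN parameterization.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
d, h = 2, 16                      # input and hidden widths (illustrative)
W = jax.random.normal(key, (h, d)) * 0.3
b = jnp.zeros(h)

def icnn(x):
    """Single-layer input-convex scalar function: a sum of convex,
    nondecreasing functions of affine maps of x, hence convex in x."""
    return jnp.sum(jax.nn.softplus(W @ x + b))

def convex_gradient_field(x):
    """Vector field F(x) = W^T softplus(Wx + b). Its Jacobian is
    W^T diag(sigmoid(Wx + b)) W, which is symmetric PSD, so F is the
    gradient of some convex potential even though that potential is
    never written down explicitly."""
    return W.T @ jax.nn.softplus(W @ x + b)

x = jnp.array([0.5, -1.0])
v = jnp.array([1.0, 0.0])

g_icnn = jax.grad(icnn)(x)                              # gradient of the ICNN
F_x, JF_v = jax.jvp(convex_gradient_field, (x,), (v,))  # field value and its JVP

print(g_icnn)
print(F_x, JF_v)
```

Evaluating jax.jvp on the field gives the Jacobian-vector product directly, without ever materializing the underlying convex potential.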
Related papers
- Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport [26.47265060394168]
We show that the gradient flow for infinitely deep and wide ResNets converges when the initialization lies within a distance r of a minimizer.
This is done by relying on the theory of gradient flows in metric spaces.
arXiv Detail & Related papers (2024-03-19T16:34:31Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are used in a wide range of applications.
In this paper, we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex objective can be characterized as the global optimum of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- Identification of vortex in unstructured mesh with graph neural networks [0.0]
We present a Graph Neural Network (GNN) based model with U-Net architecture to identify the vortex in CFD results on unstructured meshes.
A vortex auto-labeling method is proposed to label vortex regions in 2D CFD meshes.
arXiv Detail & Related papers (2023-11-11T12:10:16Z)
- Approximation Results for Gradient Descent trained Neural Networks [0.0]
The networks are fully connected, of constant depth and increasing width.
The continuous kernel error norm implies an approximation under the natural smoothness assumption required for smooth functions.
arXiv Detail & Related papers (2023-09-09T18:47:55Z)
- On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks [3.7072693116122752]
Invertible neural networks (INNs) represent an important class of deep neural network architectures.
We provide an analysis of the capacity of a class of coupling-based INNs to approximate bi-Lipschitz continuous mappings on a compact domain.
We develop an approach for approximating bi-Lipschitz maps on infinite-dimensional spaces that simultaneously approximate the forward and inverse maps.
arXiv Detail & Related papers (2023-08-18T08:01:45Z)
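For reference, the coupling-based INNs analyzed in the entry above compose layers like the generic additive coupling block sketched below (in the NICE/RealNVP style): it is exactly invertible, and because the shift network is Lipschitz, the forward and inverse maps are both Lipschitz. This is a generic illustration rather than the paper's construction; the split sizes and the small tanh network are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, hidden = 2, 2, 8                 # split sizes and hidden width (illustrative)
W1 = rng.normal(size=(hidden, d1)) * 0.5
W2 = rng.normal(size=(d2, hidden)) * 0.5

def shift(x1):
    """Small Lipschitz network producing the additive shift t(x1);
    it does not need to be invertible itself."""
    return W2 @ np.tanh(W1 @ x1)

def coupling_forward(x):
    x1, x2 = x[:d1], x[d1:]
    return np.concatenate([x1, x2 + shift(x1)])   # (x1, x2) -> (x1, x2 + t(x1))

def coupling_inverse(y):
    y1, y2 = y[:d1], y[d1:]
    return np.concatenate([y1, y2 - shift(y1)])   # exact inverse, no solve needed

x = rng.normal(size=d1 + d2)
y = coupling_forward(x)
print(np.allclose(coupling_inverse(y), x))        # True up to float error
```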
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
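One way to get a feel for the linear-rate claim in the previous entry is to discretize gradient flow with small Euler steps on a tiny regression problem whose input dimension exceeds the number of training points. The sketch below is only an illustrative experiment under assumed widths, step size, and synthetic data; it is not the paper's setting or code.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
n, d, hidden = 8, 16, 32            # n training points with input dimension d >= n
kx, ky, k1, k2 = jax.random.split(key, 4)
X = jax.random.normal(kx, (n, d))
y = jax.random.normal(ky, (n,))
params = (jax.random.normal(k1, (hidden, d)) / jnp.sqrt(d),
          jax.random.normal(k2, (hidden,)) / jnp.sqrt(hidden))

def loss(params):
    W, a = params
    preds = jnp.tanh(X @ W.T) @ a   # small two-layer network
    return 0.5 * jnp.mean((preds - y) ** 2)

grad_fn = jax.jit(jax.grad(loss))
dt = 1e-2                           # small Euler step approximating gradient flow
p = params
for step in range(20001):
    p = tuple(w - dt * g for w, g in zip(p, grad_fn(p)))
    if step % 5000 == 0:
        print(step, float(loss(p)))  # the loss should shrink roughly geometrically
```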
- A Differential Geometry Perspective on Orthogonal Recurrent Models [56.09491978954866]
We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
arXiv Detail & Related papers (2021-02-18T19:39:22Z)
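A common way to realize the orthogonal recurrent models discussed in the previous entry is to parameterize the recurrent matrix as the matrix exponential of a skew-symmetric matrix, which is always orthogonal. The sketch below uses that generic construction; the sizes, activation, and input sequence are illustrative assumptions, not the paper's model.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
hdim, xdim = 4, 3                        # hidden and input sizes (illustrative)
A = rng.normal(size=(hdim, hdim))
S = A - A.T                              # skew-symmetric parameter matrix
U = expm(S)                              # expm of a skew-symmetric matrix is orthogonal
V = rng.normal(size=(hdim, xdim)) * 0.1

def rnn_step(h, x):
    """One recurrent step: the orthogonal map U preserves norms before the nonlinearity."""
    return np.tanh(U @ h + V @ x)

print(np.allclose(U.T @ U, np.eye(hdim)))   # orthogonality check

h = np.zeros(hdim)
for x in rng.normal(size=(5, xdim)):        # unroll over a short input sequence
    h = rnn_step(h, x)
print(h)
```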
- Convex Optimization with an Interpolation-based Projection and its Application to Deep Learning [36.19092177858517]
We investigate whether an inexact, but cheaper projection can drive a descent algorithm to an optimum.
Specifically, we propose an interpolation-based projection that is computationally cheap and easy to compute given a convex, domain-defining function.
arXiv Detail & Related papers (2020-11-13T16:52:50Z)
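To illustrate the idea of a cheap, inexact projection onto a set defined by a convex function g (a sketch of the general idea, not necessarily the paper's exact algorithm): an infeasible point can be pulled back to the boundary of {x : g(x) <= 0} by bisecting along the segment joining it to a known strictly feasible interior point. The unit-ball constraint and the tolerance below are illustrative assumptions.

```python
import numpy as np

def g(x):
    """Convex domain-defining function: the feasible set {g <= 0} is the unit ball."""
    return np.dot(x, x) - 1.0

def interpolation_projection(x, x_interior, tol=1e-8):
    """Approximately project x onto {g <= 0} by bisecting on the segment from a
    strictly feasible interior point to x; cheaper than an exact Euclidean projection."""
    if g(x) <= 0:
        return x                                   # already feasible
    lo, hi = 0.0, 1.0                              # interpolation weight toward x
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        point = (1 - mid) * x_interior + mid * x
        if g(point) <= 0:
            lo = mid                               # still feasible: move toward x
        else:
            hi = mid                               # infeasible: move back toward the interior
    return (1 - lo) * x_interior + lo * x

x_out = np.array([2.0, 1.0])                       # infeasible point
x_in = np.zeros(2)                                 # strictly feasible interior point
proj = interpolation_projection(x_out, x_in)
print(proj, g(proj))                               # lands (approximately) on the boundary
```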
- Binarized Graph Neural Network [65.20589262811677]
We develop a binarized graph neural network to learn the binary representations of the nodes with binary network parameters.
Our proposed method can be seamlessly integrated into the existing GNN-based embedding approaches.
Experiments indicate that the proposed binarized graph neural network, namely BGN, is orders of magnitude more efficient in terms of both time and space.
arXiv Detail & Related papers (2020-04-19T09:43:14Z)
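For intuition about what binarization buys in a GNN, the sketch below runs one message-passing layer in which node features and weights are constrained to +/-1, so aggregation reduces to integer arithmetic. It is a generic illustration rather than the BGN architecture from the entry above; the toy graph and feature sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 4, 8, 6                 # nodes and feature sizes (illustrative)

# Toy undirected graph as an adjacency matrix with self-loops.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=np.int64)

def binarize(Z):
    """Map to {+1, -1}, sending ties to +1."""
    return np.where(Z >= 0, 1, -1)

H = binarize(rng.normal(size=(n, d_in)))      # binarized node features
W = binarize(rng.normal(size=(d_in, d_out)))  # binarized layer weights

def binarized_gnn_layer(A, H, W):
    """Aggregate neighbor features and re-binarize: sign(A @ H @ W).
    With +/-1 features and weights the products are pure integer operations,
    which is where the time and space savings come from."""
    return binarize(A @ H @ W)

print(binarized_gnn_layer(A, H, W))
```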
- Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training [99.42912552638168]
Communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications.
In this work, we deduce the optimal condition of both the binary and multi-level gradient quantization for any gradient distribution.
Based on the optimal condition, we develop two novel quantization schemes: biased BinGrad and unbiased ORQ for binary and multi-level gradient quantization respectively.
arXiv Detail & Related papers (2020-02-25T18:28:39Z)
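As background for the quantization schemes in the previous entry, here is a generic unbiased multi-level stochastic gradient quantizer (in the spirit of QSGD), not the proposed BinGrad or ORQ schemes; the number of levels and the test gradient are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(grad, levels=4):
    """Unbiased multi-level stochastic quantization of a gradient vector.

    Each coordinate is scaled into [0, 1] by the gradient norm and rounded
    stochastically to one of `levels` uniform levels so that the expectation
    equals the original coordinate; only the norm, signs, and small integers
    would need to be communicated."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    scaled = np.abs(grad) / norm * levels
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part (keeps it unbiased).
    rounded = lower + (rng.random(grad.shape) < (scaled - lower))
    return np.sign(grad) * norm * rounded / levels

g = rng.normal(size=6)
samples = np.stack([quantize(g) for _ in range(20000)])
print(g)
print(samples.mean(axis=0))   # close to g, illustrating unbiasedness
```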
This list is automatically generated from the titles and abstracts of the papers in this site.