On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks
- URL: http://arxiv.org/abs/2308.09367v1
- Date: Fri, 18 Aug 2023 08:01:45 GMT
- Title: On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks
- Authors: Bangti Jin and Zehui Zhou and Jun Zou
- Abstract summary: Invertible neural networks (INNs) represent an important class of deep neural network architectures.
We provide an analysis of the capacity of a class of coupling-based INNs to approximate bi-Lipschitz continuous mappings on a compact domain.
We develop an approach for approximating bi-Lipschitz maps on infinite-dimensional spaces that simultaneously approximates the forward and inverse maps.
- Score: 3.7072693116122752
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Invertible neural networks (INNs) represent an important class of deep neural
network architectures that have been widely used in several applications. The
universal approximation properties of INNs have also been established recently.
However, results on the approximation rate of INNs are still largely
missing. In this work, we analyze the capacity of a class of coupling-based
INNs to approximate bi-Lipschitz continuous mappings on a compact domain, and
show that such INNs can approximate both the forward and inverse maps well
simultaneously. Furthermore, we develop an approach for approximating
bi-Lipschitz maps on infinite-dimensional spaces that approximates the forward
and inverse maps simultaneously, by combining model reduction via principal
component analysis with INNs that approximate the reduced map, and we analyze
the overall approximation error of the approach. Preliminary numerical
results show the feasibility of the approach for approximating the solution
operator for parameterized second-order elliptic problems.
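To make the coupling-based construction concrete, the following is a minimal sketch of an affine (RealNVP-style) coupling INN whose inverse is available in closed form, so the forward and inverse maps share the same parameters. The subnetwork sizes, permutations, and use of untrained random weights are illustrative assumptions, not the specific architecture or analysis of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(widths):
    """Random (untrained) MLP used as the scale/shift subnetwork of a coupling block."""
    Ws = [rng.standard_normal((m, n)) / np.sqrt(m) for m, n in zip(widths[:-1], widths[1:])]
    bs = [np.zeros(n) for n in widths[1:]]
    def f(x):
        for W, b in zip(Ws[:-1], bs[:-1]):
            x = np.tanh(x @ W + b)
        return x @ Ws[-1] + bs[-1]
    return f

class AffineCoupling:
    """One affine coupling block: split x = (x1, x2), keep x1, transform x2.
    The inverse is available in closed form, so the same parameters represent
    both the forward and the inverse map."""
    def __init__(self, dim, hidden=32):
        self.d1 = dim // 2
        d2 = dim - self.d1
        self.s = mlp([self.d1, hidden, d2])  # log-scale subnetwork
        self.t = mlp([self.d1, hidden, d2])  # shift subnetwork

    def forward(self, x):
        x1, x2 = x[..., :self.d1], x[..., self.d1:]
        y2 = x2 * np.exp(self.s(x1)) + self.t(x1)
        return np.concatenate([x1, y2], axis=-1)

    def inverse(self, y):
        y1, y2 = y[..., :self.d1], y[..., self.d1:]
        x2 = (y2 - self.t(y1)) * np.exp(-self.s(y1))
        return np.concatenate([y1, x2], axis=-1)

class INN:
    """Stack of coupling blocks with a fixed random permutation before each block."""
    def __init__(self, dim, n_blocks=4):
        self.blocks = [AffineCoupling(dim) for _ in range(n_blocks)]
        self.perms = [rng.permutation(dim) for _ in range(n_blocks)]

    def forward(self, x):
        for P, b in zip(self.perms, self.blocks):
            x = b.forward(x[..., P])
        return x

    def inverse(self, y):
        for P, b in zip(reversed(self.perms), reversed(self.blocks)):
            y = b.inverse(y)[..., np.argsort(P)]  # undo coupling, then undo permutation
        return y

# Sanity check: the analytic inverse recovers the input up to round-off.
inn = INN(dim=8)
x = rng.standard_normal((5, 8))
assert np.allclose(inn.inverse(inn.forward(x)), x, atol=1e-8)
```

For the infinite-dimensional setting described in the abstract, one would first project inputs and outputs onto their leading principal components (for instance via an SVD of training snapshots) and let such an INN approximate only the reduced finite-dimensional map; the overall error then combines the PCA truncation error with the INN approximation error.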
Related papers
- Information-Theoretic Generalization Bounds for Deep Neural Networks [22.87479366196215]
Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications.
This work aims to capture the effect and benefits of depth for supervised learning via information-theoretic generalization bounds.
arXiv Detail & Related papers (2024-04-04T03:20:35Z) - Learning Traveling Solitary Waves Using Separable Gaussian Neural
Networks [0.9065034043031668]
We apply a machine-learning approach to learn traveling solitary waves across various families of partial differential equations (PDEs).
Our approach integrates a novel interpretable neural network (NN) architecture into the framework of Physics-Informed Neural Networks (PINNs).
arXiv Detail & Related papers (2024-03-07T20:16:18Z) - Deep Architecture Connectivity Matters for Its Convergence: A
Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Input Convex Gradient Networks [7.747759814657507]
We study how to model convex gradients by integrating a Jacobian-vector product parameterized by a neural network.
We empirically demonstrate that a single layer ICGN can fit a toy example better than a single layer ICNN.
arXiv Detail & Related papers (2021-11-23T22:51:25Z) - Proof of the Theory-to-Practice Gap in Deep Learning via Sampling
Complexity bounds for Neural Network Approximation Spaces [5.863264019032882]
We study the computational complexity of (deterministic or randomized) algorithms for approximating or integrating functions.
One of the most important problems in this field concerns the question of whether it is possible to realize theoretically provable neural network approximation rates.
arXiv Detail & Related papers (2021-04-06T18:55:20Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural
Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, within time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z) - Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences arising from its use.