Universal Approximation Properties for an ODENet and a ResNet:
Mathematical Analysis and Numerical Experiments
- URL: http://arxiv.org/abs/2101.10229v3
- Date: Thu, 18 May 2023 01:22:34 GMT
- Authors: Yuto Aizawa, Masato Kimura, and Kazunori Matsui
- Abstract summary: We prove a universal approximation property (UAP) for a class of ODENet and a class of ResNet.
We use this to construct a learning algorithm for ODENet.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We prove a universal approximation property (UAP) for a class of ODENet and a
class of ResNet, which are simplified mathematical models for deep learning
systems with skip connections. The UAP can be stated as follows. Let $n$ and
$m$ be the dimension of input and output data, and assume $m\leq n$. Then we
show that ODENet of width $n+m$ with any non-polynomial continuous activation
function can approximate any continuous function on a compact subset of
$\mathbb{R}^n$. We also show that ResNet has the same property as the depth
tends to infinity. Furthermore, we derive the gradient of a loss function
explicitly with respect to a certain tuning variable. We use this to construct
a learning algorithm for ODENet. To demonstrate the usefulness of this
algorithm, we apply it to a regression problem, a binary classification
problem, and a multinomial classification problem on MNIST.
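To make the model concrete: the ODENet above evolves a state $z(t) \in \mathbb{R}^{n+m}$ from $z(0) = (x, 0)$ and reads the output off the last $m$ coordinates, and its forward-Euler discretization is exactly a ResNet with skip connections. Below is a minimal sketch under illustrative assumptions (a time-dependent affine vector field $\sigma(A_k z + b_k)$ with $\tanh$ as the non-polynomial activation, an MSE loss, and an off-the-shelf optimizer rather than the paper's explicit gradient formulas); it is not the authors' implementation.

```python
# Minimal ODENet/ResNet sketch (illustrative assumptions, not the paper's code).
import torch
import torch.nn as nn

class EulerODENet(nn.Module):
    """Forward-Euler discretization of an ODENet of width n + m.

    State: z(0) = (x, 0) in R^{n+m}; output: last m coordinates of z(1).
    Each Euler step z_{k+1} = z_k + h * tanh(A_k z_k + b_k) is one residual
    block, so depth `steps` gives the ResNet counterpart of the ODE model.
    """
    def __init__(self, n: int, m: int, steps: int = 20):
        super().__init__()
        self.m, self.steps = m, steps
        width = n + m  # state dimension prescribed by the UAP statement
        self.fields = nn.ModuleList([nn.Linear(width, width) for _ in range(steps)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = torch.cat([x, x.new_zeros(x.shape[0], self.m)], dim=1)  # lift input
        h = 1.0 / self.steps  # Euler step size on t in [0, 1]
        for field in self.fields:
            z = z + h * torch.tanh(field(z))  # residual (skip-connection) update
        return z[:, -self.m:]  # read the output off the last m coordinates

# Toy regression (n = 2, m = 1): approximate f(x) = sin(x1) * x2 on [-1, 1]^2.
torch.manual_seed(0)
x = torch.rand(512, 2) * 2 - 1
y = (torch.sin(x[:, 0]) * x[:, 1]).unsqueeze(1)

model = EulerODENet(n=2, m=1, steps=20)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
print(f"final MSE: {loss.item():.4f}")
```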
Related papers
- Minimum Width of Deep Narrow Networks for Universal Approximation [9.00733527455972]
We study the lower and upper bounds on the minimum width required for universal approximation by fully connected neural networks. We present a new proof of the inequality $w_{\min} \ge d_y + \mathbf{1}_{d_x < d_y \le 2d_x}$ by constructing a more intuitive example.
arXiv Detail & Related papers (2025-11-10T08:29:14Z) - The Hidden Width of Deep ResNets: Tight Error Bounds and Phase Diagrams [15.246178589173523]
We study the gradient-based training of large-depth residual networks (ResNets). We show that with a diverging depth $L$, a fixed embedding dimension $D$, and an arbitrary hidden width $M$, the training dynamics converge to a Neural Mean ODE training dynamics.
arXiv Detail & Related papers (2025-09-12T11:51:44Z) - New advances in universal approximation with neural networks of minimal width [4.424170214926035]
We show that autoencoders with leaky ReLU activations are universal approximators of $L^p$ functions.
We broaden our results to show that smooth invertible neural networks can approximate $L^p(\mathbb{R}^d, \mathbb{R}^d)$ on compacta.
arXiv Detail & Related papers (2024-11-13T16:17:16Z) - Universal approximation property of ODENet and ResNet with a single activation function [0.0]
We study a universal approximation property of ODENet and ResNet.
We show that an ODENet and a ResNet with a restricted vector field can uniformly approximate an ODENet with a general vector field.
arXiv Detail & Related papers (2024-10-22T05:27:01Z) - Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements.
We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega; \mathbb{R}^m)$ for $m \geq 1$.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle\right)$.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - A Minimal Control Family of Dynamical Syetem for Universal Approximation [6.164223149261533]
We show that a control family can generate flow maps that can approximate diffeomorphisms of $\mathbb{R}^d$ on any compact domain.
Our result reveals an underlying connection between the approximation power of neural networks and control systems.
arXiv Detail & Related papers (2023-12-20T10:36:55Z) - Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
We study the problem of learning hierarchical functions over the standard Gaussian distribution with three-layer neural networks.
For a large subclass of degree-$k$ polynomials $p$, a three-layer neural network trained via layerwise gradient descent on the square loss learns the target $h$ up to vanishing test error.
This work demonstrates the ability of three-layer neural networks to learn complex features and as a result, learn a broad class of hierarchical functions.
arXiv Detail & Related papers (2023-11-23T02:19:32Z) - A Unified Scheme of ResNet and Softmax [8.556540804058203]
We provide a theoretical analysis of the regression problem $\left\| \langle \exp(Ax) + Ax, \mathbf{1}_n \rangle^{-1} \left( \exp(Ax) + Ax \right) - b \right\|_2^2$ (see the sketch after this list).
This regression problem is a unified scheme that combines softmax regression and ResNet, a combination that had not been studied before.
arXiv Detail & Related papers (2023-09-23T21:41:01Z) - Achieve the Minimum Width of Neural Networks for Universal Approximation [1.52292571922932]
We study the exact minimum width, $w_{\min}$, for the universal approximation property (UAP) of neural networks.
In particular, the critical width, $w^*_{\min}$, for $L^p$-UAP can be achieved by leaky-ReLU networks.
arXiv Detail & Related papers (2022-09-23T04:03:50Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) \le C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Interval Universal Approximation for Neural Networks [47.767793120249095]
We introduce the interval universal approximation (IUA) theorem.
The IUA theorem shows that neural networks can approximate any continuous function $f$, as has been known for decades.
We study the computational complexity of constructing neural networks that are amenable to precise interval analysis.
arXiv Detail & Related papers (2020-07-12T20:43:56Z) - Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$-Nets are a new class of function approximators based on polynomial expansions.
$\Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification, and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z) - Reinforcement Learning with General Value Function Approximation:
Provably Efficient Approach via Bounded Eluder Dimension [124.7752517531109]
We establish a provably efficient reinforcement learning algorithm with general value function approximation.
We show that our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$ where $d$ is a complexity measure.
Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment.
arXiv Detail & Related papers (2020-05-21T17:36:09Z)
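As referenced in the "A Unified Scheme of ResNet and Softmax" entry above, here is a minimal sketch of its unified regression objective, $\|\langle \exp(Ax) + Ax, \mathbf{1}_n \rangle^{-1} (\exp(Ax) + Ax) - b\|_2^2$: dropping the $+Ax$ terms recovers plain softmax regression, and the extra $Ax$ is the residual (ResNet-style) part. The target $b$, the squared-$\ell_2$ form, and the random placeholder data are assumptions made to complete the truncated formula; this is not the paper's code.

```python
# Evaluate the (reconstructed) unified softmax/ResNet regression loss.
import numpy as np

def unified_loss(A: np.ndarray, x: np.ndarray, b: np.ndarray) -> float:
    u = np.exp(A @ x) + A @ x           # exp(Ax) + Ax: softmax term plus residual term
    p = u / u.sum()                     # <u, 1_n>^{-1} u: softmax-like normalization
    return float(np.sum((p - b) ** 2))  # squared L2 distance to the target b

rng = np.random.default_rng(0)
n, d = 5, 3
A = rng.standard_normal((n, d))
x = rng.standard_normal(d)
b = rng.random(n)
b /= b.sum()  # a probability-vector target, as in softmax regression
print(unified_loss(A, x, b))  # one evaluation of the objective at (A, x)
```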
This list is automatically generated from the titles and abstracts of the papers on this site.