Expressivity of Neural Networks via Chaotic Itineraries beyond
Sharkovsky's Theorem
- URL: http://arxiv.org/abs/2110.10295v1
- Date: Tue, 19 Oct 2021 22:28:27 GMT
- Title: Expressivity of Neural Networks via Chaotic Itineraries beyond
Sharkovsky's Theorem
- Authors: Clayton Sanford and Vaggos Chatziafratis
- Abstract summary: Given a target function $f$, how large must a neural network be in order to approximate $f$?
Recent works examine this basic question on neural network \textit{expressivity} from the lens of dynamical systems.
- Score: 8.492084752803528
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a target function $f$, how large must a neural network be in order to
approximate $f$? Recent works examine this basic question on neural network
\textit{expressivity} from the lens of dynamical systems and provide novel
``depth-vs-width'' tradeoffs for a large family of functions $f$. They suggest
that such tradeoffs are governed by the existence of \textit{periodic} points
or \emph{cycles} in $f$. Our work, by further deploying dynamical systems
concepts, illuminates a more subtle connection between periodicity and
expressivity: we prove that periodic points alone lead to suboptimal
depth-width tradeoffs and we improve upon them by demonstrating that certain
``chaotic itineraries'' give stronger exponential tradeoffs, even in regimes
where previous analyses only imply polynomial gaps. Contrary to prior works,
our bounds are nearly-optimal, tighten as the period increases, and handle
strong notions of inapproximability (e.g., constant $L_1$ error). More broadly,
we identify a phase transition to the \textit{chaotic regime} that exactly
coincides with an abrupt shift in other notions of function complexity,
including VC-dimension and topological entropy.
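To make the source of these depth-vs-width gaps concrete, the sketch below (an illustration under our own assumptions, not code from the paper) iterates the tent map, a chaotic interval map with periodic points of every period, and counts the monotone pieces of its $n$-fold composition. The count doubles with each composition, and this exponential growth in oscillations is the kind of complexity that a deep network represents cheaply while a shallow network must pay for it in width.
```python
# Minimal sketch (not from the paper): iterate the tent map, a chaotic
# interval map, and count the monotone pieces of its n-fold composition.
# The count grows as 2**n, the oscillatory blow-up behind depth-width gaps.
import numpy as np

def tent(x):
    """Tent map on [0, 1]: piecewise linear, two monotone pieces."""
    return 1.0 - np.abs(1.0 - 2.0 * x)

def monotone_pieces(f, n, grid_size=200_000):
    """Estimate the number of monotone pieces of f composed n times,
    by counting sign changes of the discrete slope on a fine grid."""
    x = np.linspace(0.0, 1.0, grid_size)
    y = x.copy()
    for _ in range(n):
        y = f(y)
    slopes = np.diff(y)
    sign_changes = np.count_nonzero(np.diff(np.sign(slopes)) != 0)
    return sign_changes + 1

if __name__ == "__main__":
    for n in range(1, 8):
        # Expected: 2**n pieces; a depth-n network fits this easily,
        # while a constant-depth network needs width exponential in n.
        print(n, monotone_pieces(tent, n))
```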
Related papers
- Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets [58.460298576330835]
We study Leaky ResNets, which interpolate between ResNets ($\tilde{L}\to 0$) and Fully-Connected nets ($\tilde{L}\to\infty$).
In the infinite depth limit, we study 'representation geodesics' $A_p$: continuous paths in representation space (similar to NeuralODEs).
We leverage this intuition to explain the emergence of a bottleneck structure, as observed in previous work.
arXiv Detail & Related papers (2024-05-27T18:15:05Z) - Temporal Aggregation and Propagation Graph Neural Networks for Dynamic
Representation [67.26422477327179]
Temporal graphs exhibit dynamic interactions between nodes over continuous time.
We propose a novel method of temporal graph convolution with the whole neighborhood.
Our proposed TAP-GNN outperforms existing temporal graph methods by a large margin in terms of both predictive performance and online inference latency.
arXiv Detail & Related papers (2023-04-15T08:17:18Z) - Excess risk bound for deep learning under weak dependence [0.0]
This paper considers deep neural networks for learning weakly dependent processes.
We derive the required depth, width and sparsity of a deep neural network to approximate any Hölder smooth function.
arXiv Detail & Related papers (2023-02-15T07:23:48Z) - Signatures of the interplay between chaos and local criticality on the
dynamics of scrambling in many-body systems [0.0]
We study the interplay between local criticality and chaos right at the intricate phase space region where the integrability-chaos transition first appears.
Our aim is to investigate the dependence of the exponential growth of the OTOCs, which defines the quantum Lyapunov exponent $\lambda_{\textrm{q}}$, on quantities derived from the classical system.
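For orientation, a standard textbook form of the out-of-time-order correlator and of the quantum Lyapunov exponent extracted from its early-time growth; this is our paraphrase of common conventions, not the paper's exact definitions.
```latex
% Textbook form of the OTOC and the quantum Lyapunov exponent (common
% conventions; the exponent is sometimes written with an extra factor of 2).
\[
C(t) = \big\langle\, [\hat{W}(t), \hat{V}]^{\dagger} [\hat{W}(t), \hat{V}] \,\big\rangle,
\qquad
C(t) \sim e^{\lambda_{\mathrm{q}} t}
\quad \text{at early times in the chaotic regime.}
\]
```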
arXiv Detail & Related papers (2022-11-22T10:26:17Z) - Understanding Deep Neural Function Approximation in Reinforcement
Learning via $\epsilon$-Greedy Exploration [53.90873926758026]
This paper provides a theoretical study of deep neural function approximation in reinforcement learning (RL)
We focus on the value-based algorithm with $\epsilon$-greedy exploration via deep (and two-layer) neural networks endowed with Besov (and Barron) function spaces.
Our analysis reformulates the temporal difference error in an $L^2(\mathrm{d}\mu)$-integrable space over a certain averaged measure $\mu$, and transforms it into a generalization problem under the non-iid setting.
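As a point of reference for the exploration scheme named above, here is a minimal $\epsilon$-greedy action-selection sketch over a two-layer (one-hidden-layer) value network; the network shape, state/action sizes, and epsilon value are our own illustrative assumptions, not the paper's setup.
```python
# Minimal epsilon-greedy sketch over a two-layer Q-network (illustrative
# assumptions only; not the construction analyzed in the paper).
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_ACTIONS, HIDDEN = 4, 3, 32   # hypothetical sizes

# Random two-layer (one hidden layer) ReLU Q-network: Q(s) in R^{N_ACTIONS}.
W1 = rng.normal(scale=1.0 / np.sqrt(STATE_DIM), size=(HIDDEN, STATE_DIM))
W2 = rng.normal(scale=1.0 / np.sqrt(HIDDEN), size=(N_ACTIONS, HIDDEN))

def q_values(state):
    """Forward pass of the two-layer ReLU network."""
    return W2 @ np.maximum(W1 @ state, 0.0)

def epsilon_greedy(state, epsilon):
    """With probability epsilon explore uniformly; otherwise act greedily."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))     # explore
    return int(np.argmax(q_values(state)))      # exploit

state = rng.normal(size=STATE_DIM)
actions = [epsilon_greedy(state, epsilon=0.1) for _ in range(5)]
print(actions)
```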
arXiv Detail & Related papers (2022-09-15T15:42:47Z) - Neural Network Approximation of Continuous Functions in High Dimensions
with Applications to Inverse Problems [6.84380898679299]
Current theory predicts that networks should scale exponentially in the dimension of the problem.
We provide a general method for bounding the complexity required for a neural network to approximate a Hölder (or uniformly) continuous function.
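For reference, the standard Hölder condition behind this statement (the textbook definition; the paper's constants and norms may differ):
```latex
% Hölder continuity with exponent alpha (textbook form).
\[
|f(x) - f(y)| \le C \,\|x - y\|^{\alpha}
\quad \text{for all } x, y, \qquad \alpha \in (0, 1],\; C > 0 .
\]
```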
arXiv Detail & Related papers (2022-08-28T22:44:07Z) - Unified Field Theory for Deep and Recurrent Neural Networks [56.735884560668985]
We present a unified and systematic derivation of the mean-field theory for both recurrent and deep networks.
We find that convergence towards the mean-field theory is typically slower for recurrent networks than for deep networks.
Our method exposes that Gaussian processes are but the lowest order of a systematic expansion in $1/n$.
arXiv Detail & Related papers (2021-12-10T15:06:11Z) - Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - On the Banach spaces associated with multi-layer ReLU networks: Function
representation, approximation theory and gradient descent dynamics [8.160343645537106]
We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width.
The spaces contain all finite fully-connected $L$-layer networks and their $L^2$-limiting objects under the natural path-norm.
Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable properties.
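As a rough reminder of the kind of norm involved, a common definition of the path-norm for an $L$-layer ReLU network is sketched below; the Banach-space norm constructed in the paper is a refinement and may differ in detail.
```latex
% A common path-norm: sum over all input-to-output paths of the product of
% absolute weights along the path (the paper's norm may differ in detail).
\[
\|\theta\|_{\mathrm{path}}
  = \sum_{i_0, i_1, \dots, i_L}
    \big|W^{(L)}_{i_L i_{L-1}}\big| \cdots \big|W^{(1)}_{i_1 i_0}\big| .
\]
```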
arXiv Detail & Related papers (2020-07-30T17:47:05Z) - Better Depth-Width Trade-offs for Neural Networks through the lens of
Dynamical Systems [24.229336600210015]
Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems.
We improve the existing width lower bounds along several aspects.
A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs.
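Both this precursor and the main paper above build on Sharkovsky's theorem; for the reader's convenience, the standard Sharkovsky ordering of the natural numbers is:
```latex
% Sharkovsky's ordering: if a continuous interval map has a periodic point of
% period p, it has periodic points of every period q with p \succ q below.
\[
3 \succ 5 \succ 7 \succ \cdots
\succ 2\cdot 3 \succ 2\cdot 5 \succ \cdots
\succ 2^{2}\cdot 3 \succ 2^{2}\cdot 5 \succ \cdots
\succ 2^{3} \succ 2^{2} \succ 2 \succ 1 .
\]
```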
arXiv Detail & Related papers (2020-03-02T11:36:26Z)