Interplay between depth of neural networks and locality of target
functions
- URL: http://arxiv.org/abs/2201.12082v1
- Date: Fri, 28 Jan 2022 12:41:24 GMT
- Title: Interplay between depth of neural networks and locality of target
functions
- Authors: Takashi Mori, Masahito Ueda
- Abstract summary: We report a remarkable interplay between depth and locality of a target function.
We find that depth is beneficial for learning local functions but detrimental to learning global functions.
- Score: 5.33024001730262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been recognized that heavily overparameterized deep neural networks
(DNNs) exhibit surprisingly good generalization performance in various
machine-learning tasks. Although benefits of depth have been investigated from
different perspectives such as approximation theory and statistical
learning theory, existing theories do not adequately explain the empirical
success of overparameterized DNNs. In this work, we report a remarkable
interplay between depth and locality of a target function. We introduce
$k$-local and $k$-global functions, and find that depth is beneficial for
learning local functions but detrimental to learning global functions. This
interplay is not properly captured by the neural tangent kernel, which
describes an infinitely wide neural network within the lazy learning regime.
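The abstract leaves the precise definitions of $k$-local and $k$-global functions to the paper itself. As a rough illustration only (an assumption, not the paper's exact construction), one can think of a local target as depending on a fixed set of $k$ of the $d$ input components and a global target as depending on all of them; for binary inputs $x \in \{-1,+1\}^d$, for example,
\[
  f_{\mathrm{local}}(x) = x_1 x_2 \cdots x_k ,
  \qquad
  f_{\mathrm{global}}(x) = \operatorname{sign}\Big( \sum_{i=1}^{d} x_i \Big) .
\]
The paper's claim is then that adding depth helps when the target resembles the former and hurts when it resembles the latter.

The neural tangent kernel (NTK) mentioned at the end of the abstract arises from linearizing a network around its initialization; in the lazy regime the kernel is essentially frozen during training, and learning reduces to kernel regression. The following is a minimal sketch of the finite-width (empirical) NTK of a small multilayer perceptron in JAX, offered only to make the object concrete; the architecture, layer sizes, and function names are illustrative choices and are not taken from the paper.

# Illustrative sketch (not the paper's code): empirical NTK of a small ReLU MLP.
import jax
import jax.numpy as jnp

def init_mlp(key, sizes):
    # Gaussian weights with 1/sqrt(fan_in) scaling, zero biases.
    params = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (d_in, d_out)) / jnp.sqrt(d_in),
                       jnp.zeros(d_out)))
    return params

def mlp(params, x):
    # Scalar-output ReLU network applied to a batch x of shape (n, d).
    for w, b in params[:-1]:
        x = jax.nn.relu(x @ w + b)
    w, b = params[-1]
    return (x @ w + b).squeeze(-1)

def empirical_ntk(params, x1, x2):
    # Theta(x1, x2) = J(x1) J(x2)^T, Jacobians taken w.r.t. all parameters.
    def flat_jac(x):
        jac = jax.jacobian(mlp)(params, x)  # pytree of per-parameter Jacobians
        return jnp.concatenate(
            [j.reshape(j.shape[0], -1) for j in jax.tree_util.tree_leaves(jac)],
            axis=1)
    return flat_jac(x1) @ flat_jac(x2).T

key = jax.random.PRNGKey(0)
params = init_mlp(key, [8, 64, 64, 1])
x = jax.random.normal(key, (5, 8))
print(empirical_ntk(params, x, x).shape)  # (5, 5) Gram matrix on the inputs

In the infinite-width limit this kernel becomes deterministic and independent of training, which is the description the abstract argues fails to capture the depth-locality interplay.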
Related papers
- Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation [8.284464143581546]
We study how deep neural networks can adapt to different regularity in functions across different locations and scales.
Our results show that deep neural networks are adaptive to different regularity of functions and nonuniform data distributions.
arXiv Detail & Related papers (2024-06-08T02:01:50Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures information flowing across its layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains vague and unclear.
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Optimal Approximation with Sparse Neural Networks and Applications [0.0]
We use deep sparsely connected neural networks to measure the complexity of a function class in $L^2(\mathbb{R}^d)$.
We also introduce a representation system, i.e., a countable collection of functions that guides the neural networks.
We then analyse the complexity of a class of $\beta$ cartoon-like functions using rate-distortion theory and the wedgelet construction.
arXiv Detail & Related papers (2021-08-14T05:14:13Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
- Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the loss gradient function flow.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
- Is deeper better? It depends on locality of relevant features [5.33024001730262]
We investigate the effect of increasing the depth within an overparameterized regime.
Experiments show that deeper is better for local labels, whereas shallower is better for global labels.
It is shown that the neural tangent kernel does not correctly capture the depth dependence of the generalization performance.
arXiv Detail & Related papers (2020-05-26T02:44:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.