Eigenspace Restructuring: a Principle of Space and Frequency in Neural
Networks
- URL: http://arxiv.org/abs/2112.05611v1
- Date: Fri, 10 Dec 2021 15:44:14 GMT
- Title: Eigenspace Restructuring: a Principle of Space and Frequency in Neural
Networks
- Authors: Lechao Xiao
- Abstract summary: We show that the eigenstructure of infinite-width multilayer perceptrons (MLPs) depends solely on the concept frequency.
We show that the topologies from deep convolutional networks (CNNs) restructure the associated eigenspaces into finer subspaces.
The resulting fine-grained eigenstructure dramatically improves the network's learnability.
- Score: 11.480563447698172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the fundamental principles behind the massive success of neural
networks is one of the most important open questions in deep learning. However,
due to the highly complex nature of the problem, progress has been relatively
slow. In this note, through the lens of infinite-width networks, a.k.a. neural
kernels, we present one such principle resulting from hierarchical localities.
It is well-known that the eigenstructure of infinite-width multilayer
perceptrons (MLPs) depends solely on the concept frequency, which measures the
order of interactions. We show that the topologies from deep convolutional
networks (CNNs) restructure the associated eigenspaces into finer subspaces. In
addition to frequency, the new structure also depends on the concept space,
which measures the spatial distance among nonlinear interaction terms. The
resulting fine-grained eigenstructure dramatically improves the network's
learnability, empowering networks to simultaneously model a much richer class of
interactions, including Long-Range-Low-Frequency interactions,
Short-Range-High-Frequency interactions, and various interpolations and
extrapolations in-between. Additionally, model scaling can improve the
resolutions of interpolations and extrapolations and, therefore, the network's
learnability. Finally, we prove a sharp characterization of the generalization
error for infinite-width CNNs of any depth in the high-dimensional setting. Two
corollaries follow: (1) infinite-width deep CNNs can break the curse of
dimensionality without losing their expressivity, and (2) scaling improves
performance in both the finite and infinite data regimes.
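To make the abstract's setup concrete, the sketch below contrasts the kernels of an infinite-width MLP and an infinite-width CNN. It uses the open-source neural_tangents library purely as an illustration; the architectures, widths, filter sizes, and input shapes are placeholder assumptions and are not taken from the paper. The eigenvalues of the MLP's dot-product kernel are organized by interaction order ("frequency") alone, whereas the CNN's hierarchically local topology additionally organizes them by spatial scale; the finite Gram spectra computed here only hint at that population-level structure.
```python
# Illustrative sketch only (not the paper's code): compare NTK Gram matrices of an
# infinite-width MLP and an infinite-width CNN via the neural_tangents library.
import jax
import jax.numpy as jnp
from neural_tangents import stax

# Infinite-width MLP: a dot-product kernel whose eigenstructure is indexed only
# by the order of interactions ("concept frequency").
_, _, mlp_kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

# Infinite-width CNN: hierarchical locality (small filters, growing receptive
# fields) restructures the eigenspaces by spatial scale as well as frequency.
_, _, cnn_kernel_fn = stax.serial(
    stax.Conv(512, (3, 3)), stax.Relu(),
    stax.Conv(512, (3, 3)), stax.Relu(),
    stax.Flatten(),
    stax.Dense(1),
)

key = jax.random.PRNGKey(0)
x_img = jax.random.normal(key, (32, 8, 8, 3))   # toy NHWC image batch (placeholder)
x_flat = x_img.reshape(32, -1)                  # flattened inputs for the MLP

k_mlp = mlp_kernel_fn(x_flat, None, 'ntk')      # 32 x 32 NTK Gram matrix
k_cnn = cnn_kernel_fn(x_img, None, 'ntk')

# Empirical spectra of the two Gram matrices; the paper's results concern the
# population eigenstructure, which these finite-sample spectra only approximate.
print(jnp.linalg.eigvalsh(k_mlp)[::-1][:5])
print(jnp.linalg.eigvalsh(k_cnn)[::-1][:5])
```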
Related papers
- Topological Neural Networks: Mitigating the Bottlenecks of Graph Neural
Networks via Higher-Order Interactions [1.994307489466967]
This work starts with a theoretical framework that reveals the impact of a network's width, depth, and graph topology on the over-squashing phenomenon in message-passing neural networks.
The work then moves toward higher-order interactions and multi-relational inductive biases via Topological Neural Networks.
Inspired by Graph Attention Networks, two topological attention networks are proposed: Simplicial and Cell Attention Networks.
arXiv Detail & Related papers (2024-02-10T08:26:06Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Unified Field Theory for Deep and Recurrent Neural Networks [56.735884560668985]
We present a unified and systematic derivation of the mean-field theory for both recurrent and deep networks.
We find that convergence towards the mean-field theory is typically slower for recurrent networks than for deep networks.
Our method exposes that Gaussian processes are but the lowest order of a systematic expansion in $1/n$.
arXiv Detail & Related papers (2021-12-10T15:06:11Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking the infinite-width limit of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z)
- Doubly infinite residual neural networks: a diffusion process approach [8.642603456626393]
We show that deep ResNets do not suffer from undesirable forward-propagation properties.
We focus on doubly infinite fully-connected ResNets, for which we consider i.i.d. initializations of the network's parameters.
Our results highlight a limited expressive power of doubly infinite ResNets when the unscaled network's parameters are i.i.d. and the residual blocks are shallow.
arXiv Detail & Related papers (2020-07-07T07:45:34Z)
- Depth Enables Long-Term Memory for Recurrent Neural Networks [0.0]
We introduce a measure of the network's ability to support information flow across time, referred to as the Start-End separation rank.
We prove that deep recurrent networks support Start-End separation ranks which are higher than those supported by their shallow counterparts.
arXiv Detail & Related papers (2020-03-23T10:29:14Z)
- A Rigorous Framework for the Mean Field Limit of Multilayer Neural Networks [9.89901717499058]
We develop a mathematically rigorous framework for embedding neural networks in the mean field regime.
As the network's widths increase, the network's learning trajectory is shown to be well captured by the mean field limit.
We prove several properties of large-width multilayer networks.
arXiv Detail & Related papers (2020-01-30T16:43:34Z)