Related papers: Revisiting Deep Information Propagation: Fractal Frontier and Finite-size Effects

URL: http://arxiv.org/abs/2508.03222v1
Date: Tue, 05 Aug 2025 08:49:24 GMT
Title: Revisiting Deep Information Propagation: Fractal Frontier and Finite-size Effects
Authors: Giuseppe Alessio D'Inverno, Zhiyuan Hu, Leo Davy, Michael Unser, Gianluigi Rozza, Jonathan Dong,
Abstract summary: We study information propagation in randomly initialized neural networks with finite width and reveal that the boundary between ordered and chaotic regimes exhibits a fractal structure. This shows the fundamental complexity of neural network dynamics, in a setting that is independent of input data and optimization.
Score: 16.42026346710591
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Information propagation characterizes how input correlations evolve across layers in deep neural networks. This framework has been well studied using mean-field theory, which assumes infinitely wide networks. However, these assumptions break down for practical, finite-size networks. In this work, we study information propagation in randomly initialized neural networks with finite width and reveal that the boundary between ordered and chaotic regimes exhibits a fractal structure. This shows the fundamental complexity of neural network dynamics, in a setting that is independent of input data and optimization. To extend this analysis beyond multilayer perceptrons, we leverage recently introduced Fourier-based structured transforms, and show that information propagation in convolutional neural networks also follows the same behavior. Our investigation highlights the importance of finite network depth with respect to the tradeoff between separation and robustness.
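
Illustrative sketch (not from the paper): the abstract describes tracking how the correlation between two inputs evolves layer by layer in a randomly initialized, finite-width network. The NumPy code below is one minimal way to observe this, under assumptions that the abstract does not state: a fully connected tanh network, Gaussian weights with standard deviation sigma_w / sqrt(width), Gaussian biases with standard deviation sigma_b, and cosine similarity of pre-activations as the correlation measure. The function name and all parameter values are hypothetical, and the code does not attempt to reproduce the fractal boundary reported by the authors.

```python
import numpy as np


def correlation_across_layers(sigma_w, sigma_b, width=200, depth=30, seed=0):
    """Return the cosine similarity of two inputs' pre-activations at each layer.

    Assumed setup (not taken from the paper): tanh MLP, weights drawn from
    N(0, sigma_w^2 / width), biases drawn from N(0, sigma_b^2).
    """
    rng = np.random.default_rng(seed)

    # Two inputs that start out partially correlated.
    h1 = rng.standard_normal(width)
    h2 = h1 + 0.5 * rng.standard_normal(width)

    correlations = []
    for _ in range(depth):
        # Fresh random layer, shared by both inputs.
        weights = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        bias = rng.standard_normal(width) * sigma_b

        z1 = weights @ h1 + bias
        z2 = weights @ h2 + bias

        cosine = np.dot(z1, z2) / (np.linalg.norm(z1) * np.linalg.norm(z2))
        correlations.append(float(cosine))

        h1, h2 = np.tanh(z1), np.tanh(z2)

    return correlations


if __name__ == "__main__":
    # Small weight variance: the two signals are expected to collapse together.
    ordered = correlation_across_layers(sigma_w=0.5, sigma_b=0.3)
    # Large weight variance: the two signals are expected to stay decorrelated.
    chaotic = correlation_across_layers(sigma_w=3.0, sigma_b=0.3)
    print("ordered, last layer:", ordered[-1])
    print("chaotic, last layer:", chaotic[-1])
```

With these assumed settings, the small-sigma_w run should drive the correlation toward 1 (ordered regime) while the large-sigma_w run should keep it well below 1 (chaotic regime). Changing width or seed varies the finite-size fluctuations around that behavior.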

Related papers

Collective variables of neural networks: empirical time evolution and scaling laws [0.535514140374842]
We show that certain measures on the spectrum of the empirical neural tangent kernel, specifically entropy and trace, yield insight into the representations learned by a neural network. Results are demonstrated first on test cases before being shown on more complex networks, including transformers, auto-encoders, graph neural networks, and reinforcement learning studies.
arXiv Detail & Related papers (2024-10-09T21:37:14Z)
Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression [28.851519959657466]
This paper introduces a novel theoretical framework for the analysis of vector-valued neural networks. A key contribution of this work is the development of a representer theorem for the vector-valued variation spaces. This observation reveals that the norm associated with these vector-valued variation spaces encourages the learning of features that are useful for multiple tasks.
arXiv Detail & Related papers (2023-05-25T23:32:10Z)
Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We show that linear networks make provably optimal predictions at infinite depth. We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Understanding Convolutional Neural Networks from Theoretical Perspective via Volterra Convolution [22.058311878382142]
This study explores the relationship between convolutional neural networks and finite Volterra convolutions. It provides a novel approach to explain and study the overall characteristics of neural networks without being disturbed by the complex network architectures.
arXiv Detail & Related papers (2021-10-19T12:07:46Z)
The Principles of Deep Learning Theory [19.33681537640272]
This book develops an effective theory approach to understanding deep neural networks of practical relevance. We explain how these effectively-deep networks learn nontrivial representations from training. We show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks.
arXiv Detail & Related papers (2021-06-18T15:00:00Z)
Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience. We show that sparse coding can effectively maximize the entropy of the output signals. Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks. Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities. Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.