Emergence of Structure in Ensembles of Random Neural Networks
- URL: http://arxiv.org/abs/2505.10331v1
- Date: Thu, 15 May 2025 14:20:02 GMT
- Title: Emergence of Structure in Ensembles of Random Neural Networks
- Authors: Luca Muscarnera, Luigi Loreti, Giovanni Todeschini, Alessio Fumagalli, Francesco Regazzoni,
- Abstract summary: We introduce a theoretical model for studying the emergence of collective behaviors in ensembles of random classifiers.<n>Experiments on the MNIST dataset underline the relevance of this phenomenon in high-quality, noiseless, datasets.
- Score: 3.3385430106181184
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Randomness is ubiquitous in many applications across data science and machine learning. Remarkably, systems composed of random components often display emergent global behaviors that appear deterministic, manifesting a transition from microscopic disorder to macroscopic organization. In this work, we introduce a theoretical model for studying the emergence of collective behaviors in ensembles of random classifiers. We argue that, if the ensemble is weighted through the Gibbs measure defined by adopting the classification loss as an energy, then there exists a finite temperature parameter for the distribution such that the classification is optimal, with respect to the loss (or the energy). Interestingly, for the case in which samples are generated by a Gaussian distribution and labels are constructed by employing a teacher perceptron, we analytically prove and numerically confirm that such optimal temperature does not depend neither on the teacher classifier (which is, by construction of the learning problem, unknown), nor on the number of random classifiers, highlighting the universal nature of the observed behavior. Experiments on the MNIST dataset underline the relevance of this phenomenon in high-quality, noiseless, datasets. Finally, a physical analogy allows us to shed light on the self-organizing nature of the studied phenomenon.
Related papers
- Causal Representation Meets Stochastic Modeling under Generic Geometry [49.24293444627916]
We develop causal representation learning for continuous-time latent point processes.<n>We develop MUTATE, an identifiable variational autoencoder framework with a time-adaptive transition module.<n>Across simulated and empirical studies, we find that MUTATE can effectively answer scientific questions.
arXiv Detail & Related papers (2026-02-04T20:40:53Z) - Theory of Speciation Transitions in Diffusion Models with General Class Structure [5.939780039158003]
Diffusion models generate data by reversing a diffusion process, transforming noise into structured samples drawn from a target distribution.<n>Recent theoretical work has shown that this backward dynamics can undergo sharp qualitative transitions, known as speciation transitions.<n>We develop a general theory of speciation in diffusion models that applies to arbitrary target distributions admitting well-defined classes.
arXiv Detail & Related papers (2026-02-04T10:35:47Z) - Random-Matrix-Induced Simplicity Bias in Over-parameterized Variational Quantum Circuits [72.0643009153473]
We show that expressive variational ansatze enter a Haar-like universality class in which both observable expectation values and parameter gradients concentrate exponentially with system size.<n>As a consequence, the hypothesis class induced by such circuits collapses with high probability to a narrow family of near-constant functions.<n>We further show that this collapse is not unavoidable: tensor-structured VQCs, including tensor-network-based and tensor-hypernetwork parameterizations, lie outside the Haar-like universality class.
arXiv Detail & Related papers (2026-01-05T08:04:33Z) - Chaos, Entanglement and Measurement: Field-Theoretic Perspectives on Quantum Information Dynamics [0.0]
I study scrambling and pseudorandomness in the Brownian Sachdev-Ye-Kitaev (SYK) model.<n>I construct a field theory for weakly measured SYK clusters.<n>I develop a strong-disorder renormalization group for measurement-only SYK clusters.
arXiv Detail & Related papers (2025-12-11T10:04:30Z) - Free Independence and Unitary Design from Random Matrix Product Unitaries [0.0]
We study the emergence of freeness from a random matrix product unitary ensemble.<n>We prove that, with only bond dimension, these unitaries reproduce Haar values of higher-order OTOCs for local, finite-trace observables.<n>Our results highlight the need to refine previous notions of unitary designs in the context of operator dynamics.
arXiv Detail & Related papers (2025-07-31T18:00:00Z) - Rademacher learning rates for iterated random functions [0.0]
We consider the case where the training dataset is generated by an iterated random function that is not necessarily irreducible or aperiodic.<n>Under the assumption that the governing function is contractive with respect to its first argument, we first establish a uniform convergence result for the corresponding sample error.<n>We then demonstrate the learnability of the approximate empirical risk minimization algorithm and derive its learning rate bound.
arXiv Detail & Related papers (2025-06-16T19:36:13Z) - Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z) - Learning Linear Causal Representations from Interventions under General
Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z) - DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained
Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z) - Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions [4.993565079216378]
We use a greedy algorithm called GnIES to recover the equivalence class of the data-generating model without knowledge of the intervention targets.<n>In addition, we develop a novel procedure to generate semi-synthetic data sets with known causal ground truth but distributions closely resembling those of a real data set of choice.
arXiv Detail & Related papers (2022-11-27T17:37:21Z) - Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
theoretical aspects, e.g., identifiability and properties of statistical estimation are still obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
arXiv Detail & Related papers (2022-10-12T06:46:38Z) - Gaussian Universality of Linear Classifiers with Random Labels in
High-Dimension [24.503842578208268]
We prove that data coming from a range of generative models in high-dimensions have the same minimum training loss as Gaussian data with corresponding data covariance.
In particular, our theorem covers data created by an arbitrary mixture of homogeneous Gaussian clouds, as well as multi-modal generative neural networks.
arXiv Detail & Related papers (2022-05-26T12:25:24Z) - Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics
for Convex Losses in High-Dimension [25.711297863946193]
We develop a theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features.
We provide a complete description of the joint distribution of the empirical risk minimiser for generic convex loss and regularisation in the high-dimensional limit.
arXiv Detail & Related papers (2022-01-31T17:44:58Z) - Uniform Convergence, Adversarial Spheres and a Simple Remedy [40.44709296304123]
Previous work has cast doubt on the general framework of uniform convergence and its ability to explain generalization in neural networks.
We provide an extensive theoretical investigation of the previously studied data setting through the lens of infinitely-wide models.
We prove that the Neural Tangent Kernel (NTK) also suffers from the same phenomenon and we uncover its origin.
arXiv Detail & Related papers (2021-05-07T20:23:01Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Neural Networks and Polynomial Regression. Demystifying the
Overparametrization Phenomena [17.205106391379026]
In the context of neural network models, overparametrization refers to the phenomena whereby these models appear to generalize well on the unseen data.
A conventional explanation of this phenomena is based on self-regularization properties of algorithms used to train the data.
We show that any student network interpolating the data generated by a teacher network generalizes well, provided that the sample size is at least an explicit quantity controlled by data dimension.
arXiv Detail & Related papers (2020-03-23T20:09:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.