Related papers: Emergent weight morphologies in deep neural networks

URL: http://arxiv.org/abs/2501.05550v1
Date: Thu, 09 Jan 2025 19:48:51 GMT
Title: Emergent weight morphologies in deep neural networks
Authors: Pascal de Jong, Felix Meigel, Steffen Rulands
Abstract summary: We show that training deep neural networks gives rise to emergent weight morphologies independent of the training data. Our work demonstrates emergence in the training of deep neural networks, which impacts the achievable performance of deep neural networks.
Score: 0.0
Abstract: Whether deep neural networks can exhibit emergent behaviour is not only relevant for understanding how deep learning works, it is also pivotal for estimating potential security risks of increasingly capable artificial intelligence systems. Here, we show that training deep neural networks gives rise to emergent weight morphologies independent of the training data. Specifically, in analogy to condensed matter physics, we derive a theory that predicts that the homogeneous state of deep neural networks is unstable in a way that leads to the emergence of periodic channel structures. We verified these structures by performing numerical experiments on a variety of data sets. Our work demonstrates emergence in the training of deep neural networks, which impacts the achievable performance of deep neural networks.

Related papers

Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities. We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities. We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
Collective variables of neural networks: empirical time evolution and scaling laws [0.535514140374842]
We show that certain measures on the spectrum of the empirical neural tangent kernel, specifically entropy and trace, yield insight into the representations learned by a neural network. Results are demonstrated first on test cases before being shown on more complex networks, including transformers, auto-encoders, graph neural networks, and reinforcement learning studies.
arXiv Detail & Related papers (2024-10-09T21:37:14Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization. We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks. Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
Explainable artificial intelligence for mechanics: physics-informing neural networks for constitutive models [0.0]
In mechanics, the new and active field of physics-informed neural networks attempts to mitigate this disadvantage by designing deep neural networks on the basis of mechanical knowledge. We propose a first step towards a physics-informing approach, which explains neural networks trained on mechanical data a posteriori. Therein, principal component analysis decorrelates the distributed representations in cell states of RNNs and allows the comparison to known and fundamental functions.
arXiv Detail & Related papers (2021-04-20T18:38:52Z)
Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects. We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations. Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
Mastering high-dimensional dynamics with Hamiltonian neural networks [0.0]
A map-building perspective elucidates the superiority of Hamiltonian neural networks over conventional neural networks. The results clarify the critical relation between data, dimension, and neural network learning performance.
arXiv Detail & Related papers (2020-07-28T21:14:42Z)
Complexity for deep neural networks and other characteristics of deep feature representations [0.0]
We define a notion of complexity, which quantifies the nonlinearity of the computation of a neural network. We investigate these observables for trained networks and explore their dynamics during training.
arXiv Detail & Related papers (2020-06-08T17:59:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.