Related papers: What should a neuron aim for? Designing local objective functions based on information theory

URL: http://arxiv.org/abs/2412.02482v3
Date: Tue, 21 Jan 2025 09:46:38 GMT
Title: What should a neuron aim for? Designing local objective functions based on information theory
Authors: Andreas C. Schneider, Valentin Neuhaus, David A. Ehrlich, Abdullah Makkeh, Alexander S. Ecker, Viola Priesemann, Michael Wibral
Abstract summary: We show how self-organized artificial neurons can be achieved by designing bio-inspired local learning goals. These goals are parameterized using a recent extension of information theory, Partial Information Decomposition. Our work advances a principled information-theoretic foundation for local learning strategies.
Score: 41.39714023784306
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In modern deep neural networks, the learning dynamics of the individual neurons is often obscure, as the networks are trained via global optimization. Conversely, biological systems build on self-organized, local learning, achieving robustness and efficiency with limited global information. We here show how self-organization between individual artificial neurons can be achieved by designing abstract bio-inspired local learning goals. These goals are parameterized using a recent extension of information theory, Partial Information Decomposition (PID), which decomposes the information that a set of information sources holds about an outcome into unique, redundant and synergistic contributions. Our framework enables neurons to locally shape the integration of information from various input classes, i.e. feedforward, feedback, and lateral, by selecting which of the three inputs should contribute uniquely, redundantly or synergistically to the output. This selection is expressed as a weighted sum of PID terms, which, for a given problem, can be directly derived from intuitive reasoning or via numerical optimization, offering a window into understanding task-relevant local information processing. Achieving neuron-level interpretability while enabling strong performance using local learning, our work advances a principled information-theoretic foundation for local learning strategies.
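Illustration: as a rough sketch of the "weighted sum of PID terms" mentioned in the abstract, the code below computes the four PID atoms for one target and two sources and combines them with a weight vector. It is not the authors' implementation. It handles two input classes rather than three, it uses the simple minimum-mutual-information redundancy as a stand-in for whichever redundancy measure the paper adopts, and all function names and example weights are hypothetical.

```python
from collections import defaultdict
from math import log2


def marginal(joint, keep):
    """Marginalize a joint pmf {(y, s1, s2): p} onto the index tuple `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return out


def mutual_information(joint, a, b):
    """Mutual information in bits between variable groups a and b (index tuples)."""
    p_ab = marginal(joint, a + b)
    p_a = marginal(joint, a)
    p_b = marginal(joint, b)
    mi = 0.0
    for key, p in p_ab.items():
        if p > 0:
            ka, kb = key[:len(a)], key[len(a):]
            mi += p * log2(p / (p_a[ka] * p_b[kb]))
    return mi


def pid_mmi(joint):
    """Two-source PID atoms of target (index 0) about sources (indices 1, 2).

    Assumption: redundancy is min(I(Y;S1), I(Y;S2)), chosen only for brevity.
    """
    i1 = mutual_information(joint, (0,), (1,))
    i2 = mutual_information(joint, (0,), (2,))
    i12 = mutual_information(joint, (0,), (1, 2))
    red = min(i1, i2)
    return {
        "unique_1": i1 - red,
        "unique_2": i2 - red,
        "redundant": red,
        "synergistic": i12 - i1 - i2 + red,
    }


def goal(atoms, weights):
    """Local goal function: weighted sum of the PID atoms."""
    return sum(weights[name] * value for name, value in atoms.items())


# XOR target: all information about y is synergistic.
xor_joint = {(s1 ^ s2, s1, s2): 0.25 for s1 in (0, 1) for s2 in (0, 1)}
xor_atoms = pid_mmi(xor_joint)

# Copy target: both sources carry the same bit, so it is fully redundant.
copy_joint = {(s, s, s): 0.5 for s in (0, 1)}
copy_atoms = pid_mmi(copy_joint)

# Hypothetical weights that reward synergy and penalize redundancy.
weights = {"unique_1": 0.0, "unique_2": 0.0, "redundant": -1.0, "synergistic": 1.0}
xor_goal = goal(xor_atoms, weights)
copy_goal = goal(copy_atoms, weights)
```

With these hypothetical weights, the XOR neuron scores higher than the copy neuron, which is the sense in which a choice of weights selects which kind of information integration a neuron should aim for.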

Related papers

Switch-Based Multi-Part Neural Network [0.15749416770494706]
Decentralized and modular neural network framework designed to enhance the scalability, interpretability, and performance of AI systems. At the heart of this framework is a dynamic switch mechanism that governs the selective activation and training of individual neurons.
arXiv Detail & Related papers (2025-04-25T10:39:42Z)
Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks using the tensor program framework. We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values. This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv Detail & Related papers (2025-03-12T17:33:13Z)
A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time". It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
A General Framework for Interpretable Neural Learning based on Local Information-Theoretic Goal Functions [1.5236380958983644]
We introduce 'infomorphic' neural networks to perform tasks from supervised, unsupervised and memory learning. By leveraging the interpretable nature of the PID framework, infomorphic networks represent a valuable tool to advance our understanding of the intricate structure of local learning.
arXiv Detail & Related papers (2023-06-03T16:34:25Z)
Language Knowledge-Assisted Representation Learning for Skeleton-Based Action Recognition [71.35205097460124]
How humans understand and recognize the actions of others is a complex neuroscientific problem. LA-GCN is a graph convolution network assisted by knowledge from large-scale language models (LLMs).
arXiv Detail & Related papers (2023-05-21T08:29:16Z)
Redundancy and Concept Analysis for Code-trained Language Models [5.726842555987591]
Code-trained language models have proven to be highly effective for various code intelligence tasks. They can be challenging to train and deploy for many software engineering applications due to computational bottlenecks and memory constraints. We perform the first neuron-level analysis for source code models to identify important neurons within latent representations.
arXiv Detail & Related papers (2023-05-01T15:22:41Z)
To Compress or Not to Compress - Self-Supervised Learning and Information Theory: A Review [30.87092042943743]
Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data. Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels. Information theory, and notably the information bottleneck principle, has been pivotal in shaping deep neural networks.
arXiv Detail & Related papers (2023-04-19T00:33:59Z)
Synergistic information supports modality integration and flexible learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks. Results show that synergy increases as neural networks learn multiple diverse tasks. Randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called Feed-Forward Neural-Symbolic Learner (FF-NSL). FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics with neural networks in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.