A developmental approach for training deep belief networks
- URL: http://arxiv.org/abs/2207.05473v1
- Date: Tue, 12 Jul 2022 11:37:58 GMT
- Title: A developmental approach for training deep belief networks
- Authors: Matteo Zambra, Alberto Testolin, Michele De Filippo De Grazia, Marco
Zorzi
- Abstract summary: Deep belief networks (DBNs) are neural networks that can extract rich internal representations of the environment from sensory data.
We present iDBN, an iterative learning algorithm for DBNs that jointly updates the connection weights across all layers of the hierarchy.
Our work paves the way to the use of iDBN for modeling neurocognitive development.
- Score: 0.46699574490885926
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep belief networks (DBNs) are stochastic neural networks that can extract
rich internal representations of the environment from sensory data. DBNs
had a catalytic effect in triggering the deep learning revolution,
demonstrating for the very first time the feasibility of unsupervised learning
in networks with many layers of hidden neurons. Thanks to their biological and
cognitive plausibility, these hierarchical architectures have also been
successfully exploited to build computational models of human perception and
cognition in a variety of domains. However, learning in DBNs is usually carried
out in a greedy, layer-wise fashion, which does not allow the holistic
development of cortical circuits to be simulated. Here we present iDBN, an
iterative learning algorithm for DBNs that jointly updates the connection
weights across all layers of the hierarchy. We test our algorithm on two
different sets of visual stimuli, and we show that network development can also
be tracked in terms of graph-theoretical properties. DBNs trained using our
iterative approach achieve a final performance comparable to that of their greedy
counterparts, while at the same time allowing accurate analysis of the gradual
development of internal representations in the generative model. Our work paves
the way to the use of iDBN for modeling neurocognitive development.
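To make the contrast with greedy, layer-wise training concrete, here is a minimal sketch of the joint-update idea in NumPy. It is not the authors' implementation: the layer sizes, learning rate, CD-1 updates, and the toy data are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample(p):
    # Stochastic binary sample from Bernoulli probabilities.
    return (rng.random(p.shape) < p).astype(float)


class RBM:
    """One restricted Boltzmann machine layer of the stack."""

    def __init__(self, n_vis, n_hid):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b = np.zeros(n_vis)  # visible biases
        self.c = np.zeros(n_hid)  # hidden biases

    def up(self, v):
        return sigmoid(v @ self.W + self.c)

    def down(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1(self, v0, lr=0.05):
        # Single contrastive-divergence (CD-1) update on a mini-batch.
        h0 = self.up(v0)
        v1 = self.down(sample(h0))
        h1 = self.up(v1)
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return h0  # hidden activations fed to the layer above


def train_joint(data, layer_sizes, epochs=10, batch=64):
    """Iterative ("developmental") training sketch: on every mini-batch the
    signal is propagated upward and each layer receives one CD-1 update, so
    all weights of the hierarchy develop together rather than one layer at
    a time."""
    rbms = [RBM(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
    for _ in range(epochs):
        for i in range(0, len(data), batch):
            x = data[i:i + batch]
            for rbm in rbms:  # joint update across the whole stack
                x = rbm.cd1(x)
    return rbms


# Toy usage on random binary "images"; sizes and hyperparameters are
# illustrative assumptions, not the values used in the paper.
data = (rng.random((512, 784)) < 0.1).astype(float)
dbn = train_joint(data, layer_sizes=[784, 256, 64], epochs=2)
```

A greedy scheme would instead train each RBM to convergence before stacking the next one on top. With the joint scheme sketched above, the weight matrices of every layer are available throughout training, which is what makes it possible to track the network's gradual development, for example through graph-theoretical measures computed on thresholded weight matrices.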
Related papers
- Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities.
We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities.
We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z) - Peer-to-Peer Learning Dynamics of Wide Neural Networks [10.179711440042123]
We provide an explicit, non-asymptotic characterization of the learning dynamics of wide neural networks trained using popular DGD algorithms.
We validate our analytical results by accurately predicting error for classification tasks.
arXiv Detail & Related papers (2024-09-23T17:57:58Z) - From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
In artificial networks, the effectiveness of these models relies on their ability to build task-specific representations.
Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature learning regime, where representations evolve dynamically.
These solutions capture the evolution of representations and the Neural Tangent Kernel across the spectrum from the rich to the lazy regimes.
arXiv Detail & Related papers (2024-09-22T23:19:04Z) - Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
arXiv Detail & Related papers (2024-09-17T04:48:45Z) - Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z) - Contrastive-Signal-Dependent Plasticity: Self-Supervised Learning in Spiking Neural Circuits [61.94533459151743]
This work addresses the challenge of designing neurobiologically-motivated schemes for adjusting the synapses of spiking networks.
Our experimental simulations demonstrate a consistent advantage over other biologically-plausible approaches when training recurrent spiking networks.
arXiv Detail & Related papers (2023-03-30T02:40:28Z) - Identifying Equivalent Training Dynamics [3.793387630509845]
We develop a framework for identifying conjugate and non-conjugate training dynamics.
By leveraging advances in Koopman operator theory, we demonstrate that comparing Koopman eigenvalues can correctly identify a known equivalence between online mirror descent and online gradient descent.
We then utilize our approach to: (a) identify non-conjugate training dynamics between shallow and wide fully connected neural networks; (b) characterize the early phase of training dynamics in convolutional neural networks; (c) uncover non-conjugate training dynamics in Transformers that do and do not undergo grokking.
arXiv Detail & Related papers (2023-02-17T22:15:20Z) - Developing hierarchical anticipations via neural network-based event
segmentation [14.059479351946386]
We model the development of hierarchical predictions via autonomously learned latent event codes.
We present a hierarchical recurrent neural network architecture, whose inductive learning biases foster the development of sparsely changing latent states.
A higher level network learns to predict the situations in which the latent states tend to change.
arXiv Detail & Related papers (2022-06-04T18:54:31Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Backprop-Free Reinforcement Learning with Active Neural Generative
Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z) - Towards a Predictive Processing Implementation of the Common Model of
Cognition [79.63867412771461]
We describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory.
The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales.
arXiv Detail & Related papers (2021-05-15T22:55:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.