Thalamus: a brain-inspired algorithm for biologically-plausible
continual learning and disentangled representations
- URL: http://arxiv.org/abs/2205.11713v1
- Date: Tue, 24 May 2022 01:29:21 GMT
- Title: Thalamus: a brain-inspired algorithm for biologically-plausible
continual learning and disentangled representations
- Authors: Ali Hummos
- Abstract summary: Animals thrive in a constantly changing environment and leverage the temporal structure to learn causal representations.
We introduce a simple algorithm that uses optimization at inference time to generate internal representations of temporal context.
We show that a network trained on a series of tasks using traditional weight updates can infer tasks dynamically.
We then alternate between the weight updates and the latent updates to arrive at Thalamus, a task-agnostic algorithm capable of discovering disentangled representations in a stream of unlabeled tasks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Animals thrive in a constantly changing environment and leverage the temporal
structure to learn well-factorized causal representations. In contrast,
traditional neural networks suffer from forgetting in changing environments and
many methods have been proposed to limit forgetting with different trade-offs.
Inspired by the brain thalamocortical circuit, we introduce a simple algorithm
that uses optimization at inference time to generate internal representations
of temporal context and to infer current context dynamically, allowing the
agent to parse the stream of temporal experience into discrete events and
organize learning about them. We show that a network trained on a series of
tasks using traditional weight updates can infer tasks dynamically using
gradient descent steps in the latent task embedding space (latent updates). We
then alternate between the weight updates and the latent updates to arrive at
Thalamus, a task-agnostic algorithm capable of discovering disentangled
representations in a stream of unlabeled tasks using simple gradient descent.
On a continual learning benchmark, it achieves competitive end average accuracy
and demonstrates knowledge transfer. After learning a subset of tasks it can
generalize to unseen tasks as they become reachable within the well-factorized
latent space, through one-shot latent updates. The algorithm meets many of the
desiderata of an ideal continually learning agent in open-ended environments,
and its simplicity suggests fundamental computations in circuits with abundant
feedback control loops such as the thalamocortical circuits in the brain.
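  The alternation the abstract describes, ordinary weight updates interleaved with gradient descent steps in the latent task-embedding space, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the task-conditioned network, the surprise threshold used to trigger latent updates, and all hyperparameters are hypothetical choices made only for the sketch.
```python
# Sketch of alternating weight updates and latent (task-embedding) updates.
# Assumptions (not from the paper's code): a network conditioned on a latent
# embedding z, a loss-based surprise threshold for re-inferring z, plain SGD.
import torch
import torch.nn as nn

class TaskConditionedNet(nn.Module):
    """Network that receives the input concatenated with a task embedding z."""
    def __init__(self, in_dim, z_dim, hidden, out_dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x, z):
        z = z.expand(x.shape[0], -1)          # broadcast embedding over the batch
        return self.body(torch.cat([x, z], dim=-1))

def latent_update(net, z, x, y, loss_fn, steps=20, lr=0.1):
    """Infer the current task by gradient descent on z; weights are not stepped."""
    z = z.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(net(x, z), y).backward()
        opt.step()
    return z.detach()

def weight_update(net, z, x, y, loss_fn, opt):
    """Ordinary supervised step on the network weights; z is held fixed."""
    opt.zero_grad()
    loss = loss_fn(net(x, z), y)
    loss.backward()
    opt.step()
    return loss.item()

def run_stream(net, stream, z_dim, threshold=1.0):
    """Train on an unlabeled task stream; re-infer z when the loss spikes."""
    z = torch.zeros(1, z_dim)
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(net.parameters(), lr=1e-2)
    for x, y in stream:                       # stream yields mini-batches
        with torch.no_grad():
            surprise = loss_fn(net(x, z), y).item()
        if surprise > threshold:              # likely context switch (assumed rule)
            z = latent_update(net, z, x, y, loss_fn)
        weight_update(net, z, x, y, loss_fn, opt)
    return z
```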
Related papers
- Flexible task abstractions emerge in linear networks with fast and bounded units [47.11054206483159]
We analyze a linear gated network where the weights and gates are jointly optimized via gradient descent.
We show that the discovered task abstractions support generalization through both task and subtask composition.
Our work offers a theory of cognitive flexibility in animals as arising from joint gradient descent on synaptic and neural gating.
arXiv Detail & Related papers (2024-11-06T11:24:02Z) - Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z) - Understanding Activation Patterns in Artificial Neural Networks by
Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far.
We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains.
We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z) - The Clock and the Pizza: Two Stories in Mechanistic Explanation of
Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z) - IF2Net: Innately Forgetting-Free Networks for Continual Learning [49.57495829364827]
Continual learning aims to incrementally absorb new concepts without interfering with previously learned knowledge.
Motivated by the characteristics of neural networks, we investigated how to design an Innately Forgetting-Free Network (IF2Net)
IF2Net allows a single network to inherently learn unlimited mapping rules without being told task identities at test time.
arXiv Detail & Related papers (2023-06-18T05:26:49Z) - Learning to Modulate Random Weights: Neuromodulation-inspired Neural
Networks For Efficient Continual Learning [1.9580473532948401]
We introduce a novel neural network architecture inspired by neuromodulation in biological nervous systems.
We show that this approach has strong learning performance per task despite the very small number of learnable parameters.
arXiv Detail & Related papers (2022-04-08T21:12:13Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - Natural continual learning: success is a journey, not (just) a
destination [9.462808515258464]
Natural Continual Learning (NCL) is a new method that unifies weight regularization and projected gradient descent.
Our method outperforms both standard weight regularization techniques and projection based approaches when applied to continual learning problems in RNNs.
The trained networks evolve task-specific dynamics that are strongly preserved as new tasks are learned, similar to experimental findings in biological circuits.
arXiv Detail & Related papers (2021-06-15T12:24:53Z) - Faster Biological Gradient Descent Learning [0.0]
Back-propagation is a popular machine learning algorithm that uses gradient descent in training neural networks for supervised learning.
We propose a simple, local gradient descent optimization algorithm that can reduce training time.
Our algorithm is found to speed up learning, particularly for small networks.
arXiv Detail & Related papers (2020-09-27T05:26:56Z) - Incremental Training of a Recurrent Neural Network Exploiting a
Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.