Dynamically Modular and Sparse General Continual Learning
- URL: http://arxiv.org/abs/2301.00620v1
- Date: Mon, 2 Jan 2023 12:24:24 GMT
- Title: Dynamically Modular and Sparse General Continual Learning
- Authors: Arnav Varma, Elahe Arani and Bahram Zonooz
- Abstract summary: We introduce dynamic modularity and sparsity (Dynamos) for rehearsal-based general continual learning.
We show that our method learns representations that are modular and specialized, while maintaining reusability by activating subsets of neurons with overlaps corresponding to the similarity of stimuli.
- Score: 13.976220447055521
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Real-world applications often require learning continuously from a stream of
data under ever-changing conditions. When trying to learn from such
non-stationary data, deep neural networks (DNNs) undergo catastrophic
forgetting of previously learned information. Among the common approaches to
avoid catastrophic forgetting, rehearsal-based methods have proven effective.
However, they are still prone to forgetting due to task interference, as all
parameters respond to all tasks. To counter this, we take inspiration from
sparse coding in the brain and introduce dynamic modularity and sparsity
(Dynamos) for rehearsal-based general continual learning. In this setup, the
DNN learns to respond to stimuli by activating relevant subsets of neurons. We
demonstrate the effectiveness of Dynamos on multiple datasets under challenging
continual learning evaluation protocols. Finally, we show that our method
learns representations that are modular and specialized, while maintaining
reusability by activating subsets of neurons with overlaps corresponding to the
similarity of stimuli.
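As a rough illustration of the mechanism described in the abstract, the sketch below pairs an input-conditioned gate that activates only a top-k subset of units per sample with a reservoir-sampled rehearsal buffer. This is a minimal PyTorch sketch under assumptions: the class names, the straight-through top-k gating, and the buffer design are illustrative, not the authors' released implementation.

```python
# Illustrative sketch: per-stimulus gating of a relevant subset of units,
# plus a rehearsal buffer. Not the Dynamos implementation.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedSparseLayer(nn.Module):
    """Linear layer whose units are switched on/off by an input-conditioned gate."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.gate = nn.Linear(in_features, out_features)  # predicts per-unit relevance
        self.sparsity = sparsity

    def forward(self, x):
        h = F.relu(self.fc(x))
        probs = torch.sigmoid(self.gate(x))
        # Keep only the top-(1 - sparsity) fraction of units per sample, using a
        # straight-through estimator so the gate remains trainable.
        k = max(1, int(h.size(1) * (1.0 - self.sparsity)))
        thresh = probs.topk(k, dim=1).values[:, -1:]
        hard = (probs >= thresh).float()
        mask = hard + probs - probs.detach()
        return h * mask


class ReservoirBuffer:
    """Fixed-size rehearsal memory filled by reservoir sampling over the stream."""

    def __init__(self, capacity=500):
        self.capacity, self.seen, self.data = capacity, 0, []

    def add(self, x, y):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = (x, y)

    def sample(self, n):
        batch = random.sample(self.data, min(n, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)
```

A training step on such a sketch would interleave the incoming stream batch with a batch drawn from the buffer, so that units gated on for earlier stimuli keep receiving gradient and overlapping gates can reflect the similarity of stimuli.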
Related papers
- Meta-Dynamical State Space Models for Integrative Neural Data Analysis [8.625491800829224]
Learning shared structure across environments facilitates rapid learning and adaptive behavior in neural systems.
There has been limited work exploiting the shared structure in neural activity during similar tasks for learning latent dynamics from neural recordings.
We propose a novel approach for meta-learning this solution space from task-related neural activity of trained animals.
arXiv Detail & Related papers (2024-10-07T19:35:49Z)
- Trainability, Expressivity and Interpretability in Gated Neural ODEs [0.0]
We introduce a novel measure of expressivity which probes the capacity of a neural network to generate complex trajectories.
We show how reduced-dimensional gnODEs retain their modeling power while greatly improving interpretability.
We also demonstrate the benefit of gating in nODEs on several real-world tasks.
arXiv Detail & Related papers (2023-07-12T18:29:01Z)
- Learning Latent Dynamics via Invariant Decomposition and (Spatio-)Temporal Transformers [0.6767885381740952]
We propose a method for learning dynamical systems from high-dimensional empirical data.
We focus on the setting in which data are available from multiple different instances of a system.
We study behaviour through simple theoretical analyses and extensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2023-06-21T07:52:07Z)
- Measures of Information Reflect Memorization Patterns [53.71420125627608]
We show that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization.
Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabelled in-distribution examples.
arXiv Detail & Related papers (2022-10-17T20:15:24Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to activate and select only sparse subsets of neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- A Meta-Learned Neuron model for Continual Learning [0.0]
Continual learning is the ability to acquire new knowledge without forgetting previously learned knowledge.
In this work, we replace the standard neuron with a meta-learned neuron model.
Our approach can memorize dataset-length sequences of training samples, and its learning capabilities generalize to any domain.
arXiv Detail & Related papers (2021-11-03T23:39:14Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Deep Recurrent Encoder: A scalable end-to-end network to model brain signals [122.1055193683784]
We propose an end-to-end deep learning architecture trained to predict the brain responses of multiple subjects at once.
We successfully test this approach on a large cohort of magnetoencephalography (MEG) recordings acquired during a one-hour reading task.
arXiv Detail & Related papers (2021-03-03T11:39:17Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
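The last entry above describes replay samples recalled from the model itself rather than stored. A generic, hedged sketch of that idea follows: random inputs are optimized until the current model confidently predicts the requested labels, and the result is used for rehearsal. The function name, objective, and hyperparameters are assumptions for illustration, not the paper's exact procedure.

```python
# Illustrative sketch of "internal replay": synthesize rehearsal inputs from
# the trained model itself by optimizing random inputs toward target labels.
import torch
import torch.nn.functional as F


def recall_samples(model, input_shape, targets, steps=50, lr=0.1):
    """Optimize random inputs so the current model predicts `targets` (LongTensor of class ids)."""
    was_training = model.training
    model.eval()
    x = torch.randn(len(targets), *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), targets)
        loss.backward()
        opt.step()
    model.zero_grad()           # discard gradients accumulated in the model
    model.train(was_training)
    return x.detach()
```

In a continual learning loop, such recalled inputs would stand in for a stored buffer when rehearsing past behavior.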