Modelling continual learning in humans with Hebbian context gating and
exponentially decaying task signals
- URL: http://arxiv.org/abs/2203.11560v1
- Date: Tue, 22 Mar 2022 09:32:06 GMT
- Title: Modelling continual learning in humans with Hebbian context gating and
exponentially decaying task signals
- Authors: Timo Flesch, David G. Nagy, Andrew Saxe, Christopher Summerfield
- Abstract summary: Humans can learn several tasks in succession with minimal mutual interference but perform more poorly when trained on multiple tasks at once.
We propose novel computational constraints for artificial neural networks, that capture the cost of interleaved training and allow the network to learn two tasks in sequence without forgetting.
We found that the "sluggish" units introduce a switch-cost during training, which biases representations under interleaved training towards a joint representation that ignores the contextual cue, while the Hebbian step promotes the formation of a gating scheme from task units to the hidden layer that produces representations which are perfectly guarded against interference.
- Score: 4.205692673448206
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans can learn several tasks in succession with minimal mutual interference
but perform more poorly when trained on multiple tasks at once. The opposite is
true for standard deep neural networks. Here, we propose novel computational
constraints for artificial neural networks, inspired by earlier work on gating
in the primate prefrontal cortex, that capture the cost of interleaved training
and allow the network to learn two tasks in sequence without forgetting. We
augment standard stochastic gradient descent with two algorithmic motifs,
so-called "sluggish" task units and a Hebbian training step that strengthens
connections between task units and hidden units that encode task-relevant
information. We found that the "sluggish" units introduce a switch-cost during
training, which biases representations under interleaved training towards a
joint representation that ignores the contextual cue, while the Hebbian step
promotes the formation of a gating scheme from task units to the hidden layer
that produces orthogonal representations which are perfectly guarded against
interference. Validating the model on previously published human behavioural
data revealed that it matches the performance of participants who had been trained
on blocked or interleaved curricula, and that these performance differences
were driven by misestimation of the true category boundary.
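The two algorithmic motifs lend themselves to a compact sketch. The following is a minimal, assumption-laden reconstruction from the abstract alone: the smoothing constant `alpha`, the Hebbian learning rate, the weight normalisation, and all function names are illustrative choices, not details taken from the paper.

```python
import numpy as np

def sluggish_task_signal(task_ids, n_tasks, alpha=0.8):
    """Exponentially smooth one-hot task cues over trials, so that the cue
    from recent trials 'leaks' into the current one (sluggish task units)."""
    cues = np.zeros((len(task_ids), n_tasks))
    carry = np.zeros(n_tasks)
    for t, task in enumerate(task_ids):
        carry = alpha * carry + (1.0 - alpha) * np.eye(n_tasks)[task]
        cues[t] = carry
    return cues

def hebbian_step(w_task_to_hidden, task_cue, hidden_activity, lr=0.01):
    """Strengthen connections between active task units and the hidden units
    they co-occur with (outer-product Hebbian update plus column normalisation)."""
    w = w_task_to_hidden + lr * np.outer(task_cue, hidden_activity)
    return w / np.maximum(np.linalg.norm(w, axis=0, keepdims=True), 1e-8)

# Blocked curriculum: 100 trials of task 0, then 100 trials of task 1.
blocked = [0] * 100 + [1] * 100
# Interleaved curriculum: the two tasks alternate on every trial.
interleaved = [0, 1] * 100

cues_blocked = sluggish_task_signal(blocked, n_tasks=2)
cues_interleaved = sluggish_task_signal(interleaved, n_tasks=2)
# Under interleaving the smoothed cue stays mixed (close to [0.5, 0.5]),
# so the network receives little usable context -- the hypothesised switch cost.
```

In this toy setup, blocked training lets the smoothed cue saturate towards the current task's one-hot vector, which is what allows a Hebbian step to associate each task unit with the hidden units carrying task-relevant activity.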
Related papers
- Negotiated Representations to Prevent Forgetting in Machine Learning Applications [0.0]
Catastrophic forgetting is a significant challenge in the field of machine learning.
We propose a novel method for preventing catastrophic forgetting in machine learning applications.
arXiv Detail & Related papers (2023-11-30T22:43:50Z)
- Look-Ahead Selective Plasticity for Continual Learning of Visual Tasks [9.82510084910641]
We propose a new mechanism that takes place at task boundaries, i.e., when one task finishes and another starts.
We evaluate the proposed methods on benchmark computer vision datasets including CIFAR10 and TinyImagenet.
arXiv Detail & Related papers (2023-11-02T22:00:23Z)
- MENTOR: Human Perception-Guided Pretraining for Increased Generalization [5.596752018167751]
We introduce MENTOR (huMan pErceptioN-guided preTraining fOr increased geneRalization).
We train an autoencoder to learn human saliency maps given an input image, without class labels.
We remove the decoder part, add a classification layer on top of the encoder, and fine-tune this new model conventionally.
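As a rough illustration of the two-stage recipe described above, here is a minimal sketch assuming a toy convolutional encoder/decoder and random stand-in data; the layer sizes, optimiser settings, and saliency-map format are hypothetical and not taken from the MENTOR paper.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(                      # toy backbone (hypothetical sizes)
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(                      # reconstructs a 1-channel saliency map
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
)

# Stage 1: train the autoencoder to predict human saliency maps, no class labels.
images = torch.rand(8, 3, 64, 64)             # stand-in batch of input images
saliency = torch.rand(8, 1, 64, 64)           # stand-in human saliency maps
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
loss = nn.functional.mse_loss(decoder(encoder(images)), saliency)
opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: discard the decoder, add a classification head, and fine-tune conventionally.
classifier = nn.Sequential(encoder, nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))
labels = torch.randint(0, 10, (8,))
clf_opt = torch.optim.Adam(classifier.parameters(), lr=1e-4)
clf_loss = nn.functional.cross_entropy(classifier(images), labels)
clf_opt.zero_grad(); clf_loss.backward(); clf_opt.step()
```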
arXiv Detail & Related papers (2023-10-30T13:50:44Z)
- ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt [67.8934749027315]
We propose a unified framework for graph hybrid pre-training which injects the task identification and position identification into GNNs.
We also propose a novel pre-training paradigm based on a group of $k$-nearest neighbors.
arXiv Detail & Related papers (2023-10-23T12:11:13Z)
- On the training and generalization of deep operator networks [11.159056906971983]
We present a novel training method for deep operator networks (DeepONets).
DeepONets are constructed by two sub-networks.
We establish the width error estimate in terms of input data.
arXiv Detail & Related papers (2023-09-02T21:10:45Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- On the relationship between disentanglement and multi-task learning [62.997667081978825]
We take a closer look at the relationship between disentanglement and multi-task learning based on hard parameter sharing.
We show that disentanglement appears naturally during the process of multi-task neural network training.
arXiv Detail & Related papers (2021-10-07T14:35:34Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains help improve the learning performance on each task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Feature Purification: How Adversarial Training Performs Robust Deep Learning [66.05472746340142]
We present a principle that we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training process of a neural network.
We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z)
- Improved Noise and Attack Robustness for Semantic Segmentation by Using Multi-Task Training with Self-Supervised Depth Estimation [39.99513327031499]
We propose to improve robustness via multi-task training, which extends supervised semantic segmentation with self-supervised monocular depth estimation on unlabeled videos.
We show the effectiveness of our method on the Cityscapes dataset, where our multi-task training approach consistently outperforms the single-task semantic segmentation baseline.
arXiv Detail & Related papers (2020-04-23T11:03:56Z)
- Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction.
We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly.
arXiv Detail & Related papers (2020-04-01T09:31:10Z)
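The bit-plane split in the last entry above can be made concrete with a small sketch. This is an illustrative reconstruction from the one-sentence summary only; the function name and the choice of keeping the top four planes are assumptions, not the paper's actual configuration.

```python
import numpy as np

def split_bit_planes(img, k=4):
    """Split an 8-bit image into a coarse part built from the k most significant
    bit planes and a residual holding the remaining low planes (k=4 is arbitrary)."""
    img = img.astype(np.uint8)
    coarse = img & np.uint8((0xFF << (8 - k)) & 0xFF)   # high planes -> coarse impression
    residual = img & np.uint8(0xFF >> k)                # low planes -> refinement only
    return coarse, residual

patch = np.array([[200, 31], [128, 255]], dtype=np.uint8)   # toy grayscale patch
coarse, residual = split_bit_planes(patch, k=4)
# coarse == [[192, 16], [128, 240]], residual == [[8, 15], [0, 15]]
```

Enforcing consistent representations between the full image and its coarse, quantised version is what the entry above credits for the robustness gain.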
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information shown here and is not responsible for any consequences arising from its use.