Fast & Slow Learning: Incorporating Synthetic Gradients in Neural Memory Controllers
- URL: http://arxiv.org/abs/2011.05438v1
- Date: Tue, 10 Nov 2020 22:44:27 GMT
- Title: Fast & Slow Learning: Incorporating Synthetic Gradients in Neural Memory Controllers
- Authors: Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
- Abstract summary: We propose to decouple the learning process of the NMN controllers to allow them to achieve flexible, rapid adaptation in the presence of new information.
This trait is highly beneficial for meta-learning tasks where the memory controllers must quickly grasp abstract concepts in the target domain, and adapt stored knowledge.
- Score: 41.59845953349713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Memory Networks (NMNs) have received increased attention in recent
years compared to deep architectures that use a constrained memory. Despite
their new appeal, the success of NMNs hinges on the ability of the
gradient-based optimiser to perform incremental training of the NMN
controllers, determining how to leverage their high capacity for knowledge
retrieval. This means that while excellent performance can be achieved when the
training data is consistent and well distributed, rare data samples are hard to
learn from as the controllers fail to incorporate them effectively during model
training. Drawing inspiration from the human cognition process, in particular
the utilisation of neuromodulators in the human brain, we propose to decouple
the learning process of the NMN controllers to allow them to achieve flexible,
rapid adaptation in the presence of new information. This trait is highly
beneficial for meta-learning tasks where the memory controllers must quickly
grasp abstract concepts in the target domain, and adapt stored knowledge. This
allows the NMN controllers to quickly determine which memories are to be
retained and which are to be erased, and swiftly adapt their strategy to the
new task at hand. Through both quantitative and qualitative evaluations on
multiple public benchmarks, including classification and regression tasks, we
demonstrate the utility of the proposed approach. Our evaluations not only
highlight the ability of the proposed NMN architecture to outperform the
current state-of-the-art methods, but also provide insights on how the proposed
augmentations help achieve such superior results. In addition, we demonstrate
the practical implications of the proposed learning strategy, where the
feedback path can be shared among multiple neural memory networks as a
mechanism for knowledge sharing.
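In general terms, synthetic gradients decouple a module from the full backward pass by training a small auxiliary network to predict the gradient that module would eventually receive, so the module can update immediately ("fast") while the predictor itself is corrected against the true gradient ("slow"). The abstract does not spell out the exact architecture, so the sketch below is only a minimal, generic PyTorch illustration of that idea; the names (SyntheticGradient, controller, head) and dimensions are assumptions, not the authors' implementation.

```python
# Minimal sketch (assumption: PyTorch; names and sizes are illustrative only).
# A synthetic-gradient module predicts dL/dh at a controller's output so the
# controller can be updated without waiting for the true backward pass
# (the general idea of decoupled neural interfaces / synthetic gradients).
import torch
import torch.nn as nn

class SyntheticGradient(nn.Module):
    """Predicts the gradient of the loss w.r.t. the controller output h."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, h):
        return self.net(h)

dim = 64
controller = nn.GRUCell(dim, dim)          # stand-in for an NMN controller
sg = SyntheticGradient(dim)                # the "fast" feedback path
head = nn.Linear(dim, 10)                  # downstream task head
opt_ctrl = torch.optim.Adam(controller.parameters(), lr=1e-3)
opt_sg = torch.optim.Adam(sg.parameters(), lr=1e-3)
opt_head = torch.optim.Adam(head.parameters(), lr=1e-3)

x, state = torch.randn(8, dim), torch.zeros(8, dim)
y = torch.randint(0, 10, (8,))

# 1) Fast update: train the controller from the *predicted* gradient.
h = controller(x, state)
grad_hat = sg(h).detach()
opt_ctrl.zero_grad()
h.backward(grad_hat)                       # inject the synthetic gradient at h
opt_ctrl.step()

# 2) Slow update: compute the true gradient at h, regress the predictor onto it.
h = controller(x, state).detach().requires_grad_(True)
loss = nn.functional.cross_entropy(head(h), y)
opt_head.zero_grad()
loss.backward()
true_grad = h.grad.detach()
sg_loss = nn.functional.mse_loss(sg(h.detach()), true_grad)
opt_sg.zero_grad()
sg_loss.backward()
opt_sg.step()
opt_head.step()
```

Because the fast path needs only the predictor's output, a single feedback module of this kind could in principle serve several controllers, which is the knowledge-sharing mechanism the abstract alludes to.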
Related papers
- Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps [56.827895559823126]
The self-organizing map (SOM) is a neural model often used for clustering and dimensionality reduction.
We propose a generalization of the SOM, the continual SOM, which is capable of online unsupervised learning under a low memory budget.
Our results on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST show nearly a two-fold increase in accuracy; a sketch of the classic SOM update it builds on follows this entry.
arXiv Detail & Related papers (2024-02-19T19:11:22Z)
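For background, the classic online SOM update picks the best-matching unit for each sample and pulls its grid neighbours towards that sample. The sketch below illustrates only that standard algorithm, not the continual SOM proposed in the paper above; map size, learning rate, and neighbourhood width are assumed values.

```python
# Minimal sketch of the classic online SOM update (NumPy). Standard algorithm
# only; grid size, learning rate, and neighbourhood width are illustrative.
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, dim = 10, 10, 784          # 10x10 map over e.g. flattened MNIST
weights = rng.normal(size=(grid_h * grid_w, dim))
coords = np.array([(i, j) for i in range(grid_h) for j in range(grid_w)], float)

def som_step(x, weights, lr=0.1, sigma=2.0):
    """One online update: find the best-matching unit, pull neighbours towards x."""
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))     # winning unit
    dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)      # grid distances
    h = np.exp(-dist2 / (2 * sigma ** 2))                    # neighbourhood kernel
    weights += lr * h[:, None] * (x - weights)               # weighted pull towards x
    return weights

for x in rng.normal(size=(100, dim)):       # stand-in for a data stream
    weights = som_step(x, weights)
```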
- Accelerating Neural Network Training: A Brief Review [0.5825410941577593]
This study examines innovative approaches to expedite the training process of deep neural networks (DNNs).
The research utilizes sophisticated methodologies, including Gradient Accumulation (GA), Automatic Mixed Precision (AMP), and Pin Memory (PM); a brief sketch of the first two follows this entry.
arXiv Detail & Related papers (2023-12-15T18:43:45Z)
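Gradient accumulation and automatic mixed precision are standard PyTorch training techniques; the sketch below shows one common way to combine them, with manually pinned host memory standing in for a DataLoader's pin_memory option. The model, data, and accumulation factor are placeholders, a CUDA device is assumed, and this is not code from the reviewed study.

```python
# Minimal sketch: gradient accumulation + automatic mixed precision in PyTorch.
# Assumes a CUDA device; model, data, and accumulation factor are placeholders.
import torch
import torch.nn as nn

model = nn.Linear(128, 10).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()
accum_steps = 4                                   # effective batch = 4 x micro-batch

data = [(torch.randn(32, 128), torch.randint(0, 10, (32,))) for _ in range(16)]

opt.zero_grad()
for step, (x, y) in enumerate(data):
    # Pinning host memory (normally done by the DataLoader) lets the
    # host-to-device copy run asynchronously.
    x = x.pin_memory().cuda(non_blocking=True)
    y = y.pin_memory().cuda(non_blocking=True)
    with torch.cuda.amp.autocast():               # mixed-precision forward pass
        loss = nn.functional.cross_entropy(model(x), y) / accum_steps
    scaler.scale(loss).backward()                 # accumulate scaled gradients
    if (step + 1) % accum_steps == 0:
        scaler.step(opt)                          # unscale gradients + optimizer step
        scaler.update()
        opt.zero_grad()
```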
- Real-Time Progressive Learning: Accumulate Knowledge from Control with Neural-Network-Based Selective Memory [2.8638167607890836]
A radial basis function (RBF) neural network based learning control scheme, named real-time progressive learning (RTPL), is proposed.
RTPL learns the unknown dynamics of the system with guaranteed stability and closed-loop performance; a generic RBF-network sketch follows this entry.
arXiv Detail & Related papers (2023-08-08T12:39:57Z)
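As background on the function-approximation component only, the sketch below fits a plain Gaussian RBF network by least squares; it does not reproduce the RTPL control scheme or its stability guarantees, and every name and hyperparameter is an assumed placeholder.

```python
# Minimal sketch of a Gaussian radial basis function (RBF) network fitted by
# least squares. Generic background only -- not the RTPL scheme itself.
import numpy as np

rng = np.random.default_rng(0)

def rbf_features(x, centers, sigma=0.5):
    """Gaussian RBF features: phi_i(x) = exp(-||x - c_i||^2 / (2 sigma^2))."""
    d2 = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# Unknown 1-D mapping to approximate (illustrative stand-in for system dynamics).
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(2 * x[:, 0]) + 0.05 * rng.normal(size=200)

centers = np.linspace(-3, 3, 20).reshape(-1, 1)     # fixed RBF centres
phi = rbf_features(x, centers)                      # (200, 20) design matrix
w, *_ = np.linalg.lstsq(phi, y, rcond=None)         # output-layer weights

y_hat = rbf_features(x, centers) @ w
print("mean squared error:", np.mean((y - y_hat) ** 2))
```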
- Minimizing Control for Credit Assignment with Strong Feedback [65.59995261310529]
Current methods for gradient-based credit assignment in deep neural networks require infinitesimally small feedback signals.
We combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization.
We show that the use of strong feedback in Deep Feedback Control (DFC) allows learning forward and feedback connections simultaneously, using a learning rule that is fully local in space and time.
arXiv Detail & Related papers (2022-04-14T22:06:21Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously seen samples, perform knowledge distillation, or use regularization techniques towards this goal; a minimal replay-buffer sketch follows this entry.
We propose to activate and select only sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
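Experience replay, mentioned above, is one of the simplest continual-learning baselines: keep a small reservoir of past samples and mix them into every new batch. The sketch below illustrates that generic baseline (reservoir sampling plus mixed batches), not the Bayesian sparse-network method proposed in the paper; the training loop itself is omitted.

```python
# Minimal sketch of an experience-replay buffer for continual learning
# (reservoir sampling + mixed batches). Generic baseline, not the paper's method.
import random

class ReplayBuffer:
    def __init__(self, capacity=500):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, sample):
        """Reservoir sampling keeps a uniform subsample of everything seen so far."""
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = sample

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

buffer = ReplayBuffer(capacity=500)
for task_id in range(3):                           # stream of tasks
    for step in range(100):
        new_batch = [(task_id, step)]              # placeholder "samples"
        replayed = buffer.sample(8)                # mix in old samples
        train_batch = new_batch + replayed         # train on the union (loop omitted)
        for s in new_batch:
            buffer.add(s)
```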
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- CosSGD: Nonlinear Quantization for Communication-efficient Federated Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring local data on these clients to a central server.
We propose a nonlinear quantization for compressed gradient descent, which can be easily utilized in federated learning; a generic gradient-quantization sketch follows this entry.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining the convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z)
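To make the communication saving concrete, the sketch below applies uniform stochastic quantization to a gradient vector, cutting it to a few bits per entry plus two floats of metadata. It illustrates gradient compression in general rather than CosSGD's nonlinear scheme; the bit width and vector size are assumptions.

```python
# Minimal sketch of uniform stochastic gradient quantization (NumPy).
# Illustrates generic gradient compression, not CosSGD's nonlinear scheme.
import numpy as np

rng = np.random.default_rng(0)

def quantize(grad, bits=4):
    """Map each entry to one of 2**bits levels between min and max, with
    stochastic rounding so the quantizer is unbiased in expectation."""
    levels = 2 ** bits - 1
    lo, hi = grad.min(), grad.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    t = (grad - lo) / scale                       # position in [0, levels]
    q = np.floor(t + rng.random(grad.shape))      # stochastic rounding
    return q.astype(np.uint8), lo, scale          # small ints + two floats to transmit

def dequantize(q, lo, scale):
    return q.astype(np.float64) * scale + lo

grad = rng.normal(size=10_000)                    # stand-in for a client gradient
q, lo, scale = quantize(grad, bits=4)
rec = dequantize(q, lo, scale)
print("bytes: %d -> %d, max error: %.4f"
      % (grad.nbytes, q.nbytes + 16, np.abs(grad - rec).max()))
```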
- Neuromodulated Neural Architectures with Local Error Signals for Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically-inspired, lightweight neural network architecture that incorporates local learning and neuromodulation.
We demonstrate the efficacy of our approach in both single-task and continual-learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
- Dynamic Knowledge embedding and tracing [18.717482292051788]
We propose a novel approach to knowledge tracing that combines techniques from matrix factorization with recent progress in recurrent neural networks (RNNs).
The proposed DynEmb framework enables the tracking of student knowledge even without concept/skill tag information.
arXiv Detail & Related papers (2020-05-18T21:56:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.