Artificial Neural Variability for Deep Learning: On Overfitting, Noise
Memorization, and Catastrophic Forgetting
- URL: http://arxiv.org/abs/2011.06220v3
- Date: Mon, 10 May 2021 12:44:20 GMT
- Title: Artificial Neural Variability for Deep Learning: On Overfitting, Noise
Memorization, and Catastrophic Forgetting
- Authors: Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, and
Masashi Sugiyama
- Abstract summary: Artificial neural variability (ANV) helps artificial neural networks learn some advantages from ``natural'' neural networks.
ANV acts as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
- Score: 135.0863818867184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning is often criticized for two serious issues that
rarely arise in natural nervous systems: overfitting and catastrophic
forgetting. A deep network can even memorize randomly labelled data, in which
there is little knowledge behind the instance-label pairs. When a deep
network continually learns over time by accommodating new tasks, it usually
quickly overwrites the knowledge learned from previous tasks. It is well
known in neuroscience that human brain reactions exhibit substantial
variability even in response to the same stimulus, a phenomenon referred to
as {\it neural variability}; this mechanism balances accuracy and
plasticity/flexibility in the motor learning of natural nervous systems. This
motivates us to design a similar mechanism, named {\it artificial neural
variability} (ANV), which helps artificial neural networks inherit some
advantages of ``natural'' neural networks. We rigorously prove that ANV acts
as an implicit regularizer of the mutual information between the training
data and the learned model. This result theoretically guarantees that ANV
strictly improves generalizability, robustness to label noise, and robustness
to catastrophic forgetting. We then devise a {\it neural variable risk
minimization} (NVRM) framework and {\it neural variable optimizers} to
achieve ANV for conventional network architectures in practice. Empirical
studies demonstrate that NVRM can effectively relieve overfitting, label
noise memorization, and catastrophic forgetting at negligible cost.
\footnote{Code:
\url{https://github.com/zeke-xie/artificial-neural-variability-for-deep-learning}.}
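To make the NVRM idea concrete, below is a minimal sketch of a neural-variable training step in PyTorch. It is a simplified reading of the framework, not the authors' released optimizers (see the repository linked above): at each iteration the weights are perturbed by fresh Gaussian noise of scale sigma, the gradient is evaluated at the perturbed weights, and the update is applied to the unperturbed weights. The function name nvrm_step and the single-sample noise estimate are illustrative assumptions.

```python
# Minimal sketch of a "neural variable" training step (a simplified reading
# of NVRM, not the authors' exact optimizers): perturb the weights with
# Gaussian noise, take the gradient there, update the clean weights.
import torch

def nvrm_step(model, loss_fn, batch, optimizer, sigma=0.01):
    inputs, targets = batch
    # 1) Perturb every parameter with fresh Gaussian noise ("variability").
    noises = []
    with torch.no_grad():
        for p in model.parameters():
            noise = sigma * torch.randn_like(p)
            p.add_(noise)
            noises.append(noise)
    # 2) Evaluate the loss and gradients at the perturbed weights.
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    # 3) Restore the clean weights before the update, so only the gradient
    #    reflects the noisy weights.
    with torch.no_grad():
        for p, noise in zip(model.parameters(), noises):
            p.sub_(noise)
    optimizer.step()
    return loss.item()
```

Setting sigma to zero recovers ordinary empirical risk minimization, so sigma acts as the knob trading training-set fit against the regularization effects described above.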
Related papers
- Hebbian Learning based Orthogonal Projection for Continual Learning of
Spiking Neural Networks [74.3099028063756]
We develop a new method with neuronal operations based on lateral connections and Hebbian learning.
We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities.
Our method enables continual learning for spiking neural networks with nearly zero forgetting.
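The Hebbian/anti-Hebbian extraction of a principal subspace has a classical non-spiking analogue in Oja's subspace rule, sketched below for illustration; this is background, not the paper's spiking implementation, and all dimensions and constants are arbitrary.

```python
# Illustrative Hebbian/anti-Hebbian subspace learning (Oja's subspace rule):
# the Hebbian term y x^T grows weights along directions of high variance,
# while the anti-Hebbian lateral term -y y^T W keeps the rows decorrelated.
import numpy as np

rng = np.random.default_rng(0)
n, k, lr = 20, 3, 1e-3                      # input dim, subspace dim, step size
W = 0.1 * rng.standard_normal((k, n))

C = np.diag(np.linspace(5.0, 0.1, n))       # anisotropic input covariance
for _ in range(20000):
    x = rng.multivariate_normal(np.zeros(n), C)
    y = W @ x                               # feedforward activity
    W += lr * (np.outer(y, x) - np.outer(y, y) @ W)

# The rows of W now approximately span the top-k principal subspace of C.
print(np.round(W @ W.T, 2))                 # close to the identity matrix
```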
arXiv Detail & Related papers (2024-02-19T09:29:37Z)
- Control of synaptic plasticity in neural networks [0.0]
The brain is a nonlinear and highly recurrent neural network (RNN).
The proposed framework involves a new NN-based actor-critic method which is used to simulate error feedback loop systems.
arXiv Detail & Related papers (2023-03-10T13:36:31Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
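As background for what such networks compute, the sketch below shows a single leaky integrate-and-fire (LIF) neuron, the usual building block of spiking networks; it is a textbook toy, not the proposed regression framework, and every constant is illustrative.

```python
# Minimal leaky integrate-and-fire (LIF) neuron: membrane potential leaks
# toward zero, integrates input current, and emits a spike (then resets)
# whenever it crosses the threshold.
import numpy as np

def lif_run(input_current, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    v, spikes = 0.0, []
    for i_t in input_current:
        v += (dt / tau) * (-v + i_t)   # leaky integration of the input
        if v >= v_th:                  # threshold crossing -> spike
            spikes.append(1.0)
            v = v_reset                # hard reset after the spike
        else:
            spikes.append(0.0)
    return np.array(spikes)

spikes = lif_run(np.full(200, 1.5))    # constant drive -> regular spiking
print(f"{int(spikes.sum())} spikes in 200 steps")
```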
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Learning to Modulate Random Weights: Neuromodulation-inspired Neural
Networks For Efficient Continual Learning [1.9580473532948401]
We introduce a novel neural network architecture inspired by neuromodulation in biological nervous systems.
We show that this approach has strong learning performance per task despite the very small number of learnable parameters.
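One way to read "very small number of learnable parameters" is a layer whose weight matrix is frozen at random values while a per-neuron modulatory gain is trained; the PyTorch sketch below illustrates that interpretation. The class name and the per-output-gain design are assumptions, not the paper's architecture.

```python
# Hypothetical "modulated random weights" layer: the weight matrix is fixed
# random (a buffer, never trained); only one gain per output neuron learns.
import torch
import torch.nn as nn

class ModulatedRandomLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.register_buffer("w_fixed", torch.randn(out_features, in_features))
        self.gain = nn.Parameter(torch.ones(out_features))  # learnable

    def forward(self, x):
        # Effective weights: per-row gain times the fixed random matrix.
        return x @ (self.gain.unsqueeze(1) * self.w_fixed).t()

layer = ModulatedRandomLinear(128, 64)
trainable = sum(p.numel() for p in layer.parameters())
print(trainable)  # 64 learnable parameters vs. 8192 frozen random ones
```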
arXiv Detail & Related papers (2022-04-08T21:12:13Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with
Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of sensory pattern data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- Synaptic Metaplasticity in Binarized Neural Networks [4.243926243206826]
Deep neural networks are prone to catastrophic forgetting upon training a new task.
We propose and demonstrate experimentally, in situations of multitask and stream learning, a training technique that reduces catastrophic forgetting without needing previously presented data.
This work bridges computational neuroscience and deep learning, and presents significant assets for future embedded and neuromorphic systems.
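A common form of synaptic metaplasticity for binarized weights damps any update that would flip a well-consolidated weight; the sketch below is a simplified, hypothetical reading of that idea (the damping function 1 - tanh^2 and all constants are assumptions, not the paper's exact rule).

```python
# Metaplastic update sketch for binarized weights: each binary weight is
# sign(h) for a hidden real value h, and |h| encodes consolidation. Updates
# that push h toward zero (i.e., begin to flip the weight) are damped by a
# factor that shrinks as |h| grows, protecting consolidated weights.
import torch

def metaplastic_update(hidden, grad, lr=0.01, m=1.0):
    step = -lr * grad
    toward_zero = torch.sign(step) != torch.sign(hidden)
    damping = 1.0 - torch.tanh(m * hidden) ** 2    # small when |h| is large
    return hidden + torch.where(toward_zero, step * damping, step)

hidden = torch.tensor([2.0, -2.0, 0.1])   # two consolidated, one fragile
grad = torch.tensor([1.0, -1.0, 1.0])     # every update points toward zero
print(metaplastic_update(hidden, grad))   # consolidated weights barely move
```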
arXiv Detail & Related papers (2020-03-07T08:09:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.