Artificial Neural Variability for Deep Learning: On Overfitting, Noise
Memorization, and Catastrophic Forgetting
- URL: http://arxiv.org/abs/2011.06220v3
- Date: Mon, 10 May 2021 12:44:20 GMT
- Title: Artificial Neural Variability for Deep Learning: On Overfitting, Noise
Memorization, and Catastrophic Forgetting
- Authors: Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, and
Masashi Sugiyama
- Abstract summary: Artificial neural variability (ANV) helps artificial neural networks learn some advantages from ``natural'' neural networks.
ANV acts as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
- Score: 135.0863818867184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning is often criticized for two serious issues that
rarely arise in natural nervous systems: overfitting and catastrophic
forgetting. A deep network can even memorize randomly labelled data, in which
there is little knowledge behind the instance-label pairs. When a deep
network continually learns over time by accommodating new tasks, it usually
quickly overwrites the knowledge learned from previous tasks. It is well
known in neuroscience that human brain reactions exhibit substantial
variability even in response to the same stimulus, a phenomenon referred to
as {\it neural variability}; this mechanism balances accuracy and
plasticity/flexibility in the motor learning of natural nervous systems. This
motivates us to design a similar mechanism, named {\it artificial neural
variability} (ANV), which helps artificial neural networks inherit some
advantages of ``natural'' neural networks. We rigorously prove that ANV acts
as an implicit regularizer of the mutual information between the training
data and the learned model. This result theoretically guarantees that ANV
strictly improves generalizability, robustness to label noise, and robustness
to catastrophic forgetting. We then devise a {\it neural variable risk
minimization} (NVRM) framework and {\it neural variable optimizers} to
achieve ANV for conventional network architectures in practice. Empirical
studies demonstrate that NVRM can effectively relieve overfitting, label
noise memorization, and catastrophic forgetting at negligible cost.
\footnote{Code:
\url{https://github.com/zeke-xie/artificial-neural-variability-for-deep-learning}.}
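To make the NVRM idea concrete, below is a minimal sketch of a neural-variable training step in PyTorch. It is a simplified reading of the framework, not the authors' released optimizers (see the repository linked above): at each iteration the weights are perturbed by fresh Gaussian noise of scale sigma, the gradient is evaluated at the perturbed weights, and the update is applied to the unperturbed weights. The function name nvrm_step and the single-sample noise estimate are illustrative assumptions.

```python
# Minimal sketch of a "neural variable" training step (a simplified reading
# of NVRM, not the authors' exact optimizers): perturb the weights with
# Gaussian noise, take the gradient there, update the clean weights.
import torch

def nvrm_step(model, loss_fn, batch, optimizer, sigma=0.01):
    inputs, targets = batch
    # 1) Perturb every parameter with fresh Gaussian noise ("variability").
    noises = []
    with torch.no_grad():
        for p in model.parameters():
            noise = sigma * torch.randn_like(p)
            p.add_(noise)
            noises.append(noise)
    # 2) Evaluate the loss and gradients at the perturbed weights.
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    # 3) Restore the clean weights before the update, so only the gradient
    #    reflects the noisy weights.
    with torch.no_grad():
        for p, noise in zip(model.parameters(), noises):
            p.sub_(noise)
    optimizer.step()
    return loss.item()
```

Setting sigma to zero recovers ordinary empirical risk minimization, so sigma acts as the knob trading training-set fit against the regularization effects described above.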
Related papers
- Hebbian Learning based Orthogonal Projection for Continual Learning of
Spiking Neural Networks [74.3099028063756]
We develop a new method with neuronal operations based on lateral connections and Hebbian learning.
We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities.
Our method enables continual learning for spiking neural networks with nearly zero forgetting.
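The Hebbian/anti-Hebbian extraction of a principal subspace has a classical non-spiking analogue in Oja's subspace rule, sketched below for illustration; this is background, not the paper's spiking implementation, and all dimensions and constants are arbitrary.

```python
# Illustrative Hebbian/anti-Hebbian subspace learning (Oja's subspace rule):
# the Hebbian term y x^T grows weights along directions of high variance,
# while the anti-Hebbian lateral term -y y^T W keeps the rows decorrelated.
import numpy as np

rng = np.random.default_rng(0)
n, k, lr = 20, 3, 1e-3                      # input dim, subspace dim, step size
W = 0.1 * rng.standard_normal((k, n))

C = np.diag(np.linspace(5.0, 0.1, n))       # anisotropic input covariance
for _ in range(20000):
    x = rng.multivariate_normal(np.zeros(n), C)
    y = W @ x                               # feedforward activity
    W += lr * (np.outer(y, x) - np.outer(y, y) @ W)

# The rows of W now approximately span the top-k principal subspace of C.
print(np.round(W @ W.T, 2))                 # close to the identity matrix
```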
arXiv Detail & Related papers (2024-02-19T09:29:37Z)
- Control of synaptic plasticity in neural networks [0.0]
The brain is a nonlinear and highly recurrent neural network (RNN).
The proposed framework involves a new NN-based actor-critic method which is used to simulate error feedback loop systems.
arXiv Detail & Related papers (2023-03-10T13:36:31Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
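As background for what such networks compute, the sketch below shows a single leaky integrate-and-fire (LIF) neuron, the usual building block of spiking networks; it is a textbook toy, not the proposed regression framework, and every constant is illustrative.

```python
# Minimal leaky integrate-and-fire (LIF) neuron: membrane potential leaks
# toward zero, integrates input current, and emits a spike (then resets)
# whenever it crosses the threshold.
import numpy as np

def lif_run(input_current, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    v, spikes = 0.0, []
    for i_t in input_current:
        v += (dt / tau) * (-v + i_t)   # leaky integration of the input
        if v >= v_th:                  # threshold crossing -> spike
            spikes.append(1.0)
            v = v_reset                # hard reset after the spike
        else:
            spikes.append(0.0)
    return np.array(spikes)

spikes = lif_run(np.full(200, 1.5))    # constant drive -> regular spiking
print(f"{int(spikes.sum())} spikes in 200 steps")
```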
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Learning to Modulate Random Weights: Neuromodulation-inspired Neural
Networks For Efficient Continual Learning [1.9580473532948401]
We introduce a novel neural network architecture inspired by neuromodulation in biological nervous systems.
We show that this approach has strong learning performance per task despite the very small number of learnable parameters.
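One way to read "very small number of learnable parameters" is a layer whose weight matrix is frozen at random values while a per-neuron modulatory gain is trained; the PyTorch sketch below illustrates that interpretation. The class name and the per-output-gain design are assumptions, not the paper's architecture.

```python
# Hypothetical "modulated random weights" layer: the weight matrix is fixed
# random (a buffer, never trained); only one gain per output neuron learns.
import torch
import torch.nn as nn

class ModulatedRandomLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.register_buffer("w_fixed", torch.randn(out_features, in_features))
        self.gain = nn.Parameter(torch.ones(out_features))  # learnable

    def forward(self, x):
        # Effective weights: per-row gain times the fixed random matrix.
        return x @ (self.gain.unsqueeze(1) * self.w_fixed).t()

layer = ModulatedRandomLinear(128, 64)
trainable = sum(p.numel() for p in layer.parameters())
print(trainable)  # 64 learnable parameters vs. 8192 frozen random ones
```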
arXiv Detail & Related papers (2022-04-08T21:12:13Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with
Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of sensory pattern data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- Synaptic Metaplasticity in Binarized Neural Networks [4.243926243206826]
Deep neural networks are prone to catastrophic forgetting upon training a new task.
We propose and demonstrate experimentally, in situations of multitask and stream learning, a training technique that reduces catastrophic forgetting without needing previously presented data.
This work bridges computational neuroscience and deep learning, and presents significant assets for future embedded and neuromorphic systems.
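A common form of synaptic metaplasticity for binarized weights damps any update that would flip a well-consolidated weight; the sketch below is a simplified, hypothetical reading of that idea (the damping function 1 - tanh^2 and all constants are assumptions, not the paper's exact rule).

```python
# Metaplastic update sketch for binarized weights: each binary weight is
# sign(h) for a hidden real value h, and |h| encodes consolidation. Updates
# that push h toward zero (i.e., begin to flip the weight) are damped by a
# factor that shrinks as |h| grows, protecting consolidated weights.
import torch

def metaplastic_update(hidden, grad, lr=0.01, m=1.0):
    step = -lr * grad
    toward_zero = torch.sign(step) != torch.sign(hidden)
    damping = 1.0 - torch.tanh(m * hidden) ** 2    # small when |h| is large
    return hidden + torch.where(toward_zero, step * damping, step)

hidden = torch.tensor([2.0, -2.0, 0.1])   # two consolidated, one fragile
grad = torch.tensor([1.0, -1.0, 1.0])     # every update points toward zero
print(metaplastic_update(hidden, grad))   # consolidated weights barely move
```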
arXiv Detail & Related papers (2020-03-07T08:09:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.