Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss
- URL: http://arxiv.org/abs/2502.01342v1
- Date: Mon, 03 Feb 2025 13:34:53 GMT
- Title: Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss
- Authors: Sangyeon Park, Isaac Han, Seungwon Oh, Kyung-Joong Kim,
- Abstract summary: Plasticity loss limits a model's ability to adapt to new tasks or shifts in data distribution.
This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout to address plasticity loss.
We show that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss.
- Score: 3.841822016067955
- License:
- Abstract: Plasticity loss, a critical challenge in neural network training, limits a model's ability to adapt to new tasks or shifts in data distribution. This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout, designed to address plasticity loss. Unlike Dropout, AID generates subnetworks by applying Dropout with different probabilities on each preactivation interval. Theoretical analysis reveals that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss. We validate the effectiveness of AID in maintaining plasticity across various benchmarks, including continual learning tasks on standard image classification datasets such as CIFAR10, CIFAR100, and TinyImageNet. Furthermore, we show that AID enhances reinforcement learning performance in the Arcade Learning Environment benchmark.
Related papers
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances global feature representation of point cloud mask autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z) - Learning A Spiking Neural Network for Efficient Image Deraining [20.270365030042623]
We present an Efficient Spiking Deraining Network, called ESDNet.
Our work is motivated by the observation that rain pixel values will lead to a more pronounced intensity of spike signals in SNNs.
We introduce a gradient proxy strategy to directly train the model for overcoming the challenge of training.
arXiv Detail & Related papers (2024-05-10T07:19:58Z) - Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z) - Learning a Low-Rank Feature Representation: Achieving Better Trade-Off
between Stability and Plasticity in Continual Learning [20.15493383736196]
In continual learning, networks confront a trade-off between stability and plasticity when trained on a sequence of tasks.
We propose a novel training algorithm called LRFR to bolster plasticity without sacrificing stability.
Using CIFAR-100 and TinyImageNet as benchmark datasets for continual learning, the proposed approach consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-14T08:34:11Z) - PLASTIC: Improving Input and Label Plasticity for Sample Efficient
Reinforcement Learning [54.409634256153154]
In Reinforcement Learning (RL), enhancing sample efficiency is crucial.
In principle, off-policy RL algorithms can improve sample efficiency by allowing multiple updates per environment interaction.
Our study investigates the underlying causes of this phenomenon by dividing plasticity into two aspects.
arXiv Detail & Related papers (2023-06-19T06:14:51Z) - Deep Reinforcement Learning with Plasticity Injection [37.19742321534183]
Evidence suggests that in deep reinforcement learning (RL) networks gradually lose their plasticity.
plasticity injection increases the network plasticity without changing the number of parameters.
plasticity injection attains stronger performance compared to alternative methods.
arXiv Detail & Related papers (2023-05-24T20:41:35Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning a Domain-Agnostic Visual Representation for Autonomous Driving
via Contrastive Loss [25.798361683744684]
Domain-Agnostic Contrastive Learning (DACL) is a two-stage unsupervised domain adaptation framework with cyclic adversarial training and contrastive loss.
Our proposed approach achieves better performance in the monocular depth estimation task compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-10T07:06:03Z) - Implicit Under-Parameterization Inhibits Data-Efficient Deep
Reinforcement Learning [97.28695683236981]
More gradient updates decrease the expressivity of the current value network.
We demonstrate this phenomenon on Atari and Gym benchmarks, in both offline and online RL settings.
arXiv Detail & Related papers (2020-10-27T17:55:16Z) - Enabling Continual Learning with Differentiable Hebbian Plasticity [18.12749708143404]
Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge.
catastrophic forgetting poses a grand challenge for neural networks performing such learning process.
We propose a Differentiable Hebbian Consolidation model which is composed of a Differentiable Hebbian Plasticity.
arXiv Detail & Related papers (2020-06-30T06:42:19Z) - Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal
Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.