Related papers: Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss

URL: http://arxiv.org/abs/2502.01342v1
Date: Mon, 03 Feb 2025 13:34:53 GMT
Title: Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss
Authors: Sangyeon Park, Isaac Han, Seungwon Oh, Kyung-Joong Kim
Abstract summary: Plasticity loss limits a model's ability to adapt to new tasks or shifts in data distribution. This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout to address plasticity loss. We show that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss.
Score: 3.841822016067955
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Plasticity loss, a critical challenge in neural network training, limits a model's ability to adapt to new tasks or shifts in data distribution. This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout, designed to address plasticity loss. Unlike Dropout, AID generates subnetworks by applying Dropout with different probabilities on each preactivation interval. Theoretical analysis reveals that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss. We validate the effectiveness of AID in maintaining plasticity across various benchmarks, including continual learning tasks on standard image classification datasets such as CIFAR10, CIFAR100, and TinyImageNet. Furthermore, we show that AID enhances reinforcement learning performance in the Arcade Learning Environment benchmark.
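
The idea of applying Dropout with a different probability on each preactivation interval can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `interval_dropout`, the `boundaries` and `drop_probs` parameters, and the choice to use the expected value at evaluation time (with no rescaling during training) are all assumptions made for illustration.

```python
import bisect
import random


def interval_dropout(preacts, boundaries, drop_probs, training=True, rng=None):
    """Hypothetical sketch: mask each preactivation with a drop probability
    chosen by the interval the preactivation falls into.

    boundaries: sorted cut points splitting the real line into
                len(boundaries) + 1 intervals (assumed parameterization).
    drop_probs: one drop probability per interval.
    """
    if len(drop_probs) != len(boundaries) + 1:
        raise ValueError("need one drop probability per interval")
    rng = rng or random.Random()
    out = []
    for x in preacts:
        # Index of the interval containing x; a value equal to a cut point
        # is assigned to the interval on its right.
        idx = bisect.bisect_right(boundaries, x)
        keep = 1.0 - drop_probs[idx]
        if training:
            # Keep x with probability `keep`, otherwise output zero.
            out.append(x if rng.random() < keep else 0.0)
        else:
            # Assumed evaluation rule: expected value of the masked output.
            out.append(keep * x)
    return out


# Two intervals split at zero: negative and non-negative preactivations.
relu_like = interval_dropout([-2.0, 3.0], [0.0], [1.0, 0.0])
leaky_like = interval_dropout([-2.0, 4.0], [0.0], [0.9, 0.1], training=False)
```

In this sketch, dropping every negative preactivation and keeping every non-negative one reproduces ReLU, while intermediate probabilities give a piecewise-linear function at evaluation time whose slopes are the keep probabilities.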

Related papers

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures. Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve higher parameter efficiency for network expressivity.
arXiv Detail & Related papers (2025-06-20T17:54:24Z)
Preserving Plasticity in Continual Learning with Adaptive Linearity Injection [10.641213440191551]
Loss of plasticity in deep neural networks is the gradual reduction in a model's capacity to incrementally learn. Recent work has shown that deep linear networks tend to be resilient towards loss of plasticity. We propose Adaptive Linearization (AdaLin), a general approach that dynamically adapts each neuron's activation function to mitigate plasticity loss.
arXiv Detail & Related papers (2025-05-14T15:36:51Z)
PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances global feature representation of point cloud mask autoencoders. We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z)
Learning A Spiking Neural Network for Efficient Image Deraining [20.270365030042623]
We present an Efficient Spiking Deraining Network, called ESDNet. Our work is motivated by the observation that rain pixel values will lead to a more pronounced intensity of spike signals in SNNs. We introduce a gradient proxy strategy to directly train the model for overcoming the challenge of training.
arXiv Detail & Related papers (2024-05-10T07:19:58Z)
Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms. We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
Learning a Low-Rank Feature Representation: Achieving Better Trade-Off between Stability and Plasticity in Continual Learning [20.15493383736196]
In continual learning, networks confront a trade-off between stability and plasticity when trained on a sequence of tasks. We propose a novel training algorithm called LRFR to bolster plasticity without sacrificing stability. Using CIFAR-100 and TinyImageNet as benchmark datasets for continual learning, the proposed approach consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-14T08:34:11Z)
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning [54.409634256153154]
In Reinforcement Learning (RL), enhancing sample efficiency is crucial. In principle, off-policy RL algorithms can improve sample efficiency by allowing multiple updates per environment interaction. Our study investigates the underlying causes of this phenomenon by dividing plasticity into two aspects.
arXiv Detail & Related papers (2023-06-19T06:14:51Z)
Deep Reinforcement Learning with Plasticity Injection [37.19742321534183]
Evidence suggests that in deep reinforcement learning (RL), networks gradually lose their plasticity. Plasticity injection increases the network plasticity without changing the number of parameters. Plasticity injection attains stronger performance compared to alternative methods.
arXiv Detail & Related papers (2023-05-24T20:41:35Z)
A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers. Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module. Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories. We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
Learning a Domain-Agnostic Visual Representation for Autonomous Driving via Contrastive Loss [25.798361683744684]
Domain-Agnostic Contrastive Learning (DACL) is a two-stage unsupervised domain adaptation framework with cyclic adversarial training and contrastive loss. Our proposed approach achieves better performance in the monocular depth estimation task compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-10T07:06:03Z)
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning [97.28695683236981]
More gradient updates decrease the expressivity of the current value network. We demonstrate this phenomenon on Atari and Gym benchmarks, in both offline and online RL settings.
arXiv Detail & Related papers (2020-10-27T17:55:16Z)
Enabling Continual Learning with Differentiable Hebbian Plasticity [18.12749708143404]
Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge. Catastrophic forgetting poses a grand challenge for neural networks performing such a learning process. We propose a Differentiable Hebbian Consolidation model which is composed of a Differentiable Hebbian Plasticity.
arXiv Detail & Related papers (2020-06-30T06:42:19Z)
Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification. Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.