PLASTIC: Improving Input and Label Plasticity for Sample Efficient
Reinforcement Learning
- URL: http://arxiv.org/abs/2306.10711v3
- Date: Fri, 8 Dec 2023 21:12:47 GMT
- Title: PLASTIC: Improving Input and Label Plasticity for Sample Efficient
Reinforcement Learning
- Authors: Hojoon Lee, Hanseul Cho, Hyunseung Kim, Daehoon Gwak, Joonkee Kim,
Jaegul Choo, Se-Young Yun, Chulhee Yun
- Abstract summary: In Reinforcement Learning (RL), enhancing sample efficiency is crucial.
In principle, off-policy RL algorithms can improve sample efficiency by allowing multiple updates per environment interaction.
However, these multiple updates often cause the model to overfit to earlier interactions, a phenomenon known as the loss of plasticity.
Our study investigates the underlying causes of this phenomenon by dividing plasticity into two aspects.
- Score: 54.409634256153154
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Reinforcement Learning (RL), enhancing sample efficiency is crucial,
particularly in scenarios when data acquisition is costly and risky. In
principle, off-policy RL algorithms can improve sample efficiency by allowing
multiple updates per environment interaction. However, these multiple updates
often lead the model to overfit to earlier interactions, which is referred to
as the loss of plasticity. Our study investigates the underlying causes of this
phenomenon by dividing plasticity into two aspects: input plasticity, which
denotes the model's adaptability to changing input data, and label plasticity,
which denotes the model's adaptability to evolving input-output relationships.
Synthetic experiments on the CIFAR-10 dataset reveal that finding smoother
minima of loss landscape enhances input plasticity, whereas refined gradient
propagation improves label plasticity. Leveraging these findings, we introduce
the PLASTIC algorithm, which harmoniously combines techniques to address both
concerns. With minimal architectural modifications, PLASTIC achieves
competitive performance on benchmarks including Atari-100k and the DeepMind Control
Suite. This result emphasizes the importance of preserving the model's
plasticity to elevate the sample efficiency in RL. The code is available at
https://github.com/dojeon-ai/plastic.
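To make the abstract's two levers concrete, the following is a minimal, hypothetical sketch (not the authors' released implementation). It pairs a sharpness-aware minimization (SAM) style two-step update, one common way to seek smoother minima and hence input plasticity, with LayerNorm and periodic re-initialization of the output head, common remedies for degraded gradient propagation and hence label plasticity. The network, `sam_update`, `reset_head`, and all hyperparameters are illustrative assumptions.

```python
# Hypothetical PyTorch sketch, not the official PLASTIC code.
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Small Q-network; LayerNorm after each hidden layer is an assumption."""

    def __init__(self, obs_dim: int, num_actions: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.trunk(obs))


def sam_update(net: nn.Module, optimizer: torch.optim.Optimizer,
               loss_fn, rho: float = 0.05) -> float:
    """One SAM-style step: climb toward higher loss within an L2 ball of
    radius rho, then descend using the gradient at the perturbed weights."""
    loss = loss_fn()
    optimizer.zero_grad()
    loss.backward()

    params = [p for p in net.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params]))

    # Ascent step: perturb each parameter along its gradient direction.
    perturbations = {}
    with torch.no_grad():
        for p in params:
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations[p] = e

    # Descent step: recompute gradients at the perturbed point, undo the
    # perturbation, then apply the optimizer update with those gradients.
    optimizer.zero_grad()
    loss_fn().backward()
    with torch.no_grad():
        for p, e in perturbations.items():
            p.sub_(e)
    optimizer.step()
    return loss.item()


def reset_head(net: QNetwork) -> None:
    """Periodically re-initialize the output layer, a simple way to restore
    label plasticity when the input-output mapping drifts."""
    net.head.reset_parameters()
```

A training loop would call `sam_update` once per gradient step (allowing several updates per environment interaction) and `reset_head` every fixed number of steps; the reset period and `rho` are assumed hyperparameters that would need tuning per benchmark.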
Related papers
- Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning [19.962212551963383]
Active Learning (AL) allows models to learn interactively from user feedback.
This paper introduces a counterfactual data augmentation approach to AL.
arXiv Detail & Related papers (2024-08-07T14:55:04Z)
- Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers.
At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information carried by the synthetic data.
The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z)
- Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs).
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which consistently demonstrates robust performance with simple and cheap synthesis strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under these simple synthesis strategies, it outperforms existing methods by a large margin and also achieves state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages [56.98243487769916]
Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning.
We propose Adaptive RR, which dynamically adjusts the replay ratio based on the critic's plasticity level (a minimal sketch of this idea appears after the list below).
arXiv Detail & Related papers (2023-10-11T12:05:34Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Deep Reinforcement Learning with Plasticity Injection [37.19742321534183]
Evidence suggests that in deep reinforcement learning (RL), networks gradually lose their plasticity.
Plasticity injection increases network plasticity without changing the number of parameters.
Plasticity injection attains stronger performance than alternative methods.
arXiv Detail & Related papers (2023-05-24T20:41:35Z)
- Contrastive Model Inversion for Data-Free Knowledge Distillation [60.08025054715192]
We propose Contrastive Model Inversion, where the data diversity is explicitly modeled as an optimizable objective.
Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination.
Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI achieves significantly superior performance when the generated data are used for knowledge distillation.
arXiv Detail & Related papers (2021-05-18T15:13:00Z)
- Unveiling the role of plasticity rules in reservoir computing [0.0]
Reservoir Computing (RC) is an appealing approach in Machine Learning.
We analyze the role that plasticity rules play in the changes that lead to better RC performance.
arXiv Detail & Related papers (2021-01-14T19:55:30Z)
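As a companion to the "Revisiting Plasticity in Visual Reinforcement Learning" entry above, the following is a minimal sketch of an adaptive replay-ratio controller. The plasticity signal (here, the fraction of active ReLU units in the critic on a probe batch) and the thresholds are illustrative assumptions, not the authors' exact rule.

```python
# Hypothetical sketch: gate the replay ratio (gradient updates per environment
# step) on a crude plasticity signal measured from the critic.
import torch
import torch.nn as nn


@torch.no_grad()
def fraction_of_active_units(critic: nn.Module, probe_batch: torch.Tensor) -> float:
    """Share of post-ReLU activations that are non-zero on a probe batch."""
    active, total = 0, 0
    hooks = []

    def count(_module, _inputs, output):
        nonlocal active, total
        active += (output > 0).sum().item()
        total += output.numel()

    for module in critic.modules():
        if isinstance(module, nn.ReLU):
            hooks.append(module.register_forward_hook(count))
    critic(probe_batch)
    for hook in hooks:
        hook.remove()
    return active / max(total, 1)


def choose_replay_ratio(plasticity: float, low_rr: int = 1,
                        high_rr: int = 4, threshold: float = 0.3) -> int:
    """Use many updates per environment step only while the critic still
    looks plastic; fall back to a low replay ratio otherwise."""
    return high_rr if plasticity > threshold else low_rr
```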
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.