Prioritizing Samples in Reinforcement Learning with Reducible Loss
- URL: http://arxiv.org/abs/2208.10483v3
- Date: Wed, 1 Nov 2023 15:06:08 GMT
- Title: Prioritizing Samples in Reinforcement Learning with Reducible Loss
- Authors: Shivakanth Sujit, Somjit Nath, Pedro H. M. Braga, Samira Ebrahimi Kahou
- Abstract summary: We propose a method to prioritize samples based on how much we can learn from a sample.
We develop an algorithm to prioritize samples with high learn-ability, while assigning lower priority to those that are hard-to-learn.
- Score: 5.901819658403315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most reinforcement learning algorithms take advantage of an experience replay
buffer to repeatedly train on samples the agent has observed in the past. Not
all samples carry the same amount of significance and simply assigning equal
importance to each of the samples is a naïve strategy. In this paper, we
propose a method to prioritize samples based on how much we can learn from a
sample. We define the learn-ability of a sample as the steady decrease of the
training loss associated with this sample over time. We develop an algorithm to
prioritize samples with high learn-ability, while assigning lower priority to
those that are hard-to-learn, typically caused by noise or stochasticity. We
empirically show that our method is more robust than random sampling and also
better than just prioritizing with respect to the training loss, i.e. the
temporal difference loss, which is used in prioritized experience replay.
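The prioritization described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the arrays `training_loss` and `holdout_loss` are hypothetical stand-ins for per-sample losses from the online model and from a reference model that was not trained on those samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample losses: from the online (training) model and from
# a reference model not trained on these samples.
training_loss = rng.uniform(0.0, 2.0, size=8)
holdout_loss = rng.uniform(0.0, 2.0, size=8)

# Reducible loss: the part of the training loss the model can still remove.
# The irreducible part (noise/stochasticity) is approximated by the holdout
# loss, so hard-to-learn samples get low priority even when their raw
# training loss is high.
reducible = np.maximum(training_loss - holdout_loss, 0.0)

# Turn priorities into sampling probabilities over the replay buffer.
probs = (reducible + 1e-6) / (reducible + 1e-6).sum()
batch_idx = rng.choice(len(probs), size=4, p=probs, replace=False)
```

Contrast with prioritized experience replay, which would use `training_loss` alone: a sample whose loss stays high because of noise keeps a high priority there, while the reducible-loss view demotes it.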
Related papers
- Non-Uniform Memory Sampling in Experience Replay [1.9580473532948401]
A popular strategy to alleviate catastrophic forgetting is experience replay.
Most approaches assume that sampling from this buffer is uniform by default.
We generate 50 different non-uniform sampling probability weights for each trial and compare their final accuracy to the uniform sampling baseline.
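The comparison above can be sketched as follows; the way the non-uniform weights are generated here is an assumption for illustration, not necessarily the paper's scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
buffer_size = 100
n_trials = 50  # the summary above compares 50 non-uniform weightings

# Uniform baseline: every buffer slot is equally likely.
uniform_p = np.full(buffer_size, 1.0 / buffer_size)

# Hypothetical non-uniform alternatives: random positive weights,
# normalized so each row is a valid probability distribution.
trial_weights = rng.random((n_trials, buffer_size))
trial_weights /= trial_weights.sum(axis=1, keepdims=True)

# Drawing a replay mini-batch under one non-uniform weighting.
batch = rng.choice(buffer_size, size=32, p=trial_weights[0])
```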
arXiv Detail & Related papers (2025-02-16T23:04:16Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, storing raw data is often infeasible in practice due to memory constraints or data privacy issues.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Data Pruning via Moving-one-Sample-out [61.45441981346064]
We propose a novel data-pruning approach called moving-one-sample-out (MoSo).
MoSo aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
arXiv Detail & Related papers (2023-10-23T08:00:03Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrect data), deep neural networks would gradually memorize the label noise and impair model performance.
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- An analysis of over-sampling labeled data in semi-supervised learning with FixMatch [66.34968300128631]
Most semi-supervised learning methods over-sample labeled data when constructing training mini-batches.
This paper studies whether this common practice improves learning and how.
We compare it to an alternative setting where each mini-batch is uniformly sampled from all the training data, labeled or not.
arXiv Detail & Related papers (2022-01-03T12:22:26Z)
- Rethinking Sampling Strategies for Unsupervised Person Re-identification [59.47536050785886]
We analyze the reasons for the performance differences between various sampling strategies under the same framework and loss function.
Group sampling is proposed, which gathers samples from the same class into groups.
Experiments on Market-1501, DukeMTMC-reID and MSMT17 show that group sampling achieves performance comparable to state-of-the-art methods.
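The group-sampling idea above can be sketched roughly as follows. The pseudo-labels are hypothetical (in unsupervised re-identification they would typically come from clustering), and the per-class instance count `k` is an illustrative parameter.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical pseudo-labels, e.g. produced by clustering features.
labels = [0, 1, 0, 2, 1, 2, 0, 1]

# Gather sample indices of the same (pseudo-)class into groups.
groups = defaultdict(list)
for idx, label in enumerate(labels):
    groups[label].append(idx)

# Build a mini-batch with k instances per class, so same-class samples
# appear together rather than scattered by chance.
k = 2
batch = []
for label in sorted(groups):
    batch.extend(random.sample(groups[label], k))
```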
arXiv Detail & Related papers (2021-07-07T05:39:58Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency)
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- Robust Sampling in Deep Learning [62.997667081978825]
Deep learning requires regularization mechanisms to reduce overfitting and improve generalization.
We address this problem by a new regularization method based on distributional robust optimization.
During training, samples are selected according to their loss, such that the worst-performing samples contribute the most to the optimization.
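One way to realize "worst samples contribute most" is a loss-weighted objective. The softmax weighting and temperature below are illustrative assumptions, not necessarily the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample losses in a mini-batch.
losses = rng.uniform(0.0, 3.0, size=16)

# Softmax weighting over losses: the worst-performing samples receive the
# largest weights, so they dominate the robust objective.
temperature = 0.5
w = np.exp(losses / temperature)
w /= w.sum()

robust_loss = float(np.dot(w, losses))  # weighted toward the worst samples
mean_loss = float(losses.mean())
```

Because the weights increase with the losses, the weighted objective is never below the plain average, which is what makes it a pessimistic (robust) surrogate.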
arXiv Detail & Related papers (2020-06-04T09:46:52Z)
- Minority Class Oversampling for Tabular Data with Deep Generative Models [4.976007156860967]
We study the ability of deep generative models to provide realistic samples that improve performance on imbalanced classification tasks via oversampling.
Our experiments show that the choice of sampling method does not affect sample quality, but runtime varies widely.
We also observe that the improvements in performance metrics, while statistically significant, are often minor in absolute terms.
arXiv Detail & Related papers (2020-05-07T21:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.