Action Noise in Off-Policy Deep Reinforcement Learning: Impact on
Exploration and Performance
- URL: http://arxiv.org/abs/2206.03787v3
- Date: Mon, 5 Jun 2023 16:21:56 GMT
- Title: Action Noise in Off-Policy Deep Reinforcement Learning: Impact on
Exploration and Performance
- Authors: Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo,
Justus Piater
- Abstract summary: We analyze how the learned policy is impacted by the noise type, noise scale, and scaling-factor reduction schedule.
We consider the two most prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and perform a vast experimental campaign.
We conclude that the best noise type and scale are environment dependent, and based on our observations derive rules for guiding the choice of the action noise.
- Score: 5.573543601558405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many Deep Reinforcement Learning (D-RL) algorithms rely on simple forms of
exploration such as the additive action noise often used in continuous control
domains. Typically, the scaling factor of this action noise is chosen as a
hyper-parameter and is kept constant during training. In this paper, we focus
on action noise in off-policy deep reinforcement learning for continuous
control. We analyze how the learned policy is impacted by the noise type, noise
scale, and scaling-factor reduction schedule. We consider the two most
prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and
perform a vast experimental campaign by systematically varying the noise type
and scale parameter, and by measuring variables of interest like the expected
return of the policy and the state-space coverage during exploration. For the
latter, we propose a novel state-space coverage measure
$\operatorname{X}_{\mathcal{U}\text{rel}}$ that is more robust to estimation
artifacts caused by points close to the state-space boundary than
previously-proposed measures. Larger noise scales generally increase
state-space coverage. However, we found that increasing the space coverage
using a larger noise scale is often not beneficial. On the contrary, reducing
the noise scale over the training process reduces the variance and generally
improves the learning performance. We conclude that the best noise type and
scale are environment dependent, and based on our observations derive heuristic
rules for guiding the choice of the action noise as a starting point for
further optimization.
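The two noise processes and the scale-reduction schedule are simple to state concretely. Below is a minimal sketch, not the authors' code: it assumes the standard Euler-Maruyama discretization of the Ornstein-Uhlenbeck process commonly used in DDPG-style agents, a simple linear decay for the scaling factor, and illustrative class and parameter names (theta, sigma, dt) that do not come from the paper.

```python
import numpy as np

class GaussianActionNoise:
    """i.i.d. Gaussian noise: eps_t ~ N(0, sigma^2 I), independent across steps."""
    def __init__(self, action_dim, sigma=0.1, seed=None):
        self.action_dim, self.sigma = action_dim, sigma
        self.rng = np.random.default_rng(seed)

    def reset(self):
        pass  # no temporal state to reset

    def sample(self):
        return self.sigma * self.rng.standard_normal(self.action_dim)


class OrnsteinUhlenbeckActionNoise:
    """Temporally correlated noise via the Euler-Maruyama discretization:
    x_{t+1} = x_t + theta * (mu - x_t) * dt + sigma * sqrt(dt) * N(0, I)."""
    def __init__(self, action_dim, sigma=0.1, theta=0.15, dt=1e-2, seed=None):
        self.mu = np.zeros(action_dim)
        self.sigma, self.theta, self.dt = sigma, theta, dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.x = self.mu.copy()  # restart the process, e.g. at episode boundaries

    def sample(self):
        self.x = self.x + self.theta * (self.mu - self.x) * self.dt \
            + self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(len(self.mu))
        return self.x


def linear_scale(step, total_steps, start=1.0, end=0.0):
    """Linearly reduce the noise scaling factor over the course of training."""
    return start + min(step / total_steps, 1.0) * (end - start)


# Usage: scale the sampled noise before adding it to the deterministic action.
noise = OrnsteinUhlenbeckActionNoise(action_dim=4, sigma=0.3, seed=0)
action = np.zeros(4)  # stand-in for the deterministic policy output
for step in range(1000):
    noisy_action = np.clip(action + linear_scale(step, 1000) * noise.sample(), -1.0, 1.0)
```

Gaussian noise is uncorrelated across steps, whereas OU noise drifts smoothly in time; the linear schedule implements the kind of scale reduction the abstract reports as generally beneficial. The exact definition of the coverage measure $\operatorname{X}_{\mathcal{U}\text{rel}}$ is given in the paper and is not reproduced here.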
Related papers
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - Understanding the Effect of Noise in LLM Training Data with Algorithmic
Chains of Thought [0.0]
We study how noise in chains of thought impacts task performance in a highly controlled setting.
We define two types of noise: static noise, a local form of noise applied after the CoT trace is computed, and dynamic noise, a global form of noise that propagates errors through the trace as it is computed (the toy sketch below illustrates the distinction).
We find that fine-tuned models are extremely robust to high levels of static noise but struggle significantly more with lower levels of dynamic noise.
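As a toy illustration of the two noise types, consider an algorithmic CoT that records running partial sums. This is a hypothetical sketch, not the paper's actual task or code; the function names are invented for illustration. Static noise corrupts entries of the finished trace, while dynamic noise corrupts an intermediate result during computation so the error carries into every later step.

```python
import random

def clean_trace(xs):
    """Algorithmic CoT for a running sum: step t records the partial sum of xs[:t+1]."""
    trace, total = [], 0
    for x in xs:
        total += x
        trace.append(total)
    return trace

def static_noise(trace, p=0.2):
    """Local noise: corrupt entries of an already-computed trace; errors stay isolated."""
    return [t + random.choice([-1, 1]) if random.random() < p else t for t in trace]

def dynamic_noise(xs, p=0.2):
    """Global noise: corrupt intermediate results while computing; errors propagate."""
    trace, total = [], 0
    for x in xs:
        total += x
        if random.random() < p:
            total += random.choice([-1, 1])  # the perturbed total feeds all later steps
        trace.append(total)
    return trace

xs = [3, 1, 4, 1, 5]
print(clean_trace(xs), static_noise(clean_trace(xs)), dynamic_noise(xs))
```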
arXiv Detail & Related papers (2024-02-06T13:59:56Z) - Universal Noise Annotation: Unveiling the Impact of Noisy annotation on
Object Detection [36.318411642128446]
We propose Universal-Noise Annotation (UNA), a more practical setting that encompasses all types of noise that can occur in object detection.
We analyze the development of previous detection algorithms and examine the factors that affect the robustness of detection-model training.
We open-source the code for injecting UNA into datasets; all training logs and weights are also shared.
arXiv Detail & Related papers (2023-12-21T13:12:37Z) - Understanding and Mitigating the Label Noise in Pre-training on
Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a lightweight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z) - Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
To tackle this overfitting, we propose imposing a lower bound on the training loss.
arXiv Detail & Related papers (2023-07-24T19:41:19Z) - Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
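A hypothetical sketch of the idea, not the authors' implementation: reuse temporally correlated (OU-style) noise, but add it to the policy network's latent features rather than to the output action, so the perturbation is shaped by the downstream layer. The tiny network and all names below are toy placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def ou_step(x, theta=0.15, sigma=0.2, dt=1e-2):
    """One Euler-Maruyama step of a zero-mean OU process."""
    return x + theta * (0.0 - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)

# Tiny deterministic policy: obs -> latent -> action.
W1, W2 = rng.normal(size=(8, 4)), rng.normal(size=(2, 8))
latent_noise = np.zeros(8)

def act(obs):
    global latent_noise
    latent = np.tanh(W1 @ obs)
    latent_noise = ou_step(latent_noise)           # correlated across time steps
    return np.tanh(W2 @ (latent + latent_noise))   # noise is mapped through the last layer

print(act(np.ones(4)))
```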
arXiv Detail & Related papers (2023-05-31T17:40:43Z) - Boundary-Denoising for Video Activity Localization [57.9973253014712]
We study the video activity localization problem from a denoising perspective.
Specifically, we propose an encoder-decoder model named DenoiseLoc.
Experiments show that DenoiseLoc advances performance in several video activity understanding tasks.
arXiv Detail & Related papers (2023-04-06T08:48:01Z) - Combating Noise: Semi-supervised Learning by Region Uncertainty
Quantification [55.23467274564417]
Current methods are easily distracted by noisy regions generated by pseudo labels.
We propose noise-resistant semi-supervised learning by quantifying the region uncertainty.
Experiments on both PASCAL VOC and MS COCO demonstrate the extraordinary performance of our method.
arXiv Detail & Related papers (2021-11-01T13:23:42Z) - On Dynamic Noise Influence in Differentially Private Learning [102.6791870228147]
Private Gradient Descent (PGD) is a commonly used private learning framework, which adds noise to gradients according to the differential privacy protocol.
Recent studies show that dynamic privacy schedules can improve performance at the final iteration, yet the theoretical understanding of the effectiveness of such schedules remains limited.
This paper provides a comprehensive analysis of noise influence in dynamic privacy schedules to answer these critical questions.
arXiv Detail & Related papers (2021-01-19T02:04:00Z) - Disturbances in Influence of a Shepherding Agent is More Impactful than
Sensorial Noise During Swarm Guidance [0.2624902795082451]
The impact of noise on shepherding is not a well-studied problem.
First, we evaluate noise in the sensorial information received by the shepherd about the location of sheep.
Second, we evaluate noise in the ability of the sheepdog to influence sheep due to disturbance forces occurring during actuation.
arXiv Detail & Related papers (2020-08-28T15:40:40Z)