Low-Precision Reinforcement Learning
- URL: http://arxiv.org/abs/2102.13565v1
- Date: Fri, 26 Feb 2021 16:16:28 GMT
- Title: Low-Precision Reinforcement Learning
- Authors: Johan Bjorck, Xiangyu Chen, Christopher De Sa, Carla P. Gomes, Kilian Q. Weinberger
- Abstract summary: Low-precision training has become a popular approach to reduce computation time, memory footprint, and energy consumption in supervised learning.
In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails.
- Score: 63.930246183244705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-precision training has become a popular approach to reduce computation
time, memory footprint, and energy consumption in supervised learning. In
contrast, this promising approach has not enjoyed similarly widespread adoption
within the reinforcement learning (RL) community, in part because RL agents can
be notoriously hard to train -- even in full precision. In this paper we
consider continuous control with the state-of-the-art SAC agent and demonstrate
that a naïve adaptation of low-precision methods from supervised learning
fails. We propose a set of six modifications, all straightforward to implement,
that leaves the underlying agent unchanged but improves its numerical stability
dramatically. The resulting modified SAC agent has lower memory and compute
requirements while matching full-precision rewards, thus demonstrating the
feasibility of low-precision RL.
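
To make the numerical-stability concern concrete, here is a minimal sketch of generic half-precision (float16) rounding behavior. It is not taken from the paper and does not reproduce any of its six modifications; the function names and the scale factor of 1024 are hypothetical choices for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def underflows_in_fp16(x):
    """Return True if a nonzero value rounds to exactly zero in float16."""
    return x != 0.0 and float(np.float16(x)) == 0.0


def absorbed_in_fp16(base, increment):
    """Return True if base + increment rounds back to base in float16."""
    b = np.float16(base)
    return bool(b + np.float16(increment) == b)


def scaled_roundtrip(x, scale=1024.0):
    """Scale x before the float16 cast, then unscale in float32.

    This mimics the idea behind loss scaling from supervised
    mixed-precision training (an illustrative assumption here, not the
    paper's method): small values are shifted into the representable
    float16 range before the cast.
    """
    return float(np.float32(np.float16(x * scale)) / np.float32(scale))


if __name__ == "__main__":
    # Values below roughly 6e-8 cannot be represented in float16.
    print(underflows_in_fp16(1e-8))
    # Near 1.0 the float16 spacing is about 1e-3, so tiny updates vanish.
    print(absorbed_in_fp16(1.0, 1e-4))
    # Scaling before the cast keeps the small value from flushing to zero.
    print(scaled_roundtrip(1e-8))
```

Small quantities such as squared gradients or slow-moving averages are the kind of values that these two effects (underflow and absorption) can silently erase when everything is stored in float16.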
Related papers
- Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach [0.9549646359252346]
We propose dynamic Learning Rate for deep Reinforcement Learning (LRRL).
LRRL is a meta-learning approach that selects the learning rate based on the agent's performance during training.
Our empirical results demonstrate that LRRL can substantially improve the performance of deep RL algorithms.
arXiv Detail & Related papers (2024-10-16T14:15:28Z)
- Criticality Leveraged Adversarial Training (CLAT) for Boosted Performance via Parameter Efficiency [15.211462468655329]
CLAT introduces parameter efficiency into the adversarial training process, improving both clean accuracy and adversarial robustness.
It can be applied on top of existing adversarial training methods, significantly reducing the number of trainable parameters by approximately 95%.
arXiv Detail & Related papers (2024-08-19T17:58:03Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters.
Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z)
- Augmenting Unsupervised Reinforcement Learning with Self-Reference [63.68018737038331]
Humans possess the ability to draw on past experiences explicitly when learning new tasks.
We propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information.
Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised Reinforcement Learning Benchmark.
arXiv Detail & Related papers (2023-11-16T09:07:34Z)
- Unbiased and Efficient Self-Supervised Incremental Contrastive Learning [31.763904668737304]
We propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of a novel Incremental InfoNCE (NCE-II) loss function.
ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results.
arXiv Detail & Related papers (2023-01-28T06:11:31Z)
- Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning [44.50394347326546]
Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning.
Off-policy bias is corrected in a per-decision manner, but once a trace has been fully cut, the effect cannot be reversed.
We propose a multistep operator that can express both per-decision and trajectory-aware methods.
arXiv Detail & Related papers (2023-01-26T18:57:41Z)
- Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.