Just Round: Quantized Observation Spaces Enable Memory Efficient
Learning of Dynamic Locomotion
- URL: http://arxiv.org/abs/2210.08065v2
- Date: Sat, 22 Apr 2023 16:33:15 GMT
- Authors: Lev Grossman and Brian Plancher
- Abstract summary: Training deep reinforcement learning models is compute and memory intensive.
Observation space quantization reduces overall memory costs by as much as 4.2x without impacting learning performance.
- Score: 0.7106986689736827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) is one of the most powerful tools for
synthesizing complex robotic behaviors. But training DRL models is incredibly
compute and memory intensive, requiring large training datasets and replay
buffers to achieve performant results. This poses a challenge for the next
generation of field robots that will need to learn on the edge to adapt to
their environment. In this paper, we begin to address this issue through
observation space quantization. We evaluate our approach using four simulated
robot locomotion tasks and two state-of-the-art DRL algorithms, the on-policy
Proximal Policy Optimization (PPO) and the off-policy Soft Actor-Critic (SAC), and
find that observation space quantization reduces overall memory costs by as
much as 4.2x without impacting learning performance.
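To make the idea concrete, here is a minimal Python sketch of the core technique: clip, rescale, and round float32 observations into uint8 before they enter the replay buffer, and dequantize on sampling. This is an illustration under assumptions (known per-dimension observation bounds, a uint8 target), not the authors' released code.

```python
import numpy as np

class QuantizedReplayBuffer:
    """Replay buffer that stores rounded uint8 observations (4x smaller
    than float32). Illustrative sketch, not the paper's implementation."""

    def __init__(self, capacity, obs_dim, obs_low, obs_high):
        self.obs_low = np.asarray(obs_low, dtype=np.float32)
        # Map [obs_low, obs_high] onto the 256 levels of uint8.
        self.scale = (np.asarray(obs_high, dtype=np.float32) - self.obs_low) / 255.0
        self.obs = np.zeros((capacity, obs_dim), dtype=np.uint8)
        self.capacity, self.idx, self.full = capacity, 0, False

    def quantize(self, x):
        # "Just round": clip to the known range, rescale, round to integers.
        x = np.clip(x, self.obs_low, self.obs_low + 255.0 * self.scale)
        return np.round((x - self.obs_low) / self.scale).astype(np.uint8)

    def dequantize(self, q):
        return q.astype(np.float32) * self.scale + self.obs_low

    def add(self, obs):
        self.obs[self.idx] = self.quantize(obs)
        self.idx = (self.idx + 1) % self.capacity
        self.full = self.full or self.idx == 0

    def sample(self, batch_size, rng=np.random):
        hi = self.capacity if self.full else self.idx
        return self.dequantize(self.obs[rng.randint(0, hi, size=batch_size)])
```

Storing observations as uint8 instead of float32 shrinks the observation portion of the buffer by 4x, consistent with the up-to-4.2x overall savings reported above.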
Related papers
- PLANRL: A Motion Planning and Imitation Learning Framework to Bootstrap Reinforcement Learning [13.564676246832544]
We introduce PLANRL, a framework that chooses when the robot should use classical motion planning and when it should learn a policy.
PLANRL switches between two modes of operation: reaching a waypoint with classical techniques when far from objects, and fine-grained manipulation control when about to interact with objects.
We evaluate our approach across multiple challenging simulation environments and real-world tasks, demonstrating superior performance in terms of adaptability, efficiency, and generalization compared to existing methods.
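As a rough illustration of the two-mode idea (not PLANRL's actual implementation), a distance-based dispatcher might look like the following; the 5 cm threshold and the plan_to_waypoint/policy callables are hypothetical.

```python
import numpy as np

NEAR_THRESHOLD = 0.05  # meters; assumed switching distance

def select_action(ee_pos, obj_pos, obs, plan_to_waypoint, policy):
    """Dispatch to motion planning or the RL policy based on proximity."""
    if np.linalg.norm(np.asarray(ee_pos) - np.asarray(obj_pos)) > NEAR_THRESHOLD:
        return plan_to_waypoint(ee_pos, obj_pos)   # classical reaching
    return policy(obs)                             # fine-grained learned control
```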
arXiv Detail & Related papers (2024-08-07T19:30:08Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
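For intuition, a generic uniform discretization of a continuous action space is sketched below; note that the paper proposes an adaptive, learned quantization, which this fixed grid does not capture.

```python
import numpy as np

def make_action_grid(low, high, bins):
    """Per-dimension bin centers for a box action space."""
    low, high = np.asarray(low, float), np.asarray(high, float)
    return [np.linspace(l, h, bins) for l, h in zip(low, high)]

def quantize_action(a, grid):
    """Snap a continuous action to the nearest bin center per dimension."""
    return np.array([g[np.argmin(np.abs(g - ai))] for ai, g in zip(a, grid)])

grid = make_action_grid(low=[-1, -1], high=[1, 1], bins=11)
print(quantize_action(np.array([0.13, -0.78]), grid))  # -> [ 0.2 -0.8]
```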
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement Learning [27.00483962026472]
We benchmark 11 offline reinforcement learning algorithms on a realistic quadrupedal locomotion dataset.
Experiments show that the best-performing ORL algorithms can achieve competitive performance compared with model-free RL.
Our proposed benchmark will serve as a development platform for testing and evaluating the performance of ORL algorithms in real-world legged locomotion tasks.
arXiv Detail & Related papers (2023-09-13T13:18:29Z)
- R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous Robotics [9.2327813168753]
This paper presents R^3, a holistic solution for managing timing, memory, and algorithm performance in on-device real-time DRL training.
R^3 employs (i) a deadline-driven feedback loop with dynamic batch sizing for optimizing timing, (ii) efficient memory management to reduce memory footprint and allow larger replay buffer sizes, and (iii) a runtime coordinator guided by runtime analysis and a runtime profiler for adjusting memory resource reservations.
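A toy sketch of mechanism (i), a deadline-driven feedback loop with dynamic batch sizing, follows; the halving/doubling rule, slack factor, and the train_step callable are assumptions, not the paper's controller.

```python
import time

def adapt_batch_size(train_step, batch_size, deadline_s,
                     min_bs=32, max_bs=1024):
    """Run one training step, then grow/shrink the batch to meet the deadline."""
    start = time.monotonic()
    train_step(batch_size)
    elapsed = time.monotonic() - start
    if elapsed > deadline_s:            # missed deadline: back off
        batch_size = max(min_bs, batch_size // 2)
    elif elapsed < 0.5 * deadline_s:    # lots of slack: grow
        batch_size = min(max_bs, batch_size * 2)
    return batch_size
```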
arXiv Detail & Related papers (2023-08-29T05:48:28Z)
- Decision S4: Efficient Sequence-Based RL via State Spaces Layers [87.3063565438089]
We present an off-policy training procedure that works with trajectories, while still maintaining the training efficiency of the S4 model.
We also present an on-policy training procedure that is trained in a recurrent manner, benefits from long-range dependencies, and is based on a novel stable actor-critic mechanism.
arXiv Detail & Related papers (2023-06-08T13:03:53Z)
- Real World Offline Reinforcement Learning with Realistic Data Source [33.7474988142367]
Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience.
Current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories.
In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning.
arXiv Detail & Related papers (2022-10-12T17:57:05Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
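The action-interface change can be pictured as the policy emitting a primitive choice plus its arguments; the primitive library and argument layout below are hypothetical illustrations, not the RAPS library itself.

```python
import numpy as np

PRIMITIVES = {
    0: ("reach", 3),   # (name, number of continuous arguments)
    1: ("grasp", 1),
    2: ("lift",  1),
}

def decode_action(raw):
    """Split a flat policy output into (primitive name, its arguments).

    `raw` = [logits over primitives | shared argument slots]; the chosen
    primitive consumes only as many arguments as it needs.
    """
    n = len(PRIMITIVES)
    prim_id = int(np.argmax(raw[:n]))
    name, n_args = PRIMITIVES[prim_id]
    return name, raw[n:n + n_args]

name, args = decode_action(np.array([0.1, 2.0, -0.3, 0.5, 0.2, 0.9]))
print(name, args)  # grasp [0.5]
```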
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
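The underlying idea, caching frozen-encoder latents in place of raw pixels, can be sketched as follows; this is illustrative, not the SEER codebase, and the 50-dimensional latent is an assumption.

```python
import numpy as np

class LatentReplayBuffer:
    """Stores low-dimensional latents instead of raw pixel frames once the
    encoder is frozen, cutting memory and skipping repeated encoding."""

    def __init__(self, capacity, latent_dim=50):  # 50-dim latent is assumed
        self.latents = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.idx = 0

    def add(self, frame, frozen_encoder):
        # Encode once at insertion time; training batches reuse the latent.
        self.latents[self.idx % len(self.latents)] = frozen_encoder(frame)
        self.idx += 1
```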
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.