Thinking While Moving: Deep Reinforcement Learning with Concurrent
  Control
- URL: http://arxiv.org/abs/2004.06089v4
- Date: Sat, 25 Apr 2020 21:19:45 GMT
- Authors: Ted Xiao, Eric Jang, Dmitry Kalashnikov, Sergey Levine, Julian Ibarz,
  Karol Hausman, Alexander Herzog
- Abstract summary: We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system.
Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed.
- Score: 122.49572467292293
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We study reinforcement learning in settings where sampling an action from the
policy must be done concurrently with the time evolution of the controlled
system, such as when a robot must decide on the next action while still
performing the previous action. Much like a person or an animal, the robot must
think and move at the same time, deciding on its next action before the
previous one has completed. In order to develop an algorithmic framework for
such concurrent control problems, we start with a continuous-time formulation
of the Bellman equations, and then discretize them in a way that is aware of
system delays. We instantiate this new class of approximate dynamic programming
methods via a simple architectural extension to existing value-based deep
reinforcement learning algorithms. We evaluate our methods on simulated
benchmark tasks and a large-scale robotic grasping task where the robot must
"think while moving".
 
      
Related papers
        - Towards Bio-Inspired Robotic Trajectory Planning via Self-Supervised RNN [1.474723404975345]
 Trajectory planning in robotics is understood as generating a sequence of joint configurations that lead a robotic agent from an initial state to the desired final state.
Recent advances demonstrate that trajectory planning can also be performed by supervised sequence learning of trajectories.
We propose a cognitively inspired self-supervised learning scheme based on a recurrent architecture for building a trajectory model.
 arXiv  Detail & Related papers  (2025-07-02T22:05:58Z)
- KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills [50.34487144149439]
 This paper presents a physics-based humanoid control framework, aiming to master highly-dynamic human behaviors such as Kungfu and dancing.
For motion processing, we design a pipeline to extract, filter out, correct, and retarget motions, while ensuring compliance with physical constraints.
For motion imitation, we formulate a bi-level optimization problem to dynamically adjust the tracking accuracy tolerance.
In experiments, we train whole-body control policies to imitate a set of highly-dynamic motions.
 arXiv  Detail & Related papers  (2025-06-15T13:58:53Z)
- Action Flow Matching for Continual Robot Learning [57.698553219660376]
 Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks.
We introduce a generative framework leveraging flow matching for online robot dynamics model alignment.
We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
 arXiv  Detail & Related papers  (2025-04-25T16:26:15Z)
- Simulation-Aided Policy Tuning for Black-Box Robot Learning [47.83474891747279]
 We present a novel black-box policy search algorithm focused on data-efficient policy improvements.
The algorithm learns directly on the robot and treats simulation as an additional information source to speed up the learning process.
We show fast and successful task learning on a robot manipulator with the aid of an imperfect simulator.
 arXiv  Detail & Related papers  (2024-11-21T15:52:23Z)
- Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks [48.54757719504994]
 This paper focuses on improving task success rates while reducing the amount of training data needed.
Our approach introduces a novel method that segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals.
We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms.
 arXiv  Detail & Related papers  (2024-10-01T19:49:56Z)
- Unsupervised Learning of Effective Actions in Robotics [0.9374652839580183]
 Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions.
We propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes".
We evaluate our method on a simulated stair-climbing reinforcement learning task.
 arXiv  Detail & Related papers  (2024-04-03T13:28:52Z)
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
 RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation.
Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer them to the field of robotics.
 arXiv  Detail & Related papers  (2023-11-02T17:59:21Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single
  Demonstration [68.94506047556412]
 We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
 arXiv  Detail & Related papers  (2022-11-09T10:28:40Z)
- Memory-based gaze prediction in deep imitation learning for robot
  manipulation [2.857551605623957]
 The proposed algorithm uses a Transformer-based self-attention architecture for the gaze estimation based on sequential data to implement memory.
The proposed method was evaluated with a real robot multi-object manipulation task that requires memory of the previous states.
 arXiv  Detail & Related papers  (2022-02-10T07:30:08Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
 The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
 arXiv  Detail & Related papers  (2020-12-04T18:59:32Z)
- DREAM Architecture: a Developmental Approach to Open-Ended Learning in
  Robotics [44.62475518267084]
 We present a developmental cognitive architecture to bootstrap this redescription process stage by stage, build new state representations with appropriate motivations, and transfer the acquired knowledge across domains or tasks or even across robots.
 arXiv  Detail & Related papers  (2020-05-13T09:29:40Z)
- On Simple Reactive Neural Networks for Behaviour-Based Reinforcement
  Learning [5.482532589225552]
 We present a behaviour-based reinforcement learning approach, inspired by Brooks' subsumption architecture.
Our working assumption is that a pick and place robotic task can be simplified by leveraging domain knowledge of a robotics developer.
Our approach learns the pick and place task in 8,000 episodes, which represents a drastic reduction in the number of training episodes required by an end-to-end approach.
 arXiv  Detail & Related papers  (2020-01-22T11:49:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     