Contextual Latent-Movements Off-Policy Optimization for Robotic
Manipulation Skills
- URL: http://arxiv.org/abs/2010.13766v3
- Date: Fri, 11 Feb 2022 01:49:11 GMT
- Authors: Samuele Tosatto, Georgia Chalvatzaki, Jan Peters
- Abstract summary: We propose a novel view on handling demonstrated trajectories to acquire low-dimensional, non-linear latent dynamics.
We introduce a new contextual off-policy RL algorithm, named LAtent-Movements Policy Optimization (LAMPO).
LAMPO yields more sample-efficient policies than common approaches in the literature.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameterized movement primitives have been extensively used for imitation
learning of robotic tasks. However, the high-dimensionality of the parameter
space hinders the improvement of such primitives in the reinforcement learning
(RL) setting, especially for learning with physical robots. In this paper we
propose a novel view on handling the demonstrated trajectories for acquiring
low-dimensional, non-linear latent dynamics, using mixtures of probabilistic
principal component analyzers (MPPCA) on the movements' parameter space.
Moreover, we introduce a new contextual off-policy RL algorithm, named
LAtent-Movements Policy Optimization (LAMPO). LAMPO can provide gradient
estimates from previous experience using self-normalized importance sampling,
hence, making full use of samples collected in previous learning iterations.
These advantages combined provide a complete framework for sample-efficient
off-policy optimization of movement primitives for robot learning of
high-dimensional manipulation skills. Our experimental results conducted both
in simulation and on a real robot show that LAMPO learns policies more
sample-efficiently than common approaches in the literature.
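The abstract's key reuse mechanism is self-normalized importance sampling (SNIS), which reweights returns collected under earlier policies to estimate the value of the current one. The sketch below is purely illustrative, not the paper's implementation: the function names and the toy Gaussian target/behavior densities are assumptions, and the policy-specific log-densities would come from LAMPO's latent-movement policy in practice.

```python
import numpy as np

def snis_estimate(returns, logp, logq):
    """Estimate E_p[R] from samples drawn under behavior density q,
    using self-normalized importance weights w_i ∝ p(x_i)/q(x_i)."""
    logw = logp - logq            # log importance ratios
    logw -= logw.max()            # subtract max for numerical stability
    w = np.exp(logw)
    w /= w.sum()                  # self-normalization: weights sum to 1
    return float(np.dot(w, returns))

# Toy usage: samples from q = N(0, 1), target p = N(0.5, 1), return R(x) = x,
# so the true target expectation is E_p[R] = 0.5.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100_000)
logq = -0.5 * x**2                # log-density of q up to a constant
logp = -0.5 * (x - 0.5) ** 2      # log-density of p up to a constant
est = snis_estimate(x, logp, logq)
print(est)                        # close to 0.5 with this many samples
```

Because the weights are normalized, unmodeled constants in the log-densities cancel, which is what makes SNIS convenient for reusing samples from all previous learning iterations.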
Related papers
- MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning
We introduce MotionRL, the first approach to utilize Multi-Reward Reinforcement Learning (RL) for optimizing text-to-motion generation tasks.
Our approach uses reinforcement learning to fine-tune the motion generator based on human preferences, drawing on prior knowledge from a human perception model.
In addition, MotionRL introduces a novel multi-objective optimization strategy to approximate optimality between text adherence, motion quality, and human preferences.
arXiv Detail & Related papers (2024-10-09T03:27:14Z)
- Incremental Few-Shot Adaptation for Non-Prehensile Object Manipulation using Parallelizable Physics Simulators
We propose a novel approach for non-prehensile manipulation which iteratively adapts a physics-based dynamics model for model-predictive control.
We adapt the parameters of the model incrementally with a few examples of robot-object interactions.
We evaluate our few-shot adaptation approach in several object pushing experiments in simulation and with a real robot.
arXiv Detail & Related papers (2024-09-20T05:24:25Z)
- Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning
We introduce goal-conditioned autoregressive models to generate crowd behaviors, capturing intricate interactions among individuals.
The model processes potential robot trajectory samples and predicts the reactions of surrounding individuals, enabling proactive robotic navigation in complex scenarios.
arXiv Detail & Related papers (2024-08-07T14:32:41Z)
- Machine Learning Optimized Approach for Parameter Selection in MESHFREE Simulations
Meshfree simulation methods are emerging as compelling alternatives to conventional mesh-based approaches.
We provide a comprehensive overview of our research combining Machine Learning (ML) and Fraunhofer's MESHFREE software.
We introduce a novel ML-optimized approach, using active learning, regression trees, and visualization on MESHFREE simulation data.
arXiv Detail & Related papers (2024-03-20T15:29:59Z)
- Using Implicit Behavior Cloning and Dynamic Movement Primitive to Facilitate Reinforcement Learning for Robot Motion Planning
Reinforcement learning (RL) for robot motion planning suffers from slow training and poor generalizability.
We propose a novel RL-based framework that uses implicit behavior cloning (IBC) and dynamic movement primitive (DMP) to improve the training speed and generalizability of an off-policy RL agent.
arXiv Detail & Related papers (2023-07-29T19:46:09Z)
- A dynamic Bayesian optimized active recommender system for curiosity-driven Human-in-the-loop automated experiments
We present the development of a new type of human-in-the-loop experimental workflow, via a Bayesian optimized active recommender system (BOARS).
This work shows the utility of human-augmented machine learning approaches for curiosity-driven exploration of systems across experimental domains.
arXiv Detail & Related papers (2023-04-05T14:54:34Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Gradient-Based Trajectory Optimization With Learned Dynamics
We use machine learning techniques to learn a differentiable dynamics model of the system from data.
We show that a neural network can model highly nonlinear behaviors accurately for large time horizons.
In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot robot and a radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
- Transformer Inertial Poser: Attention-based Real-time Human Motion Reconstruction from Sparse IMUs
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z)
- Nonprehensile Riemannian Motion Predictive Control
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.