Learning Contraction Policies from Offline Data
- URL: http://arxiv.org/abs/2112.05911v1
- Date: Sat, 11 Dec 2021 03:48:51 GMT
- Title: Learning Contraction Policies from Offline Data
- Authors: Navid Rezazadeh and Maxwell Kolarich and Solmaz S. Kia and Negar Mehr
- Abstract summary: We propose a data-driven method for learning convergent control policies from offline data using Contraction theory.
We learn the control policy and its corresponding contraction metric while enforcing contraction.
We evaluate the performance of our proposed framework on simulated robotic goal-reaching tasks.
- Score: 1.5771347525430772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a data-driven method for learning convergent control
policies from offline data using Contraction theory. Contraction theory enables
constructing a policy that makes the closed-loop system trajectories inherently
convergent towards a unique trajectory. At the technical level, identifying the
contraction metric, which is the distance metric with respect to which a
robot's trajectories exhibit contraction, is often non-trivial. We propose to
jointly learn the control policy and its corresponding contraction metric while
enforcing contraction. To achieve this, we learn an implicit dynamics model of
the robotic system from an offline data set consisting of the robot's state and
input trajectories. Using this learned dynamics model, we propose a data
augmentation algorithm for learning contraction policies. We randomly generate
samples in the state-space and propagate them forward in time through the
learned dynamics model to generate auxiliary sample trajectories. We then learn
both the control policy and the contraction metric such that the distance
between the trajectories from the offline data set and our generated auxiliary
sample trajectories decreases over time. We evaluate the performance of our
proposed framework on simulated robotic goal-reaching tasks and demonstrate
that enforcing contraction results in faster convergence and greater robustness
of the learned policy.
Related papers
- Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation [50.01551945190676]
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning.
We propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures.
We demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation.
arXiv Detail & Related papers (2024-01-22T18:58:22Z)
- Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows [58.762959061522736]
Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions.
We build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model.
We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms.
arXiv Detail & Related papers (2022-11-20T21:57:10Z)
- Sample Efficient Dynamics Learning for Symmetrical Legged Robots: Leveraging Physics Invariance and Geometric Symmetries [14.848950116410231]
This paper proposes a novel approach for learning dynamics that leverages the symmetry of the underlying robotic system.
Existing frameworks that represent all data in vector space fail to consider the structured information of the robot.
arXiv Detail & Related papers (2022-10-13T19:57:46Z)
- Learning Policies for Continuous Control via Transition Models [2.831332389089239]
In robot control, moving an arm's end-effector to a target position or along a target trajectory requires accurate forward and inverse models.
We show that by learning the transition (forward) model from interaction, we can use it to drive the learning of an amortized policy.
arXiv Detail & Related papers (2022-09-16T16:23:48Z)
- Estimating Link Flows in Road Networks with Synthetic Trajectory Data Generation: Reinforcement Learning-based Approaches [7.369475193451259]
This paper addresses the problem of estimating link flows in a road network by combining limited traffic volume and vehicle trajectory data.
We propose a novel generative modelling framework, where we formulate the link-to-link movements of a vehicle as a sequential decision-making problem.
To ensure the generated population vehicle trajectories are consistent with the observed traffic volume and trajectory data, two methods based on Inverse Reinforcement Learning and Constrained Reinforcement Learning are proposed.
arXiv Detail & Related papers (2022-06-26T13:14:52Z)
- Learning Interactive Driving Policies via Data-driven Simulation [125.97811179463542]
Data-driven simulators promise high data-efficiency for driving policy learning.
Small underlying datasets often lack interesting and challenging edge cases for learning interactive driving.
We propose a simulation method that uses in-painted ado vehicles for learning robust driving policies.
arXiv Detail & Related papers (2021-11-23T20:14:02Z)
- Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics [29.219095364935885]
Offline reinforcement learning leverages large datasets to train policies without interactions with the environment.
Current algorithms overfit to the training dataset and perform poorly when deployed to out-of-distribution generalizations of the environment.
We learn a Koopman latent representation which allows us to infer symmetries of the system's underlying dynamics.
We empirically evaluate our method on several benchmark offline reinforcement learning tasks and datasets including D4RL, Metaworld and Robosuite.
arXiv Detail & Related papers (2021-11-02T04:32:18Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Congestion-aware Multi-agent Trajectory Prediction for Collision Avoidance [110.63037190641414]
We propose to learn congestion patterns explicitly and devise a novel "Sense--Learn--Reason--Predict" framework.
By decomposing the learning phases into two stages, a "student" can learn contextual cues from a "teacher" while generating collision-free trajectories.
In experiments, we demonstrate that the proposed model is able to generate collision-free trajectory predictions in a synthetic dataset.
arXiv Detail & Related papers (2021-03-26T02:42:33Z)
- PLAS: Latent Action Space for Offline Reinforcement Learning [18.63424441772675]
The goal of offline reinforcement learning is to learn a policy from a fixed dataset, without further interactions with the environment.
Existing off-policy algorithms have limited performance on static datasets due to extrapolation errors from out-of-distribution actions.
We demonstrate that our method provides competitive performance consistently across various continuous control tasks and different types of datasets.
arXiv Detail & Related papers (2020-11-14T03:38:38Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our method is validated on complex quadruped robot dynamics, and the approach can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)