Learning from Sparse Demonstrations
- URL: http://arxiv.org/abs/2008.02159v3
- Date: Mon, 8 Aug 2022 21:51:08 GMT
- Title: Learning from Sparse Demonstrations
- Authors: Wanxin Jin, Todd D. Murphey, Dana Kulić, Neta Ezer, Shaoshuai Mou
- Abstract summary: The paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few demonstrated examples.
The method finds an objective function and a time-warping function such that the robot's resulting trajectory sequentially follows the demonstrated keyframes with minimal discrepancy loss.
The method is first evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to learn an objective function for motion planning in unmodeled environments.
- Score: 17.24236148404065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops the method of Continuous Pontryagin Differentiable
Programming (Continuous PDP), which enables a robot to learn an objective
function from a few sparsely demonstrated keyframes. The keyframes, labeled
with some time stamps, are the desired task-space outputs, which a robot is
expected to follow sequentially. The time stamps of the keyframes can be
different from the time of the robot's actual execution. The method jointly
finds an objective function and a time-warping function such that the robot's
resulting trajectory sequentially follows the keyframes with minimal
discrepancy loss. The Continuous PDP minimizes the discrepancy loss using
projected gradient descent, by efficiently solving the gradient of the robot
trajectory with respect to the unknown parameters. The method is first
evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to
learn an objective function for motion planning in unmodeled environments. The
results show the efficiency of the method, its ability to handle time
misalignment between keyframes and robot execution, and the generalization of
objective learning into unseen motion conditions.
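As a concrete illustration of the optimization described above, the following is a minimal sketch of the outer learning loop: projected gradient descent over the unknown objective weights and time-warping parameters, driven by the keyframe discrepancy loss. This is not the authors' implementation: the trajectory solver `solve_oc`, the polynomial warp, and the nonnegativity projection are illustrative assumptions, and a finite-difference gradient stands in for the analytical trajectory gradient that Continuous PDP derives from Pontryagin's conditions.

```python
import numpy as np

def warp(tau, beta):
    # Monotone time-warping: maps a keyframe time stamp tau to robot
    # execution time t. A quadratic with nonnegative coefficients is
    # increasing for tau >= 0 (an assumption of this sketch).
    return beta[0] * tau + beta[1] * tau**2

def loss(params, keyframes, solve_oc):
    # Discrepancy between the robot's task-space output, evaluated at the
    # warped keyframe times, and the demonstrated keyframes.
    theta, beta = params[:-2], params[-2:]
    y = solve_oc(theta)  # hypothetical solver: returns a callable t -> output
    return sum(np.sum((y(warp(tau, beta)) - y_star) ** 2)
               for tau, y_star in keyframes)

def continuous_pdp(params0, keyframes, solve_oc, lr=1e-2, iters=200, eps=1e-5):
    params = np.asarray(params0, dtype=float)
    for _ in range(iters):
        # Finite-difference gradient for brevity; the paper instead
        # differentiates the optimal-control solution analytically.
        base = loss(params, keyframes, solve_oc)
        grad = np.zeros_like(params)
        for i in range(len(params)):
            p = params.copy()
            p[i] += eps
            grad[i] = (loss(p, keyframes, solve_oc) - base) / eps
        # Gradient step followed by projection onto the feasible set
        # (here, nonnegative weights and warp coefficients).
        params = np.maximum(params - lr * grad, 0.0)
    return params
```

The point the sketch preserves is that the keyframe time stamps enter the loss only through the warp, so time misalignment between the demonstration and the robot's execution is absorbed by the warp parameters rather than penalized.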
Related papers
- ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation [31.211870350260703]
Relational Keypoint Constraints (ReKep) is a visually-grounded representation for constraints in robotic manipulation.
We present system implementations on a wheeled single-arm platform and a stationary dual-arm platform.
arXiv Detail & Related papers (2024-09-03T06:45:22Z)
- Affordance-based Robot Manipulation with Flow Matching [6.863932324631107]
Our framework seamlessly unifies affordance model learning and trajectory generation with flow matching for robot manipulation.
Our evaluation highlights that the proposed prompt-tuning method for learning manipulation affordances with a language prompter achieves competitive performance.
arXiv Detail & Related papers (2024-09-02T09:11:28Z)
- RobotKeyframing: Learning Locomotion with High-Level Objectives via Mixture of Dense and Sparse Rewards [15.79235618199162]
This paper presents a novel learning-based control framework for legged robots.
It incorporates high-level objectives into natural locomotion.
It uses a multi-critic reinforcement learning algorithm to handle the mixture of dense and sparse rewards.
arXiv Detail & Related papers (2024-07-16T10:15:35Z)
- DiffGen: Robot Demonstration Generation via Differentiable Physics Simulation, Differentiable Rendering, and Vision-Language Model [72.66465487508556]
DiffGen is a novel framework that integrates differentiable physics simulation, differentiable rendering, and a vision-language model.
It can generate realistic robot demonstrations by minimizing the distance between the embedding of the language instruction and the embedding of the simulated observation.
Experiments demonstrate that DiffGen can efficiently and effectively generate robot data with minimal human effort or training time.
arXiv Detail & Related papers (2024-05-12T15:38:17Z)
- Unsupervised Learning of Effective Actions in Robotics [0.9374652839580183]
Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions.
We propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes".
We evaluate our method on a simulated stair-climbing reinforcement learning task.
arXiv Detail & Related papers (2024-04-03T13:28:52Z)
- Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
arXiv Detail & Related papers (2023-05-03T05:57:29Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve challenging simulated tasks, such as humanoid locomotion and stand-up, with unprecedented sample efficiency.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features, using temporal cues in videos and inherent correlations across modalities for gesture recognition.
Results show that our approach recovers performance with substantial gains, up to 12.91% in ACC and 20.16% in F1 score, without using any annotations on the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
- Online Body Schema Adaptation through Cost-Sensitive Active Learning [63.84207660737483]
The work was implemented in a simulation environment, using the 7DoF arm of the iCub robot simulator.
A cost-sensitive active learning approach is used to select optimal joint configurations.
The results show that cost-sensitive active learning achieves accuracy similar to the standard active learning approach while roughly halving the executed movement.
arXiv Detail & Related papers (2021-01-26T16:01:02Z)
- Pose Estimation for Robot Manipulators via Keypoint Optimization and Sim-to-Real Transfer [10.369766652751169]
Keypoint detection is an essential building block for many robotic applications.
Deep learning methods have the ability to detect user-defined keypoints in a marker-less manner.
We propose a new, autonomous way to define keypoint locations that overcomes the challenges of manual keypoint selection.
arXiv Detail & Related papers (2020-10-15T22:38:37Z)
- Thinking While Moving: Deep Reinforcement Learning with Concurrent Control [122.49572467292293]
We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system.
Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed.
arXiv Detail & Related papers (2020-04-13T17:49:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.