Redefining Data Pairing for Motion Retargeting Leveraging a Human Body Prior
- URL: http://arxiv.org/abs/2409.13208v3
- Date: Tue, 1 Oct 2024 06:42:29 GMT
- Title: Redefining Data Pairing for Motion Retargeting Leveraging a Human Body Prior
- Authors: Xiyana Figuera, Soogeun Park, Hyemin Ahn,
- Abstract summary: MR HuBo(Motion Retargeting leveraging a HUman BOdy prior) is a cost-effective and convenient method to collect high-quality upper body paired robot, human> pose data.
We also present a two-stage motion neural network that can be trained via supervised learning on a large amount of paired data.
- Score: 4.5409191511532505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose MR HuBo(Motion Retargeting leveraging a HUman BOdy prior), a cost-effective and convenient method to collect high-quality upper body paired <robot, human> pose data, which is essential for data-driven motion retargeting methods. Unlike existing approaches which collect <robot, human> pose data by converting human MoCap poses into robot poses, our method goes in reverse. We first sample diverse random robot poses, and then convert them into human poses. However, since random robot poses can result in extreme and infeasible human poses, we propose an additional technique to sort out extreme poses by exploiting a human body prior trained from a large amount of human pose data. Our data collection method can be used for any humanoid robots, if one designs or optimizes the system's hyperparameters which include a size scale factor and the joint angle ranges for sampling. In addition to this data collection method, we also present a two-stage motion retargeting neural network that can be trained via supervised learning on a large amount of paired data. Compared to other learning-based methods trained via unsupervised learning, we found that our deep neural network trained with ample high-quality paired data achieved notable performance. Our experiments also show that our data filtering method yields better retargeting results than training the model with raw and noisy data. Our code and video results are available on https://sites.google.com/view/mr-hubo/
Related papers
- Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation [16.809190349155525]
Recent works have turned to large-scale pre-training using human data.
morphological differences between humans and robots introduce a significant human-robot domain discrepancy.
We propose a novel adaptation paradigm that utilizes readily available paired human-robot video data to bridge the discrepancy.
arXiv Detail & Related papers (2024-06-20T11:57:46Z) - Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z) - Scaling Robot Learning with Semantically Imagined Experience [21.361979238427722]
Recent advances in robot learning have shown promise in enabling robots to perform manipulation tasks.
One of the key contributing factors to this progress is the scale of robot data used to train the models.
We propose an alternative route and leverage text-to-image foundation models widely used in computer vision and natural language processing.
arXiv Detail & Related papers (2023-02-22T18:47:51Z) - Automatically Prepare Training Data for YOLO Using Robotic In-Hand
Observation and Synthesis [14.034128227585143]
We propose combining robotic in-hand observation and data synthesis to enlarge the limited data set collected by the robot.
The collected and synthetic images are combined to train a deep detection neural network.
The results showed that combined observation and synthetic images led to comparable performance to manual data preparation.
arXiv Detail & Related papers (2023-01-04T04:20:08Z) - Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
arXiv Detail & Related papers (2022-11-16T16:26:48Z) - Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account is not a part of the human-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
arXiv Detail & Related papers (2022-03-31T23:08:20Z) - Where is my hand? Deep hand segmentation for visual self-recognition in
humanoid robots [129.46920552019247]
We propose the use of a Convolution Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z) - Cascaded deep monocular 3D human pose estimation with evolutionary
training data [76.3478675752847]
Deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation.
This paper proposes a novel data augmentation method that is scalable for massive amount of training data.
Our method synthesizes unseen 3D human skeletons based on a hierarchical human representation and synthesizings inspired by prior knowledge.
arXiv Detail & Related papers (2020-06-14T03:09:52Z) - Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.