Learning High-DOF Reaching-and-Grasping via Dynamic Representation of
Gripper-Object Interaction
- URL: http://arxiv.org/abs/2204.13998v1
- Date: Sun, 3 Apr 2022 07:03:54 GMT
- Title: Learning High-DOF Reaching-and-Grasping via Dynamic Representation of
Gripper-Object Interaction
- Authors: Qijin She, Ruizhen Hu, Juzhan Xu, Min Liu, Kai Xu, and Hui Huang
- Abstract summary: We propose an effective representation of the grasping state that characterizes the spatial interaction between the gripper and the target object.
IBS is surprisingly effective as a state representation since it informs the fine-grained control of each finger through its spatial relation to the target object.
Experiments show that it generates high-quality dexterous grasps for complex shapes with smooth grasping motions.
- Score: 21.03434784990944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We approach the problem of high-DOF reaching-and-grasping via learning joint
planning of grasp and motion with deep reinforcement learning. To resolve the
sample efficiency issue in learning the high-dimensional and complex control of
dexterous grasping, we propose an effective representation of the grasping
state characterizing the spatial interaction between the gripper and the
target object. To represent gripper-object interaction, we adopt the
Interaction Bisector Surface (IBS), which is the Voronoi diagram between two
nearby 3D geometric objects and has been successfully applied to
characterizing spatial relations between 3D objects. We found that IBS is
surprisingly effective as a state representation since it informs the
fine-grained control of each finger through its spatial relation to the target
object. This novel grasp representation, together with several technical
contributions including a fast IBS approximation, a novel vector-based reward,
and an effective training strategy, facilitates learning a strong control
model for high-DOF grasping with good sample efficiency, dynamic adaptability,
and cross-category generality. Experiments show that it generates high-quality
dexterous grasps for complex shapes with smooth grasping motions.
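Since the IBS is by definition the set of points equidistant from the gripper and the object, its role as a state representation can be made concrete with a toy approximation. The Python sketch below is a minimal illustration under that definition, not the paper's fast approximation: gripper_pts and object_pts are hypothetical (N, 3) surface samples, and sampled points whose nearest-neighbor distances to the two clouds are nearly equal are kept.

```python
# Minimal sketch of an IBS-style approximation (not the paper's method):
# keep sampled points that are (nearly) equidistant to gripper and object.
import numpy as np
from scipy.spatial import cKDTree

def approximate_ibs(gripper_pts, object_pts, n_samples=100_000, tol=1e-2):
    """Return sample points roughly equidistant to two point clouds.

    gripper_pts, object_pts: (N, 3) arrays sampled from the two surfaces.
    tol: maximum allowed |d_gripper - d_object| for a point to be kept.
    """
    tree_g = cKDTree(gripper_pts)
    tree_o = cKDTree(object_pts)

    # Sample uniformly inside the joint bounding box of both shapes.
    pts = np.vstack([gripper_pts, object_pts])
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    samples = np.random.uniform(lo, hi, size=(n_samples, 3))

    d_g, _ = tree_g.query(samples)  # distance to nearest gripper point
    d_o, _ = tree_o.query(samples)  # distance to nearest object point
    mask = np.abs(d_g - d_o) < tol  # near the bisector surface
    return samples[mask]
```

A rejection filter like this is far too slow and noisy for reinforcement learning; the fast IBS approximation is one of the paper's stated technical contributions, so the sketch only serves to make the definition concrete.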
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaboratively guided affordance learning network that incorporates both visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- GEARS: Local Geometry-aware Hand-object Interaction Synthesis [38.75942505771009]
We introduce a novel joint-centered sensor designed to reason about local object geometry near potential interaction regions.
To mitigate learning complexity, we transform the points from the global frame to a template hand frame and use a shared module to process the sensor features of each individual joint (a minimal sketch of this frame change follows this entry).
This is followed by a perceptual-temporal transformer network aimed at capturing correlation among the joints in different dimensions.
arXiv Detail & Related papers (2024-04-02T09:18:52Z)
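On the frame change described above: expressing world-frame points in a joint-local frame is a standard rigid transform. The sketch below is a generic illustration with hypothetical joint_rot and joint_pos placeholders, not GEARS code.

```python
# Generic sketch: express world-frame points in a joint-local frame.
# joint_rot (3x3) and joint_pos (3,) are hypothetical placeholders for a
# joint's world-frame orientation and position.
import numpy as np

def to_joint_frame(points, joint_rot, joint_pos):
    """Map (N, 3) world-frame points into the joint's local frame."""
    # Inverse rigid transform: p_local = R^T @ (p_world - t),
    # written row-wise as (p - t) @ R.
    return (points - joint_pos) @ joint_rot

# Example: points near a joint at (0.1, 0.0, 0.2) with identity rotation.
pts_world = np.array([[0.1, 0.0, 0.3], [0.2, 0.1, 0.2]])
print(to_joint_frame(pts_world, np.eye(3), np.array([0.1, 0.0, 0.2])))
```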
- Gaze-guided Hand-Object Interaction Synthesis: Dataset and Method [63.49140028965778]
We present GazeHOI, the first dataset to capture the simultaneous 3D modeling of gaze, hand, and object interactions.
To tackle the challenges of gaze-guided synthesis, we propose a stacked gaze-guided hand-object interaction diffusion model, named GHO-Diffusion.
We also introduce HOI-Manifold Guidance during the sampling stage of GHO-Diffusion, enabling fine-grained control over generated motions.
arXiv Detail & Related papers (2024-03-24T14:24:13Z)
- InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion [29.25063155767897]
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs).
Our task is significantly more challenging, as it requires modeling dynamic objects with various shapes, capturing whole-body motion, and ensuring physically valid interactions.
Experiments on multiple human-object interaction datasets demonstrate the effectiveness of our method for this task, capable of producing realistic, vivid, and remarkably long-term 3D HOI predictions.
arXiv Detail & Related papers (2023-08-31T17:59:08Z)
- Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling [13.284947022380404]
We propose a two-stage framework that can obtain accurate and smooth full-body motions with three tracking signals of head and hands only.
Our framework explicitly models joint-level features in the first stage and uses them as temporal tokens for alternating spatial and temporal transformer blocks to capture joint-level correlations in the second stage (see the sketch after this entry).
With extensive experiments on the AMASS motion dataset and real-captured data, we show our proposed method can achieve more accurate and smooth motion compared to existing approaches.
arXiv Detail & Related papers (2023-08-17T08:27:55Z)
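On the alternating spatial/temporal attention mentioned above: a common way to realize it is to run self-attention over the joint axis and the time axis in turn. The PyTorch sketch below is a generic illustration of that pattern under assumed tensor shapes, not the paper's model.

```python
# Generic sketch of alternating spatial/temporal attention over joint tokens.
# Assumed shape: x is (batch, time, joints, dim).
import torch
import torch.nn as nn

class AlternatingBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        b, t, j, d = x.shape
        # Spatial pass: attend across joints within each frame.
        s = x.reshape(b * t, j, d)
        s = s + self.spatial(s, s, s)[0]
        x = s.reshape(b, t, j, d)
        # Temporal pass: attend across time for each joint.
        m = x.permute(0, 2, 1, 3).reshape(b * j, t, d)
        m = m + self.temporal(m, m, m)[0]
        return m.reshape(b, j, t, d).permute(0, 2, 1, 3)

x = torch.randn(2, 16, 22, 64)  # e.g. 16 frames of 22 body joints
print(AlternatingBlock()(x).shape)  # torch.Size([2, 16, 22, 64])
```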
- Learning Scene Flow With Skeleton Guidance For 3D Action Recognition [1.5954459915735735]
This work demonstrates the use of 3D flow sequences with a deep temporal model for 3D action recognition.
An extended deep skeleton model is also introduced to learn the most discriminative action motion dynamics.
A late fusion scheme is adopted between the two models for learning the high-level cross-modal correlations.
arXiv Detail & Related papers (2023-06-23T04:14:25Z)
- Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed Ret3D.
At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules.
With negligible extra overhead, Ret3D achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z)
- Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution [49.10497573378427]
Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality.
Our algorithm is agnostic to object models, and it learns the physical rules governing hand-object interaction.
Experiments using four widely used benchmarks show that our framework surpasses state-of-the-art accuracy in 3D pose estimation and recovers dense 3D hand and object shapes.
arXiv Detail & Related papers (2022-04-27T17:00:54Z)
- Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion [78.98074380040838]
We propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.
We employ an auxiliary depth estimation module to augment the input RGB image with an estimated depth map (a minimal sketch of this augmentation follows this entry).
Our approach significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
arXiv Detail & Related papers (2020-06-28T09:50:25Z)
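On the depth-augmentation step noted above: a common pattern is to predict a one-channel depth map from the RGB input and concatenate it as a fourth channel. The sketch below illustrates that pattern with a hypothetical depth_net stand-in; it is not the paper's architecture.

```python
# Generic sketch: augment an RGB image with an estimated depth channel.
# depth_net is a hypothetical stand-in for any monocular depth estimator.
import torch
import torch.nn as nn

depth_net = nn.Sequential(          # placeholder depth estimator
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)

def augment_with_depth(rgb):
    """rgb: (batch, 3, H, W) -> (batch, 4, H, W) RGB-D tensor."""
    depth = depth_net(rgb)                  # (batch, 1, H, W) estimate
    return torch.cat([rgb, depth], dim=1)   # concatenate as 4th channel

x = torch.randn(2, 3, 128, 128)
print(augment_with_depth(x).shape)  # torch.Size([2, 4, 128, 128])
```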