Novel-view Synthesis and Pose Estimation for Hand-Object Interaction
from Sparse Views
- URL: http://arxiv.org/abs/2308.11198v1
- Date: Tue, 22 Aug 2023 05:17:41 GMT
- Title: Novel-view Synthesis and Pose Estimation for Hand-Object Interaction
from Sparse Views
- Authors: Wentian Qu, Zhaopeng Cui, Yinda Zhang, Chenyu Meng, Cuixia Ma,
Xiaoming Deng, Hongan Wang
- Abstract summary: We propose a neural rendering and pose estimation system for hand-object interaction from sparse views.
We first learn the shape and appearance prior knowledge of hands and objects separately with the neural representation.
During the online stage, we design a rendering-based joint model fitting framework to understand the dynamic hand-object interaction.
- Score: 41.50710846018882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hand-object interaction understanding and the barely addressed novel view
synthesis are highly desired in the immersive communication, whereas it is
challenging due to the high deformation of hand and heavy occlusions between
hand and object. In this paper, we propose a neural rendering and pose
estimation system for hand-object interaction from sparse views, which can also
enable 3D hand-object interaction editing. We share the inspiration from recent
scene understanding work that shows a scene specific model built beforehand can
significantly improve and unblock vision tasks especially when inputs are
sparse, and extend it to the dynamic hand-object interaction scenario and
propose to solve the problem in two stages. We first learn the shape and
appearance prior knowledge of hands and objects separately with the neural
representation at the offline stage. During the online stage, we design a
rendering-based joint model fitting framework to understand the dynamic
hand-object interaction with the pre-built hand and object models as well as
interaction priors, which thereby overcomes penetration and separation issues
between hand and object and also enables novel view synthesis. In order to get
stable contact during the hand-object interaction process in a sequence, we
propose a stable contact loss to make the contact region to be consistent.
Experiments demonstrate that our method outperforms the state-of-the-art
methods. Code and dataset are available in project webpage
https://iscas3dv.github.io/HO-NeRF.
Related papers
- ManiDext: Hand-Object Manipulation Synthesis via Continuous Correspondence Embeddings and Residual-Guided Diffusion [36.9457697304841]
ManiDext is a unified hierarchical diffusion-based framework for generating hand manipulation and grasp poses.
Our key insight is that accurately modeling the contact correspondences between objects and hands during interactions is crucial.
Our framework first generates contact maps and correspondence embeddings on the object's surface.
Based on these fine-grained correspondences, we introduce a novel approach that integrates the iterative refinement process into the diffusion process.
arXiv Detail & Related papers (2024-09-14T04:28:44Z) - DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions [15.417836855005087]
We propose DiffH2O, a novel method to synthesize realistic, one or two-handed object interactions.
We decompose the task into a grasping stage and a text-based interaction stage.
In the grasping stage, the model only generates hand motions, whereas in the interaction phase both hand and object poses are synthesized.
arXiv Detail & Related papers (2024-03-26T16:06:42Z) - NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of
Hand-Object Interaction [19.957593804898064]
We present a novel free-point rendering framework, Neural Contact Radiance Field ( NCRF), to reconstruct hand-object interactions from a sparse set of videos.
We jointly learn these key components where they mutually help and regularize each other with visual and geometric constraints.
Our approach outperforms the current state-of-the-art in terms of both rendering quality and pose estimation accuracy.
arXiv Detail & Related papers (2024-02-08T10:09:12Z) - Hand-Centric Motion Refinement for 3D Hand-Object Interaction via
Hierarchical Spatial-Temporal Modeling [18.128376292350836]
We propose a data-driven method for coarse hand motion refinement.
First, we design a hand-centric representation to describe the dynamic spatial-temporal relation between hands and objects.
Second, to capture the dynamic clues of hand-object interaction, we propose a new architecture.
arXiv Detail & Related papers (2024-01-29T09:17:51Z) - Controllable Human-Object Interaction Synthesis [77.56877961681462]
We propose Controllable Human-Object Interaction Synthesis (CHOIS) to generate synchronized object motion and human motion in 3D scenes.
Here, language descriptions inform style and intent, and waypoints, which can be effectively extracted from high-level planning, ground the motion in the scene.
Our module seamlessly integrates with a path planning module, enabling the generation of long-term interactions in 3D environments.
arXiv Detail & Related papers (2023-12-06T21:14:20Z) - GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency [57.9920824261925]
Hands are dexterous and highly versatile manipulators that are central to how humans interact with objects and their environment.
modeling realistic hand-object interactions is critical for applications in computer graphics, computer vision, and mixed reality.
GRIP is a learning-based method that takes as input the 3D motion of the body and the object, and synthesizes realistic motion for both hands before, during, and after object interaction.
arXiv Detail & Related papers (2023-08-22T17:59:51Z) - Learning Explicit Contact for Implicit Reconstruction of Hand-held
Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space.
arXiv Detail & Related papers (2023-05-31T17:59:26Z) - HMDO: Markerless Multi-view Hand Manipulation Capture with Deformable
Objects [8.711239906965893]
HMDO is the first markerless deformable interaction dataset recording interactive motions of the hands and deformable objects.
The proposed method can reconstruct interactive motions of hands and deformable objects with high quality.
arXiv Detail & Related papers (2023-01-18T16:55:15Z) - Collaborative Learning for Hand and Object Reconstruction with
Attention-guided Graph Convolution [49.10497573378427]
Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality.
Our algorithm is optimisation to object models, and it learns the physical rules governing hand-object interaction.
Experiments using four widely-used benchmarks show that our framework achieves beyond state-of-the-art accuracy in 3D pose estimation, as well as recovers dense 3D hand and object shapes.
arXiv Detail & Related papers (2022-04-27T17:00:54Z) - Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects.
We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model.
This work takes a step on dynamics modeling in hand-object interactions from dense tactile sensing.
arXiv Detail & Related papers (2021-09-09T16:04:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.