Stability-driven Contact Reconstruction From Monocular Color Images
- URL: http://arxiv.org/abs/2205.00848v1
- Date: Mon, 2 May 2022 12:23:06 GMT
- Title: Stability-driven Contact Reconstruction From Monocular Color Images
- Authors: Zimeng Zhao, Binghui Zuo, Wei Xie, Yangang Wang
- Abstract summary: Physical contact provides additional constraints for hand-object state reconstruction.
Existing methods optimize hand-object contact driven by a distance threshold or by priors from contact-labeled datasets.
Our key idea is to reconstruct the contact pattern directly from monocular images, and then utilize the physical stability criterion in the simulation to optimize it.
- Score: 7.427212296770506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physical contact provides additional constraints for hand-object state
reconstruction as well as a basis for further understanding of interaction
affordances. Estimating these severely occluded regions from monocular images
presents a considerable challenge. Existing methods optimize hand-object
contact driven by a distance threshold or by priors from contact-labeled
datasets. However, because these indoor datasets involve only a limited number
of subjects and objects, the learned contact patterns do not generalize
easily. Our key idea is to reconstruct the contact pattern directly from
easily. Our key idea is to reconstruct the contact pattern directly from
monocular images, and then utilize the physical stability criterion in the
simulation to optimize it. This criterion is defined by the resultant forces
and contact distribution computed by the physics engine. Compared to existing
solutions, our framework can be adapted to more personalized hands and diverse
object shapes. Furthermore, an interaction dataset with extra physical
attributes is created to verify the sim-to-real consistency of our methods.
Through comprehensive evaluations, we show that the proposed framework
reconstructs hand-object contact both accurately and stably.
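The stability criterion described above suggests a straightforward simulation loop: fix the reconstructed hand, let the object evolve under gravity, and check whether the engine's contact forces hold it in place. Below is a minimal sketch of such a check in PyBullet; the mesh file names, mass, step count, and drift threshold are illustrative assumptions, not the paper's configuration.

```python
# A minimal sketch of a simulation-based stability check, assuming PyBullet.
# Mesh file names, mass, step count, and the drift threshold are illustrative
# assumptions, not values from the paper.
import numpy as np
import pybullet as p

def is_grasp_stable(hand_mesh="hand.obj", object_mesh="object.obj",
                    steps=120, max_drift=0.02):
    p.connect(p.DIRECT)
    p.setGravity(0.0, 0.0, -9.8)

    # Static hand (mass 0) holding a dynamic object in the reconstructed pose.
    hand_shape = p.createCollisionShape(p.GEOM_MESH, fileName=hand_mesh)
    hand_id = p.createMultiBody(baseMass=0.0, baseCollisionShapeIndex=hand_shape)
    obj_shape = p.createCollisionShape(p.GEOM_MESH, fileName=object_mesh)
    obj_id = p.createMultiBody(baseMass=0.1, baseCollisionShapeIndex=obj_shape)

    start_pos, _ = p.getBasePositionAndOrientation(obj_id)
    for _ in range(steps):
        p.stepSimulation()
    end_pos, _ = p.getBasePositionAndOrientation(obj_id)

    # Resultant force: sum the per-contact normal forces reported by the engine.
    contacts = p.getContactPoints(hand_id, obj_id)
    resultant = np.zeros(3)
    for c in contacts:
        resultant += np.array(c[7]) * c[9]  # contact normal on object * force magnitude

    drift = np.linalg.norm(np.array(end_pos) - np.array(start_pos))
    p.disconnect()
    # Stable if the object barely moves while the contacts support it.
    return drift < max_drift, resultant, len(contacts)
```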
Related papers
- Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation [52.36691633451968]
ViTaM-D is a visual-tactile framework for dynamic hand-object interaction reconstruction.
DF-Field is a distributed force-aware contact representation model.
Our results highlight the superior performance of ViTaM-D in both rigid and deformable object reconstruction.
arXiv Detail & Related papers (2024-11-14T16:29:45Z)
- NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction [19.957593804898064]
We present a novel free-viewpoint rendering framework, Neural Contact Radiance Field (NCRF), to reconstruct hand-object interactions from a sparse set of videos.
We jointly learn these key components where they mutually help and regularize each other with visual and geometric constraints.
Our approach outperforms the current state-of-the-art in terms of both rendering quality and pose estimation accuracy.
arXiv Detail & Related papers (2024-02-08T10:09:12Z) - DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via
Physics Simulation [81.11585774044848]
We present DeepSimHO, a novel deep-learning pipeline that combines forward physics simulation and backward gradient approximation with a neural network.
Our method noticeably improves the stability of the estimation and achieves superior efficiency over test-time optimization.
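The forward-simulation / backward-approximation pattern can be sketched as a custom autograd operator: the forward pass queries a non-differentiable physics rollout for a stability loss, and the backward pass substitutes gradients predicted by a small network. A hedged PyTorch sketch, where `run_simulator` and `grad_net` are hypothetical placeholders rather than the DeepSimHO components:

```python
# A minimal sketch of the forward-simulation / learned-backward pattern in
# PyTorch. `run_simulator` (a black-box physics rollout returning a scalar
# stability loss) and `grad_net` (a network trained to mimic simulator
# sensitivities) are hypothetical placeholders, not the paper's implementation.
import torch

class SimStability(torch.autograd.Function):
    @staticmethod
    def forward(ctx, pose, run_simulator, grad_net):
        ctx.save_for_backward(pose)
        ctx.grad_net = grad_net
        # Non-differentiable forward physics: returns a scalar stability loss.
        loss = run_simulator(pose.detach().cpu().numpy())
        return pose.new_tensor(loss)

    @staticmethod
    def backward(ctx, grad_output):
        (pose,) = ctx.saved_tensors
        # Backward gradient approximation: a neural network predicts
        # d(stability loss)/d(pose) in place of true simulator gradients.
        approx_grad = ctx.grad_net(pose)
        return grad_output * approx_grad, None, None
```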
arXiv Detail & Related papers (2023-10-11T05:34:36Z) - Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction
on Monocular RGB Video [104.69686024776396]
Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many interfering factors.
Previous works only leverage information from a single RGB image, without modeling the physically plausible relation between the interacting hands.
In this work, we are dedicated to explicitly exploiting spatial-temporal information to achieve better interacting hand reconstruction.
arXiv Detail & Related papers (2023-08-08T06:16:37Z) - Learning Explicit Contact for Implicit Reconstruction of Hand-held
Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space.
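One simple way to realize such diffusion is distance-weighted propagation: each 3D query point aggregates the contact probabilities of its nearest hand-mesh vertices. A minimal sketch under that assumption (the Gaussian bandwidth `sigma` and neighborhood size `k` are illustrative, not the paper's values):

```python
# A minimal sketch of diffusing per-vertex contact states into nearby 3D
# space by distance-weighted aggregation. The Gaussian bandwidth `sigma`
# and neighborhood size `k` are illustrative choices, not the paper's.
import torch

def diffuse_contact(query_pts, hand_verts, vert_contact, k=8, sigma=0.01):
    """query_pts: (Q, 3), hand_verts: (V, 3), vert_contact: (V,) in [0, 1]."""
    dists = torch.cdist(query_pts, hand_verts)        # (Q, V) pairwise distances
    knn_d, knn_i = dists.topk(k, largest=False)       # k nearest hand vertices
    w = torch.exp(-knn_d ** 2 / (2 * sigma ** 2))     # Gaussian falloff with distance
    w = w / w.sum(dim=1, keepdim=True).clamp(min=1e-8)
    # Each query point inherits a weighted mix of nearby contact probabilities.
    return (w * vert_contact[knn_i]).sum(dim=1)       # (Q,)
```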
arXiv Detail & Related papers (2023-05-31T17:59:26Z) - Integrated Object Deformation and Contact Patch Estimation from
Visuo-Tactile Feedback [8.420670642409219]
We propose a representation that jointly models object deformations and contact patches from visuo-tactile feedback.
We propose a neural network architecture to learn an NDCF and train it using simulated data.
We demonstrate that the learned NDCF transfers directly to the real-world without the need for fine-tuning.
arXiv Detail & Related papers (2023-05-23T18:53:24Z) - Visual-Tactile Sensing for In-Hand Object Reconstruction [38.42487660352112]
We propose a visual-tactile in-hand object reconstruction framework, VTacO, and extend it to VTacOH for hand-object reconstruction.
A simulation environment, VT-Sim, supports generating hand-object interaction for both rigid and deformable objects.
arXiv Detail & Related papers (2023-03-25T15:16:31Z)
- Physical Interaction: Reconstructing Hand-object Interactions with Physics [17.90852804328213]
The paper proposes a physics-based method to better solve the ambiguities in the reconstruction.
It first proposes a force-based dynamic model of the in-hand object, which recovers the unobserved contacts and also solves for plausible contact forces.
Experiments show that the proposed technique reconstructs both physically plausible and more accurate hand-object interaction.
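Solving for plausible contact forces can be illustrated as a small inverse-dynamics problem: find non-negative normal force magnitudes that, together with gravity, explain the object's observed acceleration. The NumPy sketch below ignores friction and torque balance, so it is a deliberate simplification of any full force model:

```python
# A minimal least-squares sketch of recovering plausible contact forces:
# find non-negative magnitudes f_i with sum_i f_i * n_i + m*g ≈ m*a.
# Friction and torque balance are deliberately ignored; this is a
# simplification for illustration, not the paper's dynamic model.
import numpy as np
from scipy.optimize import nnls

def solve_contact_forces(normals, mass, accel, g=(0.0, 0.0, -9.8)):
    """normals: (C, 3) unit contact normals; accel: (3,) observed acceleration."""
    target = mass * (np.asarray(accel) - np.asarray(g))  # force the contacts must supply
    # Columns of the system matrix are the contact normals; solve for f >= 0.
    f, residual = nnls(np.asarray(normals).T, target)
    return f, residual
```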
arXiv Detail & Related papers (2022-09-22T07:41:31Z)
- Physically Plausible Pose Refinement using Fully Differentiable Forces [68.8204255655161]
We propose an end-to-end differentiable model that refines pose estimates by learning the forces experienced by the object.
By matching the learned net force to an estimate of net force based on finite differences of position, this model is able to find forces that accurately describe the movement of the object.
We show this model successfully corrects poses and finds contact maps that better match the ground truth, despite not using any RGB or depth image data.
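The finite-difference estimate referred to here is the discrete second derivative of position: F_net ≈ m (x_{t+1} - 2 x_t + x_{t-1}) / Δt². A minimal PyTorch sketch of the matching loss, with illustrative tensor shapes:

```python
# A minimal sketch of matching a learned net force to the finite-difference
# estimate F ≈ m * (x[t+1] - 2*x[t] + x[t-1]) / dt**2. Tensor shapes and the
# mass handling are illustrative assumptions.
import torch

def net_force_loss(pred_force, positions, mass, dt):
    """pred_force: (T-2, 3) learned forces; positions: (T, 3) object trajectory."""
    accel = (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt ** 2
    fd_force = mass * accel                     # Newton's second law, discretized
    return torch.mean((pred_force - fd_force) ** 2)
```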
arXiv Detail & Related papers (2021-05-17T23:33:04Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
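Photometric consistency means that a surface point projected into two nearby frames should have a similar color. A minimal sketch of such a loss, sampling image colors at projected vertex locations with bilinear interpolation; projection and visibility handling are omitted simplifications:

```python
# A minimal sketch of a photometric-consistency loss between two frames:
# the same mesh vertices, projected into each frame, should sample similar
# colors. `uv_t`/`uv_s` are projected vertex locations in [-1, 1] normalized
# image coordinates; occlusion/visibility handling is omitted for brevity.
import torch
import torch.nn.functional as F

def photometric_loss(img_t, img_s, uv_t, uv_s):
    """img_*: (1, 3, H, W) frames; uv_*: (N, 2) normalized vertex projections."""
    grid_t = uv_t.view(1, 1, -1, 2)             # grid_sample expects (B, Ho, Wo, 2)
    grid_s = uv_s.view(1, 1, -1, 2)
    col_t = F.grid_sample(img_t, grid_t, align_corners=True)  # (1, 3, 1, N)
    col_s = F.grid_sample(img_s, grid_s, align_corners=True)
    # Penalize color differences of corresponding surface points across time.
    return (col_t - col_s).abs().mean()
```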
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.