Stability-driven Contact Reconstruction From Monocular Color Images
- URL: http://arxiv.org/abs/2205.00848v1
- Date: Mon, 2 May 2022 12:23:06 GMT
- Title: Stability-driven Contact Reconstruction From Monocular Color Images
- Authors: Zimeng Zhao, Binghui Zuo, Wei Xie, Yangang Wang
- Abstract summary: Physical contact provides additional constraints for hand-object state reconstruction.
Existing methods optimize hand-object contact driven by a distance threshold or by priors from contact-labeled datasets.
Our key idea is to reconstruct the contact pattern directly from monocular images, and then utilize the physical stability criterion in the simulation to optimize it.
- Score: 7.427212296770506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physical contact provides additional constraints for hand-object state
reconstruction as well as a basis for further understanding of interaction
affordances. Estimating these severely occluded regions from monocular images
presents a considerable challenge. Existing methods optimize hand-object
contact driven by a distance threshold or by priors from contact-labeled
datasets. However, because these indoor datasets involve only a limited number
of subjects and objects, the learned contact patterns do not generalize
easily. Our key idea is to reconstruct the contact pattern directly from
easily. Our key idea is to reconstruct the contact pattern directly from
monocular images, and then utilize the physical stability criterion in the
simulation to optimize it. This criterion is defined by the resultant forces
and contact distribution computed by the physics engine. Compared to existing
solutions, our framework can be adapted to more personalized hands and diverse
object shapes. Furthermore, an interaction dataset with extra physical
attributes is created to verify the sim-to-real consistency of our methods.
Through comprehensive evaluations, we show that the proposed framework
reconstructs hand-object contact both accurately and stably.
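The stability criterion described above suggests a straightforward simulation loop: fix the reconstructed hand, let the object evolve under gravity, and check whether the engine's contact forces hold it in place. Below is a minimal sketch of such a check in PyBullet; the mesh file names, mass, step count, and drift threshold are illustrative assumptions, not the paper's configuration.

```python
# A minimal sketch of a simulation-based stability check, assuming PyBullet.
# Mesh file names, mass, step count, and the drift threshold are illustrative
# assumptions, not values from the paper.
import numpy as np
import pybullet as p

def is_grasp_stable(hand_mesh="hand.obj", object_mesh="object.obj",
                    steps=120, max_drift=0.02):
    p.connect(p.DIRECT)
    p.setGravity(0.0, 0.0, -9.8)

    # Static hand (mass 0) holding a dynamic object in the reconstructed pose.
    hand_shape = p.createCollisionShape(p.GEOM_MESH, fileName=hand_mesh)
    hand_id = p.createMultiBody(baseMass=0.0, baseCollisionShapeIndex=hand_shape)
    obj_shape = p.createCollisionShape(p.GEOM_MESH, fileName=object_mesh)
    obj_id = p.createMultiBody(baseMass=0.1, baseCollisionShapeIndex=obj_shape)

    start_pos, _ = p.getBasePositionAndOrientation(obj_id)
    for _ in range(steps):
        p.stepSimulation()
    end_pos, _ = p.getBasePositionAndOrientation(obj_id)

    # Resultant force: sum the per-contact normal forces reported by the engine.
    contacts = p.getContactPoints(hand_id, obj_id)
    resultant = np.zeros(3)
    for c in contacts:
        resultant += np.array(c[7]) * c[9]  # contact normal on object * force magnitude

    drift = np.linalg.norm(np.array(end_pos) - np.array(start_pos))
    p.disconnect()
    # Stable if the object barely moves while the contacts support it.
    return drift < max_drift, resultant, len(contacts)
```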
Related papers
- Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation [52.36691633451968]
ViTaM-D is a visual-tactile framework for dynamic hand-object interaction reconstruction.
DF-Field is a distributed force-aware contact representation model.
Our results highlight the superior performance of ViTaM-D in both rigid and deformable object reconstruction.
arXiv Detail & Related papers (2024-11-14T16:29:45Z)
- NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction [19.957593804898064]
We present a novel free-viewpoint rendering framework, Neural Contact Radiance Field (NCRF), to reconstruct hand-object interactions from a sparse set of videos.
We jointly learn these key components where they mutually help and regularize each other with visual and geometric constraints.
Our approach outperforms the current state-of-the-art in terms of both rendering quality and pose estimation accuracy.
arXiv Detail & Related papers (2024-02-08T10:09:12Z) - DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via
Physics Simulation [81.11585774044848]
We present DeepSimHO, a novel deep-learning pipeline that combines forward physics simulation and backward gradient approximation with a neural network.
Our method noticeably improves the stability of the estimation and achieves superior efficiency over test-time optimization.
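The forward-simulation / backward-approximation pattern can be sketched as a custom autograd operator: the forward pass queries a non-differentiable physics rollout for a stability loss, and the backward pass substitutes gradients predicted by a small network. A hedged PyTorch sketch, where `run_simulator` and `grad_net` are hypothetical placeholders rather than the DeepSimHO components:

```python
# A minimal sketch of the forward-simulation / learned-backward pattern in
# PyTorch. `run_simulator` (a black-box physics rollout returning a scalar
# stability loss) and `grad_net` (a network trained to mimic simulator
# sensitivities) are hypothetical placeholders, not the paper's implementation.
import torch

class SimStability(torch.autograd.Function):
    @staticmethod
    def forward(ctx, pose, run_simulator, grad_net):
        ctx.save_for_backward(pose)
        ctx.grad_net = grad_net
        # Non-differentiable forward physics: returns a scalar stability loss.
        loss = run_simulator(pose.detach().cpu().numpy())
        return pose.new_tensor(loss)

    @staticmethod
    def backward(ctx, grad_output):
        (pose,) = ctx.saved_tensors
        # Backward gradient approximation: a neural network predicts
        # d(stability loss)/d(pose) in place of true simulator gradients.
        approx_grad = ctx.grad_net(pose)
        return grad_output * approx_grad, None, None
```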
arXiv Detail & Related papers (2023-10-11T05:34:36Z) - Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction
on Monocular RGB Video [104.69686024776396]
Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many interfering factors.
Previous works only leverage information from a single RGB image, without modeling the physically plausible relation between the interacting hands.
In this work, we are dedicated to explicitly exploiting spatial-temporal information to achieve better interacting hand reconstruction.
arXiv Detail & Related papers (2023-08-08T06:16:37Z) - Learning Explicit Contact for Implicit Reconstruction of Hand-held
Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space.
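One simple way to realize such diffusion is distance-weighted propagation: each 3D query point aggregates the contact probabilities of its nearest hand-mesh vertices. A minimal sketch under that assumption (the Gaussian bandwidth `sigma` and neighborhood size `k` are illustrative, not the paper's values):

```python
# A minimal sketch of diffusing per-vertex contact states into nearby 3D
# space by distance-weighted aggregation. The Gaussian bandwidth `sigma`
# and neighborhood size `k` are illustrative choices, not the paper's.
import torch

def diffuse_contact(query_pts, hand_verts, vert_contact, k=8, sigma=0.01):
    """query_pts: (Q, 3), hand_verts: (V, 3), vert_contact: (V,) in [0, 1]."""
    dists = torch.cdist(query_pts, hand_verts)        # (Q, V) pairwise distances
    knn_d, knn_i = dists.topk(k, largest=False)       # k nearest hand vertices
    w = torch.exp(-knn_d ** 2 / (2 * sigma ** 2))     # Gaussian falloff with distance
    w = w / w.sum(dim=1, keepdim=True).clamp(min=1e-8)
    # Each query point inherits a weighted mix of nearby contact probabilities.
    return (w * vert_contact[knn_i]).sum(dim=1)       # (Q,)
```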
arXiv Detail & Related papers (2023-05-31T17:59:26Z) - Integrated Object Deformation and Contact Patch Estimation from
Visuo-Tactile Feedback [8.420670642409219]
We propose a representation that jointly models object deformations and contact patches from visuo-tactile feedback.
We propose a neural network architecture to learn an NDCF and train it using simulated data.
We demonstrate that the learned NDCF transfers directly to the real-world without the need for fine-tuning.
arXiv Detail & Related papers (2023-05-23T18:53:24Z) - Visual-Tactile Sensing for In-Hand Object Reconstruction [38.42487660352112]
We propose a visual-tactile in-hand object reconstruction framework, VTacO, and extend it to VTacOH for hand-object reconstruction.
A simulation environment, VT-Sim, supports generating hand-object interaction for both rigid and deformable objects.
arXiv Detail & Related papers (2023-03-25T15:16:31Z)
- Physical Interaction: Reconstructing Hand-object Interactions with Physics [17.90852804328213]
The paper proposes a physics-based method to better solve the ambiguities in the reconstruction.
It first proposes a force-based dynamic model of the in-hand object, which recovers the unobserved contacts and also solves for plausible contact forces.
Experiments show that the proposed technique reconstructs both physically plausible and more accurate hand-object interaction.
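Solving for plausible contact forces can be illustrated as a small inverse-dynamics problem: find non-negative normal force magnitudes that, together with gravity, explain the object's observed acceleration. The NumPy sketch below ignores friction and torque balance, so it is a deliberate simplification of any full force model:

```python
# A minimal least-squares sketch of recovering plausible contact forces:
# find non-negative magnitudes f_i with sum_i f_i * n_i + m*g ≈ m*a.
# Friction and torque balance are deliberately ignored; this is a
# simplification for illustration, not the paper's dynamic model.
import numpy as np
from scipy.optimize import nnls

def solve_contact_forces(normals, mass, accel, g=(0.0, 0.0, -9.8)):
    """normals: (C, 3) unit contact normals; accel: (3,) observed acceleration."""
    target = mass * (np.asarray(accel) - np.asarray(g))  # force the contacts must supply
    # Columns of the system matrix are the contact normals; solve for f >= 0.
    f, residual = nnls(np.asarray(normals).T, target)
    return f, residual
```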
arXiv Detail & Related papers (2022-09-22T07:41:31Z)
- Physically Plausible Pose Refinement using Fully Differentiable Forces [68.8204255655161]
We propose an end-to-end differentiable model that refines pose estimates by learning the forces experienced by the object.
By matching the learned net force to an estimate of net force based on finite differences of position, this model is able to find forces that accurately describe the movement of the object.
We show this model successfully corrects poses and finds contact maps that better match the ground truth, despite not using any RGB or depth image data.
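The finite-difference estimate referred to here is the discrete second derivative of position: F_net ≈ m (x_{t+1} - 2 x_t + x_{t-1}) / Δt². A minimal PyTorch sketch of the matching loss, with illustrative tensor shapes:

```python
# A minimal sketch of matching a learned net force to the finite-difference
# estimate F ≈ m * (x[t+1] - 2*x[t] + x[t-1]) / dt**2. Tensor shapes and the
# mass handling are illustrative assumptions.
import torch

def net_force_loss(pred_force, positions, mass, dt):
    """pred_force: (T-2, 3) learned forces; positions: (T, 3) object trajectory."""
    accel = (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt ** 2
    fd_force = mass * accel                     # Newton's second law, discretized
    return torch.mean((pred_force - fd_force) ** 2)
```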
arXiv Detail & Related papers (2021-05-17T23:33:04Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
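Photometric consistency means that a surface point projected into two nearby frames should have a similar color. A minimal sketch of such a loss, sampling image colors at projected vertex locations with bilinear interpolation; projection and visibility handling are omitted simplifications:

```python
# A minimal sketch of a photometric-consistency loss between two frames:
# the same mesh vertices, projected into each frame, should sample similar
# colors. `uv_t`/`uv_s` are projected vertex locations in [-1, 1] normalized
# image coordinates; occlusion/visibility handling is omitted for brevity.
import torch
import torch.nn.functional as F

def photometric_loss(img_t, img_s, uv_t, uv_s):
    """img_*: (1, 3, H, W) frames; uv_*: (N, 2) normalized vertex projections."""
    grid_t = uv_t.view(1, 1, -1, 2)             # grid_sample expects (B, Ho, Wo, 2)
    grid_s = uv_s.view(1, 1, -1, 2)
    col_t = F.grid_sample(img_t, grid_t, align_corners=True)  # (1, 3, 1, N)
    col_s = F.grid_sample(img_s, grid_s, align_corners=True)
    # Penalize color differences of corresponding surface points across time.
    return (col_t - col_s).abs().mean()
```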
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.