Stability-driven Contact Reconstruction From Monocular Color Images
- URL: http://arxiv.org/abs/2205.00848v1
- Date: Mon, 2 May 2022 12:23:06 GMT
- Title: Stability-driven Contact Reconstruction From Monocular Color Images
- Authors: Zimeng Zhao, Binghui Zuo, Wei Xie, Yangang Wang
- Abstract summary: Physical contact provides additional constraints for hand-object state reconstruction.
Existing methods optimize the hand-object contact driven by distance threshold or prior from contact-labeled datasets.
Our key idea is to reconstruct the contact pattern directly from monocular images, and then utilize the physical stability criterion in the simulation to optimize it.
- Score: 7.427212296770506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physical contact provides additional constraints for hand-object state
reconstruction as well as a basis for further understanding of interaction
affordances. Estimating these severely occluded regions from monocular images
presents a considerable challenge. Existing methods optimize the hand-object
contact driven by distance threshold or prior from contact-labeled datasets.
However, because these indoor datasets involve only a limited number of subjects and
objects, the learned contact patterns do not generalize easily. Our key
idea is to reconstruct the contact pattern directly from
monocular images, and then utilize the physical stability criterion in the
simulation to optimize it. This criterion is defined by the resultant forces
and contact distribution computed by the physics engine. Compared to existing
solutions, our framework can be adapted to more personalized hands and diverse
object shapes. Furthermore, an interaction dataset with extra physical
attributes is created to verify the sim-to-real consistency of our methods.
Through comprehensive evaluations, hand-object contact can be reconstructed
with both accuracy and stability by the proposed framework.
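The stability criterion above is defined by the resultant forces and contact distribution that a physics engine computes for the reconstructed grasp. As a rough intuition for how such a criterion can be scored, here is a minimal, hypothetical static-equilibrium sketch (not the paper's simulator-based implementation): given estimated contact normals, solve for non-negative push forces that best cancel gravity and report the residual net force.

```python
import numpy as np

def stability_score(contact_normals, mass=0.1, g=(0.0, 0.0, -9.81)):
    """Toy static-equilibrium check (a hypothetical sketch, not the
    paper's physics-engine criterion): find non-negative force
    magnitudes along each contact normal that best cancel gravity,
    then report the magnitude of the resultant force.

    contact_normals : (n, 3) inward unit normals of estimated contacts
    Returns a residual near zero when the contact pattern can hold the
    object in static equilibrium.
    """
    N = np.asarray(contact_normals, dtype=float)   # (n, 3) normals
    gravity = mass * np.asarray(g, dtype=float)    # external force
    # Least-squares force magnitude per normal, clipped so that
    # contacts can only push, never pull.
    lam, *_ = np.linalg.lstsq(N.T, -gravity, rcond=None)
    lam = np.clip(lam, 0.0, None)
    residual = N.T @ lam + gravity                 # resultant force
    return float(np.linalg.norm(residual))
```

A single upward-facing contact normal yields a residual near zero (the object can rest on it), while a single horizontal normal leaves gravity uncancelled; a full physics engine additionally accounts for friction, torque balance, and the contact distribution over time.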
Related papers
- PhyRecon: Physically Plausible Neural Scene Reconstruction [81.73129450090684]
We introduce PHYRECON, the first approach to leverage both differentiable rendering and differentiable physics simulation to learn implicit surface representations.
Central to this design is an efficient transformation between SDF-based implicit representations and explicit surface points.
Our results also exhibit superior physical stability in physical simulators, with at least a 40% improvement across all datasets.
arXiv Detail & Related papers (2024-04-25T15:06:58Z)
- NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction [19.957593804898064]
We present a novel free-viewpoint rendering framework, Neural Contact Radiance Field (NCRF), to reconstruct hand-object interactions from a sparse set of videos.
We jointly learn these key components where they mutually help and regularize each other with visual and geometric constraints.
Our approach outperforms the current state-of-the-art in terms of both rendering quality and pose estimation accuracy.
arXiv Detail & Related papers (2024-02-08T10:09:12Z) - DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via
Physics Simulation [81.11585774044848]
We present DeepSimHO, a novel deep-learning pipeline that combines forward physics simulation and backward gradient approximation with a neural network.
Our method noticeably improves the stability of the estimation and achieves superior efficiency over test-time optimization.
arXiv Detail & Related papers (2023-10-11T05:34:36Z)
- Nonrigid Object Contact Estimation With Regional Unwrapping Transformer [16.988812837693203]
Acquiring contact patterns between hands and nonrigid objects is a common concern in the vision and robotics community.
Existing learning-based methods focus mainly on contact with rigid objects estimated from monocular images.
We propose a novel hand-object contact representation called RUPs, which unwraps the roughly estimated hand-object surfaces as multiple high-resolution 2D regional profiles.
arXiv Detail & Related papers (2023-08-27T11:37:26Z)
- Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video [104.69686024776396]
Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many interfering factors.
Previous works only leverage information from a single RGB image without modeling their physically plausible relation.
In this work, we are dedicated to explicitly exploiting spatial-temporal information to achieve better interacting hand reconstruction.
arXiv Detail & Related papers (2023-08-08T06:16:37Z)
- Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space.
arXiv Detail & Related papers (2023-05-31T17:59:26Z)
- Integrated Object Deformation and Contact Patch Estimation from Visuo-Tactile Feedback [8.420670642409219]
We propose a representation that jointly models object deformations and contact patches from visuo-tactile feedback.
We propose a neural network architecture to learn a NDCF, and train it using simulated data.
We demonstrate that the learned NDCF transfers directly to the real-world without the need for fine-tuning.
arXiv Detail & Related papers (2023-05-23T18:53:24Z)
- Visual-Tactile Sensing for In-Hand Object Reconstruction [38.42487660352112]
We propose a visual-tactile in-hand object reconstruction framework VTacO, and extend it to VTacOH for hand-object reconstruction.
A simulation environment, VT-Sim, supports generating hand-object interaction for both rigid and deformable objects.
arXiv Detail & Related papers (2023-03-25T15:16:31Z)
- Physical Interaction: Reconstructing Hand-object Interactions with Physics [17.90852804328213]
The paper proposes a physics-based method to better solve the ambiguities in the reconstruction.
It first proposes a force-based dynamic model of the in-hand object, which recovers the unobserved contacts and also solves for plausible contact forces.
Experiments show that the proposed technique reconstructs hand-object interactions that are both physically plausible and more accurate.
arXiv Detail & Related papers (2022-09-22T07:41:31Z)
- Physically Plausible Pose Refinement using Fully Differentiable Forces [68.8204255655161]
We propose an end-to-end differentiable model that refines pose estimates by learning the forces experienced by the object.
By matching the learned net force to an estimate of net force based on finite differences of position, this model is able to find forces that accurately describe the movement of the object.
We show this model successfully corrects poses and finds contact maps that better match the ground truth, despite not using any RGB or depth image data.
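The net-force target described in this summary comes from finite differences of the object's position. A minimal sketch of that estimate (an illustration of the standard central-difference formula, not the paper's code) is F_t ≈ m · (x_{t+1} − 2 x_t + x_{t−1}) / Δt²:

```python
import numpy as np

def net_force_fd(positions, dt, mass):
    """Finite-difference net-force estimate from an object trajectory:
    F_t = mass * (x_{t+1} - 2 x_t + x_{t-1}) / dt**2  (central difference).

    positions : (T, 3) object center-of-mass positions over time
    Returns an (T-2, 3) array of per-frame net-force estimates.
    """
    x = np.asarray(positions, dtype=float)
    # Second-order central difference approximates acceleration.
    accel = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / dt**2
    return mass * accel
```

For a trajectory in free fall, the estimate recovers the gravitational force m·g at every interior frame, which is the reference the learned forces are matched against.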
arXiv Detail & Related papers (2021-05-17T23:33:04Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.