Reconstructing Hand-Held Objects from Monocular Video
- URL: http://arxiv.org/abs/2211.16835v1
- Date: Wed, 30 Nov 2022 09:14:58 GMT
- Title: Reconstructing Hand-Held Objects from Monocular Video
- Authors: Di Huang, Xiaopeng Ji, Xingyi He, Jiaming Sun, Tong He, Qing Shuai,
Wanli Ouyang, Xiaowei Zhou
- Abstract summary: This paper presents an approach that reconstructs a hand-held object from a monocular video.
In contrast to many recent methods that directly predict object geometry by a trained network, the proposed approach does not require any learned prior about the object.
- Score: 95.06750686508315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an approach that reconstructs a hand-held object from a
monocular video. In contrast to many recent methods that directly predict
object geometry by a trained network, the proposed approach does not require
any learned prior about the object and is able to recover more accurate and
detailed object geometry. The key idea is that the hand motion naturally
provides multiple views of the object and the motion can be reliably estimated
by a hand pose tracker. Then, the object geometry can be recovered by solving a
multi-view reconstruction problem. We devise an implicit neural
representation-based method to solve the reconstruction problem and address the
issues of imprecise hand pose estimation, relative hand-object motion, and
insufficient geometry optimization for small objects. We also provide a newly
collected dataset with 3D ground truth to validate the proposed approach.
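To make the key idea concrete, the following is a minimal sketch (not the authors' implementation): under a rigid grasp, each frame's hand pose from an off-the-shelf tracker doubles as an object-to-camera transform, and inverting it gives the camera's pose in a hand-fixed object frame, so the video becomes an ordinary multi-view capture. The 4x4 matrix interface and function name are illustrative assumptions.

```python
import numpy as np

def object_view_poses(hand_poses_cam):
    """Turn per-frame tracked hand poses into multi-view camera poses.

    hand_poses_cam: iterable of 4x4 rigid transforms mapping the hand
    frame (and, under a rigid grasp, the object frame) into the camera
    frame, e.g. from an off-the-shelf hand pose tracker.
    Returns the inverses: the camera's pose in a fixed object frame,
    one per frame, ready to seed a multi-view reconstructor.
    """
    return [np.linalg.inv(T) for T in hand_poses_cam]
```

In the full approach these poses would only initialize the reconstruction, since the abstract notes that imprecise hand pose estimation and relative hand-object motion must still be handled during optimization.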
Related papers
- Sparse multi-view hand-object reconstruction for unseen environments [31.604141859402187]
We train our model on a synthetic hand-object dataset and evaluate directly on a real world recorded hand-object dataset with unseen objects.
We show that while reconstruction of unseen hands and objects from RGB is challenging, additional views can help improve the reconstruction quality.
arXiv Detail & Related papers (2024-05-02T15:01:25Z)
- ShapeGraFormer: GraFormer-Based Network for Hand-Object Reconstruction from a Single Depth Map [11.874184782686532]
We propose the first approach for realistic 3D hand-object shape and pose reconstruction from a single depth map.
Our pipeline additionally predicts voxelized hand-object shapes with a one-to-one mapping to the input voxelized depth.
In addition, we show the impact of adding another GraFormer component that refines the reconstructed shapes based on the hand-object interactions.
arXiv Detail & Related papers (2023-10-18T09:05:57Z)
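The one-to-one voxel correspondence mentioned above can be pictured with plain pinhole back-projection. A hedged sketch under assumed intrinsics, grid size, and bounds; it is not the paper's pipeline.

```python
import numpy as np

def voxelize_depth(depth, fx, fy, cx, cy, grid=64, half_extent=0.15):
    """Back-project a depth map (meters) into a binary occupancy grid.

    fx, fy, cx, cy are pinhole intrinsics. The grid is a cube of side
    2 * half_extent centered on the observed points; all values here
    are illustrative assumptions, not the paper's configuration.
    """
    v, u = np.nonzero(depth > 0)           # pixel coordinates with valid depth
    z = depth[v, u]
    x = (u - cx) * z / fx                  # pinhole back-projection
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1)
    pts -= pts.mean(axis=0)                # center the cloud in the cube

    idx = ((pts + half_extent) / (2 * half_extent) * grid).astype(int)
    keep = np.all((idx >= 0) & (idx < grid), axis=1)
    occ = np.zeros((grid, grid, grid), dtype=bool)
    occ[tuple(idx[keep].T)] = True         # one voxel per observed pixel
    return occ
```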
- Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space.
arXiv Detail & Related papers (2023-05-31T17:59:26Z)
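One plausible reading of the contact-diffusion step above: spread per-vertex contact probabilities from the hand mesh surface to nearby 3D query points with a distance-weighted kernel, so the implicit reconstruction can consume them anywhere in space. The Gaussian kernel and its bandwidth are assumptions for illustration, not the paper's actual operator.

```python
import numpy as np

def diffuse_contact(query_pts, hand_verts, contact_prob, sigma=0.01):
    """Spread per-vertex contact probabilities from the hand mesh
    surface to nearby 3D query points with a Gaussian distance kernel.

    query_pts:    (Q, 3) points where the implicit surface is queried
    hand_verts:   (V, 3) hand mesh vertices
    contact_prob: (V,)   estimated contact probability per vertex
    sigma:        kernel bandwidth in meters (an assumed value)
    """
    # Pairwise distances between query points and hand vertices: (Q, V)
    d = np.linalg.norm(query_pts[:, None, :] - hand_verts[None, :, :], axis=-1)
    w = np.exp(-0.5 * (d / sigma) ** 2)
    # Normalized, distance-weighted average of the vertex contact states.
    return (w * contact_prob[None, :]).sum(-1) / (w.sum(-1) + 1e-8)
```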
- Uncertainty Guided Policy for Active Robotic 3D Reconstruction using Neural Radiance Fields [82.21033337949757]
This paper introduces a ray-based volumetric uncertainty estimator, which computes the entropy of the weight distribution of the color samples along each ray of the object's implicit neural representation.
We show that it is possible to infer the uncertainty of the underlying 3D geometry given a novel view with the proposed estimator.
We present a next-best-view selection policy guided by the ray-based volumetric uncertainty in neural radiance fields-based representations.
arXiv Detail & Related papers (2022-09-17T21:28:57Z)
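The estimator described above has a compact form: normalize the standard volume-rendering weights along a ray into a probability distribution and take its entropy. A minimal sketch under that reading; variable names are assumed.

```python
import numpy as np

def ray_weight_entropy(sigmas, deltas, eps=1e-10):
    """Entropy of the volume-rendering weight distribution along one ray.

    sigmas: (N,) predicted densities at the sampled points
    deltas: (N,) distances between consecutive samples
    A peaked distribution (one confident surface hit) gives low entropy;
    a spread-out one (uncertain geometry) gives high entropy.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                 # standard NeRF ray weights
    p = weights / (weights.sum() + eps)      # normalize to a distribution
    return -(p * np.log(p + eps)).sum()
```

A next-best-view policy can then score each candidate view by, e.g., its mean ray entropy and select the most uncertain one.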
- What's in your hands? 3D Reconstruction of Generic Objects in Hands [49.12461675219253]
Our work aims to reconstruct hand-held objects given a single RGB image.
In contrast to prior works that typically assume known 3D templates and reduce the problem to 3D pose estimation, our work reconstructs generic hand-held objects without knowing their 3D templates.
arXiv Detail & Related papers (2022-04-14T17:59:02Z)
- Towards unconstrained joint hand-object reconstruction from RGB videos [81.97694449736414]
Reconstructing hand-object manipulations holds great potential for robotics and learning from human demonstrations.
We first propose a learning-free fitting approach for hand-object reconstruction which can seamlessly handle two-hand object interactions.
arXiv Detail & Related papers (2021-08-16T12:26:34Z)
- A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views [49.03830902235915]
Estimating the 3D shape of an object from a single or multiple images has gained popularity thanks to the recent breakthroughs powered by deep learning.
This paper proposes to rely on viewpoint-variant reconstructions by merging the visible information from the given views.
To validate the proposed method, we perform a comprehensive evaluation on the ShapeNet reference benchmark in terms of relative pose estimation and 3D shape reconstruction.
arXiv Detail & Related papers (2020-11-17T09:59:32Z)
- Reconstruct, Rasterize and Backprop: Dense shape and pose estimation from a single image [14.9851111159799]
This paper presents a new system to obtain dense object reconstructions along with 6-DoF poses from a single image.
We leverage recent advances in differentiable rendering (in particular, rasterization) to close the loop with 3D reconstruction in camera frame.
arXiv Detail & Related papers (2020-04-25T20:53:43Z)
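A hedged sketch of the render-and-compare loop such a system implies: rasterize the current reconstruction under a 6-DoF pose, compare against the observed silhouette, and backpropagate through the renderer into the pose. Here `render_silhouette` is an assumed stand-in for a differentiable rasterizer, not a specific library API, and the 6D rotation vector is one common continuous parameterization.

```python
import torch

def rot6d_to_matrix(r):
    """Gram-Schmidt a 6D vector into a rotation matrix
    (a common continuous rotation parameterization)."""
    a, b = r[:3], r[3:]
    x = a / a.norm()
    y = b - (x * b).sum() * x
    y = y / y.norm()
    z = torch.linalg.cross(x, y)
    return torch.stack([x, y, z], dim=-1)

def refine_pose(render_silhouette, mesh, target_mask, steps=200, lr=1e-2):
    """Fit a 6-DoF pose by differentiable render-and-compare.

    render_silhouette(mesh, R, t) -> (H, W) soft mask is an assumed
    interface onto any differentiable rasterizer.
    """
    rot6d = torch.tensor([1., 0., 0., 0., 1., 0.], requires_grad=True)
    trans = torch.zeros(3, requires_grad=True)
    opt = torch.optim.Adam([rot6d, trans], lr=lr)
    for _ in range(steps):
        pred = render_silhouette(mesh, rot6d_to_matrix(rot6d), trans)
        # Image-space loss between rendered and observed silhouettes.
        loss = torch.nn.functional.binary_cross_entropy(
            pred.clamp(1e-6, 1 - 1e-6), target_mask)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return rot6d_to_matrix(rot6d).detach(), trans.detach()
```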
This list is automatically generated from the titles and abstracts of the papers on this site.