TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning
- URL: http://arxiv.org/abs/2209.00489v1
- Date: Thu, 1 Sep 2022 14:19:05 GMT
- Title: TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning
- Authors: Andrea Ziani, Zicong Fan, Muhammed Kocabas, Sammy Christen, Otmar
Hilliges
- Abstract summary: We introduce TempCLR, a new time-coherent contrastive learning approach for the structured regression task of 3D hand reconstruction.
Our framework considers temporal consistency in its augmentation scheme, and accounts for the differences of hand poses along the temporal direction.
Our approach improves the performance of fully-supervised hand reconstruction methods by 15.9% and 7.6% in PA-V2V on the HO-3D and FreiHAND datasets respectively.
- Score: 30.823358555054856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce TempCLR, a new time-coherent contrastive learning approach for
the structured regression task of 3D hand reconstruction. Unlike previous
time-contrastive methods for hand pose estimation, our framework considers
temporal consistency in its augmentation scheme, and accounts for the
differences of hand poses along the temporal direction. Our data-driven method
leverages unlabelled videos and a standard CNN, without relying on synthetic
data, pseudo-labels, or specialized architectures. Our approach improves the
performance of fully-supervised hand reconstruction methods by 15.9% and 7.6%
in PA-V2V on the HO-3D and FreiHAND datasets respectively, thus establishing
new state-of-the-art performance. Finally, we demonstrate that our approach
produces smoother hand reconstructions through time, and is more robust to
heavy occlusions compared to the previous state-of-the-art, which we show
quantitatively and qualitatively. Our code and models will be available at
https://eth-ait.github.io/tempclr.
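The abstract does not give the exact training objective, so the following is only a minimal sketch of a time-contrastive setup of the kind described: frames that are temporal neighbours within the same clip (and share a clip-consistent augmentation) are treated as positives, while frames from other clips in the batch serve as negatives, under a standard InfoNCE loss. The encoder, temperature, and frame offset are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn.functional as F

def time_contrastive_loss(z_t, z_tk, temperature=0.1):
    # z_t, z_tk: (B, D) embeddings of frame t and a nearby frame t+k taken
    # from the same B clips, produced by the same CNN encoder under a
    # clip-consistent augmentation (assumed setup, not the authors' exact loss).
    z_t = F.normalize(z_t, dim=1)
    z_tk = F.normalize(z_tk, dim=1)
    logits = z_t @ z_tk.T / temperature                     # (B, B) cosine similarities
    labels = torch.arange(z_t.size(0), device=z_t.device)   # positive = same clip
    # Symmetric InfoNCE: each frame should retrieve its temporal neighbour.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

# Usage with dummy embeddings (batch of 8 clips, 128-D features).
z_a, z_b = torch.randn(8, 128), torch.randn(8, 128)
print(time_contrastive_loss(z_a, z_b).item())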
Related papers
- Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z) - HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions [68.28684509445529]
We present HandBooster, a new approach that increases data diversity to boost 3D hand-mesh reconstruction performance.
First, we construct versatile content-aware conditions to guide a diffusion model to produce realistic images with diverse hand appearances, poses, views, and backgrounds.
Then, we design a novel condition creator based on our similarity-aware distribution sampling strategies to deliberately find novel and realistic interaction poses that are distinct from the training set.
arXiv Detail & Related papers (2024-03-27T13:56:08Z) - HiFiHR: Enhancing 3D Hand Reconstruction from a Single Image via
High-Fidelity Texture [40.012406098563204]
We present HiFiHR, a high-fidelity hand reconstruction approach that uses render-and-compare within a learning-based framework to reconstruct hands from a single image.
Experimental results on public benchmarks including FreiHAND and HO-3D demonstrate that our method outperforms the state-of-the-art hand reconstruction methods in texture reconstruction quality.
arXiv Detail & Related papers (2023-08-25T18:48:40Z) - Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction
on Monocular RGB Video [104.69686024776396]
Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many interfering factors.
Previous works only leverage information from a single RGB image without modeling the physically plausible relation between the two hands.
In this work, we explicitly exploit spatial-temporal information to achieve better reconstruction of interacting hands.
arXiv Detail & Related papers (2023-08-08T06:16:37Z) - ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand
Reconstruction [30.073586754012645]
We present ACR (Attention Collaboration-based Regressor), which makes the first attempt to reconstruct hands in arbitrary scenarios.
We evaluate our method on various types of hand reconstruction datasets.
arXiv Detail & Related papers (2023-03-10T14:19:02Z) - SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation [48.456638103309544]
3D hand pose estimation based on RGB images has been studied for a long time.
We propose a novel method for generating a synthetic dataset that mimics natural human hand movements.
We show that utilizing temporal information for 3D hand pose estimation significantly improves general pose estimation.
arXiv Detail & Related papers (2020-07-10T05:11:14Z) - Reference Pose Generation for Long-term Visual Localization via Learned
Features and View Synthesis [88.80710311624101]
We propose a semi-automated approach to generate reference poses based on feature matching between renderings of a 3D model and real images via learned features.
We significantly improve the nighttime reference poses of the popular Aachen Day-Night dataset, showing that state-of-the-art visual localization methods perform better (up to 47%) than predicted by the original reference poses.
arXiv Detail & Related papers (2020-05-11T15:13:07Z) - Leveraging Photometric Consistency over Time for Sparsely Supervised
Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
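The photometric-consistency idea in the last entry can be illustrated with a rough, hypothetical sketch (not the paper's code): project the same predicted mesh vertices into two nearby frames using the per-frame pose estimates, bilinearly sample the image colours at those pixel locations, and penalise their difference. The pinhole projection, tensor shapes, and dummy inputs below are all assumptions made for illustration.

import torch
import torch.nn.functional as F

def project(verts, K):
    # Pinhole projection. verts: (B, V, 3) camera-space points, K: (B, 3, 3).
    uvw = torch.einsum('bij,bvj->bvi', K, verts)
    return uvw[..., :2] / uvw[..., 2:].clamp(min=1e-6)       # (B, V, 2) pixel coords

def sample_colours(img, uv):
    # img: (B, 3, H, W), uv: (B, V, 2) in pixels -> (B, V, 3) bilinear samples.
    B, _, H, W = img.shape
    grid = torch.stack([uv[..., 0] / (W - 1),
                        uv[..., 1] / (H - 1)], dim=-1) * 2 - 1
    samples = F.grid_sample(img, grid.unsqueeze(2), align_corners=True)  # (B, 3, V, 1)
    return samples.squeeze(-1).permute(0, 2, 1)

def photometric_loss(img_t, img_tk, verts_t, verts_tk, K):
    # The same surface point, seen in two nearby frames, should have a similar
    # colour if the per-frame pose predictions are consistent.
    col_t = sample_colours(img_t, project(verts_t, K))
    col_tk = sample_colours(img_tk, project(verts_tk, K))
    return (col_t - col_tk).abs().mean()

# Dummy usage: two nearby frames from 2 clips, 778 MANO-like vertices.
img_a, img_b = torch.rand(2, 3, 256, 256), torch.rand(2, 3, 256, 256)
verts = torch.rand(2, 778, 3)
verts[..., 2] += 0.3                                          # keep points in front of the camera
K = torch.tensor([[500., 0., 128.], [0., 500., 128.], [0., 0., 1.]]).expand(2, 3, 3)
print(photometric_loss(img_a, img_b, verts, verts + 0.01 * torch.randn_like(verts), K).item())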