Deep NRSfM++: Towards Unsupervised 2D-3D Lifting in the Wild
- URL: http://arxiv.org/abs/2001.10090v2
- Date: Wed, 31 Mar 2021 02:18:27 GMT
- Title: Deep NRSfM++: Towards Unsupervised 2D-3D Lifting in the Wild
- Authors: Chaoyang Wang and Chen-Hsuan Lin and Simon Lucey
- Abstract summary: We present a generalized strategy for improving learning-based NRSfM methods to tackle these issues.
Our approach, Deep NRSfM++, achieves state-of-the-art performance across numerous large-scale benchmarks.
- Score: 44.78174845839193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recovery of 3D shape and pose from 2D landmarks stemming from a large
ensemble of images can be viewed as a non-rigid structure from motion (NRSfM)
problem. Classical NRSfM approaches, however, are problematic as they rely on
heuristic priors on the 3D structure (e.g. low rank) that do not scale well to
large datasets. Learning-based methods are showing the potential to reconstruct
a much broader set of 3D structures than classical methods -- dramatically
expanding the importance of NRSfM to atemporal unsupervised 2D to 3D lifting.
Hitherto, these learning approaches have not been able to effectively model
perspective cameras or handle missing/occluded points -- limiting their
applicability to in-the-wild datasets. In this paper, we present a generalized
strategy for improving learning-based NRSfM methods to tackle the above issues.
Our approach, Deep NRSfM++, achieves state-of-the-art performance across
numerous large-scale benchmarks, outperforming both classical and
learning-based 2D-3D lifting methods.
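The low-rank prior the abstract criticizes, and the perspective-camera issue Deep NRSfM++ targets, can be made concrete in a few lines: classical NRSfM models each non-rigid 3D shape as a linear combination of K basis shapes, and 2D landmarks arise by projecting that shape. The sketch below is a minimal numpy illustration of these two projection models (all variable names are ours, not the paper's implementation), contrasting the orthographic projection most NRSfM methods assume with true perspective projection.

```python
import numpy as np

rng = np.random.default_rng(0)
K_bases, P_points = 3, 10

# Low-rank shape prior: each non-rigid 3D shape is a linear
# combination of K basis shapes B_k with per-frame coefficients c_k.
bases = rng.normal(size=(K_bases, 3, P_points))   # each B_k is (3, P)
coeffs = rng.normal(size=K_bases)                 # c_k for one frame
S = np.tensordot(coeffs, bases, axes=1)           # (3, P) non-rigid shape

# Orthographic (weak-perspective) camera: rotate, then drop depth.
R = np.eye(3)[:2]                                 # first two rows of a rotation
W_ortho = R @ S                                   # (2, P) 2D landmarks

# Perspective camera: divide by depth -- the case earlier
# learning-based NRSfM methods could not effectively model.
S_cam = S + np.array([[0.0], [0.0], [10.0]])      # translate in front of camera
W_persp = S_cam[:2] / S_cam[2:]                   # (2, P) normalized coordinates
```

The division by per-point depth makes perspective projection nonlinear in the shape, which is exactly what breaks the linear factorization that classical low-rank NRSfM relies on.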
Related papers
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- 3D-LFM: Lifting Foundation Model [29.48835001900286]
Deep learning has expanded our capability to reconstruct a wide range of object classes.
Our approach harnesses the inherent permutation equivariance of transformers to manage a varying number of points per 3D data instance.
We demonstrate state-of-the-art performance across 2D-3D lifting task benchmarks.
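The permutation equivariance that 3D-LFM relies on can be checked directly: self-attention without positional encodings maps a permuted set of input points to the correspondingly permuted outputs. Below is a toy single-head attention of our own construction (not the paper's model) demonstrating the property.

```python
import numpy as np

def self_attention(X):
    """Single-head self-attention with identity projections and no
    positional encoding; the output depends only on the set of points,
    so permuting the rows of X permutes the output rows identically."""
    scores = X @ X.T / np.sqrt(X.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ X

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))        # 5 points with 4-dim features
perm = rng.permutation(5)

# Permuting the input points permutes the output rows the same way.
assert np.allclose(self_attention(X)[perm], self_attention(X[perm]))
```

This is why such an architecture can consume landmark sets of varying size and order without retraining per object class.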
arXiv Detail & Related papers (2023-12-19T06:38:18Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Weakly-supervised Pre-training for 3D Human Pose Estimation via Perspective Knowledge [36.65402869749077]
We propose a novel method to extract weak 3D information directly from 2D images without 3D pose supervision.
We propose a weakly-supervised pre-training (WSP) strategy to distinguish the depth relationship between two points in an image.
WSP achieves state-of-the-art results on two widely-used benchmarks.
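The pairwise depth-ordering supervision WSP describes can be illustrated with a simple ranking objective. The function below is a generic ordinal-depth hinge loss of our own construction (the name, margin, and sign convention are assumptions), not the paper's actual training objective.

```python
def ordinal_depth_loss(z_a, z_b, a_is_closer, margin=0.1):
    """Ranking loss on a pair of predicted depths: if point a is labeled
    as closer to the camera, its depth should be smaller than point b's
    by at least `margin`; violations are penalized linearly (hinge)."""
    diff = z_b - z_a if a_is_closer else z_a - z_b
    return max(0.0, margin - diff)

# Correct ordering (a clearly closer): no penalty.
print(ordinal_depth_loss(1.0, 2.0, a_is_closer=True))   # 0.0
# Violated ordering: penalty grows linearly with the violation.
print(ordinal_depth_loss(2.0, 1.0, a_is_closer=True))   # 1.1
```

Such pairwise labels are far cheaper to obtain than full 3D pose annotations, which is what makes this form of weak supervision attractive for pre-training.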
arXiv Detail & Related papers (2022-11-22T03:35:15Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method by 2.80% on the moderate test setting, without using extra data.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- Procrustean Regression Networks: Learning 3D Structure of Non-Rigid Objects from 2D Annotations [42.476537776831314]
We propose a novel framework for training neural networks which is capable of learning 3D information of non-rigid objects.
The proposed framework shows superior reconstruction performance to the state-of-the-art method on the Human3.6M, 300-VW, and SURREAL datasets.
arXiv Detail & Related papers (2020-07-21T17:29:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.