NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D
Reconstruction
- URL: http://arxiv.org/abs/2211.16799v2
- Date: Wed, 13 Sep 2023 02:48:16 GMT
- Title: NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D
Reconstruction
- Authors: Bin Tan, Nan Xue, Tianfu Wu, Gui-Song Xia
- Abstract summary: This paper studies challenging two-view 3D reconstruction in a rigorous sparse-view configuration.
We present a novel Neural One-PlanE RANSAC framework that learns one-plane camera pose hypotheses from 3D plane correspondences.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies the challenging problem of two-view 3D reconstruction
in a rigorous sparse-view configuration, which suffers from insufficient
correspondences in the input image pairs for camera pose estimation. We present
a novel Neural One-PlanE RANSAC framework (NOPE-SAC for short) that learns
one-plane pose hypotheses from 3D plane correspondences. Built on top of a
siamese plane detection network, NOPE-SAC first generates putative plane
correspondences with a coarse initial pose. It then feeds the learned 3D plane
parameters of these correspondences into shared MLPs to estimate one-plane
camera pose hypotheses, which are subsequently reweighted in a RANSAC manner to
obtain the final camera pose. Because a neural one-plane pose requires only a
single plane correspondence, hypotheses can be generated adaptively, enabling
stable pose voting and reliable pose refinement from the few plane
correspondences available in sparse-view inputs. In the experiments, we
demonstrate that NOPE-SAC significantly improves camera pose estimation for
two-view inputs with severe viewpoint changes, setting several new
state-of-the-art results on two challenging benchmarks, i.e., MatterPort3D and
ScanNet, for sparse-view 3D reconstruction. The source code is released at
https://github.com/IceTTTb/NopeSAC for reproducible research.
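The hypothesis-voting step described in the abstract, where each plane correspondence yields one camera pose hypothesis and the hypotheses are then reweighted in a RANSAC manner, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the flat 6-vector pose parameterization, the `pose_distance` metric, the inlier threshold, and the toy hypotheses are all hypothetical, and the real NOPE-SAC scores and refines hypotheses with learned networks rather than a hand-crafted distance.

```python
import math


def pose_distance(p, q):
    """Euclidean distance between two toy 6-DoF pose vectors
    (3 rotation params + 3 translation params)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def one_plane_pose_voting(hypotheses, inlier_thresh=0.1):
    """RANSAC-style consensus: score each one-plane pose hypothesis by
    how many hypotheses (including itself) agree with it within a
    threshold, and return the highest-scoring pose."""
    best_pose, best_votes = None, -1
    for h in hypotheses:
        votes = sum(1 for g in hypotheses
                    if pose_distance(h, g) < inlier_thresh)
        if votes > best_votes:
            best_pose, best_votes = h, votes
    return best_pose, best_votes


# Toy example: three mutually consistent hypotheses and one outlier,
# mimicking one pose hypothesis per plane correspondence.
hyps = [
    (0.01, 0.0, 0.0, 1.0, 0.0, 0.0),
    (0.02, 0.0, 0.0, 1.0, 0.0, 0.0),
    (0.00, 0.0, 0.0, 1.0, 0.0, 0.0),
    (0.90, 0.5, 0.2, 0.0, 1.0, 0.0),  # outlier from a bad correspondence
]
pose, votes = one_plane_pose_voting(hyps)  # consensus pose gets 3 votes
```

Because each hypothesis comes from a single correspondence, the voting remains meaningful even when only a handful of plane matches exist, which is the sparse-view regime the paper targets.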
Related papers
- MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction
This paper presents a generalizable 3D plane detection and reconstruction framework named MonoPlane.
We first leverage large-scale pre-trained neural networks to obtain the depth and surface normals from a single image.
These monocular geometric cues are then incorporated into a proximity-guided RANSAC framework to sequentially fit each plane instance.
arXiv Detail & Related papers (2024-11-02T12:15:29Z)
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- PlaneRecTR++: Unified Query Learning for Joint 3D Planar Reconstruction and Pose Estimation
PlaneRecTR++ is a Transformer-based architecture that unifies all sub-tasks related to multi-view reconstruction and pose estimation.
Our proposed unified learning achieves mutual benefits across sub-tasks, obtaining a new state-of-the-art performance on public ScanNetv1, ScanNetv2, NYUv2-Plane, and MatterPort3D datasets.
arXiv Detail & Related papers (2023-07-25T18:28:19Z)
- Few-View Object Reconstruction with Unknown Categories and Camera Poses
This work explores reconstructing general real-world objects from a few images without known camera poses or object categories.
The crux of our work is solving two fundamental 3D vision problems -- shape reconstruction and pose estimation.
Our method FORGE predicts 3D features from each view and leverages them in conjunction with the input images to establish cross-view correspondence.
arXiv Detail & Related papers (2022-12-08T18:59:02Z)
- Stochastic Modeling for Learnable Human Pose Triangulation
We propose a modeling framework for 3D human pose triangulation and evaluate its performance across different datasets and spatial camera arrangements.
The proposed pose triangulation model successfully generalizes to different camera arrangements and between two public datasets.
arXiv Detail & Related papers (2021-10-01T09:26:25Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds
We propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances and per-part pose tracking for articulated objects.
Our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVC) at 12 FPS, the fastest among compared methods.
arXiv Detail & Related papers (2021-04-08T00:14:58Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.