GeoCalib: Learning Single-image Calibration with Geometric Optimization
- URL: http://arxiv.org/abs/2409.06704v2
- Date: Thu, 17 Oct 2024 07:14:12 GMT
- Title: GeoCalib: Learning Single-image Calibration with Geometric Optimization
- Authors: Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger, Marc Pollefeys
- Abstract summary: From a single image, visual cues can help deduce intrinsic and extrinsic camera parameters like the focal length and the gravity direction.
Current approaches to this problem are based on either classical geometry with lines and vanishing points or on deep neural networks trained end-to-end.
We introduce GeoCalib, a deep neural network that leverages universal rules of 3D geometry through an optimization process.
- Score: 89.84142934465685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: From a single image, visual cues can help deduce intrinsic and extrinsic camera parameters like the focal length and the gravity direction. This single-image calibration can benefit various downstream applications like image editing and 3D mapping. Current approaches to this problem are based on either classical geometry with lines and vanishing points or on deep neural networks trained end-to-end. The learned approaches are more robust but struggle to generalize to new environments and are less accurate than their classical counterparts. We hypothesize that they lack the constraints that 3D geometry provides. In this work, we introduce GeoCalib, a deep neural network that leverages universal rules of 3D geometry through an optimization process. GeoCalib is trained end-to-end to estimate camera parameters and learns to find useful visual cues from the data. Experiments on various benchmarks show that GeoCalib is more robust and more accurate than existing classical and learned approaches. Its internal optimization estimates uncertainties, which help flag failure cases and benefit downstream applications like visual localization. The code and trained models are publicly available at https://github.com/cvg/GeoCalib.
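The abstract describes GeoCalib as combining learned visual cues with an optimization over camera parameters. The following is a minimal, hypothetical sketch of that idea in Python, assuming a network has already predicted a dense 2D up-vector field (the image-space projection of the direction opposite to gravity): focal length, roll, and pitch are then recovered by nonlinear least squares. The parameterization, residual, and the `predicted_up`/`uv` inputs are illustrative assumptions, not GeoCalib's actual interface.
```python
import numpy as np
from scipy.optimize import least_squares


def gravity_from_roll_pitch(roll, pitch):
    """Unit gravity direction in the camera frame (x right, y down, z forward)."""
    return np.array([
        -np.sin(roll) * np.cos(pitch),
        np.cos(roll) * np.cos(pitch),
        np.sin(pitch),
    ])


def modeled_up_field(params, uv, cx, cy):
    """Image-space up-vectors implied by focal length f and gravity (roll, pitch)."""
    f, roll, pitch = params
    up3 = -gravity_from_roll_pitch(roll, pitch)   # 3D "up" in the camera frame
    # Back-project pixels to viewing rays at unit depth.
    x = (uv[:, 0] - cx) / f
    y = (uv[:, 1] - cy) / f
    # Directional derivative of the pinhole projection along up3 at depth 1.
    du = f * (up3[0] - x * up3[2])
    dv = f * (up3[1] - y * up3[2])
    vec = np.stack([du, dv], axis=-1)
    return vec / (np.linalg.norm(vec, axis=-1, keepdims=True) + 1e-8)


def residuals(params, uv, predicted_up, cx, cy):
    return (modeled_up_field(params, uv, cx, cy) - predicted_up).ravel()


def calibrate(predicted_up, uv, width, height):
    """Fit (f, roll, pitch) to a predicted (N, 2) up-vector field at pixels uv (N, 2)."""
    cx, cy = width / 2.0, height / 2.0
    x0 = np.array([0.8 * max(width, height), 0.0, 0.0])   # rough initial guess
    bounds = ([1.0, -np.pi, -np.pi / 2], [1e5, np.pi, np.pi / 2])
    sol = least_squares(residuals, x0, args=(uv, predicted_up, cx, cy), bounds=bounds)
    f, roll, pitch = sol.x
    return f, roll, pitch
```
In GeoCalib itself, per the abstract, the optimization is embedded in the network and trained end-to-end, so the model also learns which visual cues to trust and can report uncertainties; the sketch above only mirrors the inference-time geometric fitting.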
Related papers
- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views [93.6881532277553]
We present FLARE, a feed-forward model designed to infer high-quality camera poses and 3D geometry from uncalibrated sparse-view images.
Our solution features a cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.
arXiv Detail & Related papers (2025-02-17T18:54:05Z)
- GeomGS: LiDAR-Guided Geometry-Aware Gaussian Splatting for Robot Localization [20.26969580492428]
We propose a novel 3DGS method called Geometry-Aware Gaussian Splatting (GeomGS).
Our GeomGS demonstrates state-of-the-art geometric and localization performance across several benchmarks, while also improving photometric performance.
arXiv Detail & Related papers (2025-01-23T06:43:38Z)
- GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos [18.90495041083675]
We introduce GSGTrack, a novel RGB-based pose tracking framework.
We propose an object silhouette loss to address the issue of pixel-wise losses being overly sensitive to pose noise during tracking (a hedged sketch of such a loss appears after this list).
Experiments on the OnePose and HO3D datasets demonstrate the effectiveness of GSGTrack in both 6DoF pose tracking and object reconstruction.
arXiv Detail & Related papers (2024-12-03T08:38:44Z)
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach that predicts high-quality assets with 512k Gaussians from 21 input images using only 11 GB of GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not exploit explicit geometric relationships between 3D structures and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- NeuroGF: A Neural Representation for Fast Geodesic Distance and Path Queries [77.04220651098723]
This paper presents the first attempt to represent geodesics on 3D mesh models using neural implicit functions.
Specifically, we introduce neural geodesic fields (NeuroGFs), which are learned to represent the all-pairs geodesics of a given mesh.
NeuroGFs exhibit exceptional performance in solving single-source all-destination (SSAD) and point-to-point geodesic queries.
arXiv Detail & Related papers (2023-06-01T13:32:21Z)
- GeoNeRF: Generalizing NeRF with Geometry Priors [2.578242050187029]
We present GeoNeRF, a generalizable photorealistic novel view synthesis method based on neural radiance fields.
Our approach consists of two main stages: a geometry reasoner and a synthesis stage.
Experiments show that GeoNeRF outperforms state-of-the-art generalizable neural rendering models on various synthetic and real datasets.
arXiv Detail & Related papers (2021-11-26T15:15:37Z)
- Back to the Feature: Learning Robust Camera Localization from Pixels to Pose [114.89389528198738]
We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model.
Given coarse pose priors, the system can localize in large environments and can also improve the accuracy of sparse feature matching.
arXiv Detail & Related papers (2021-03-16T17:40:12Z)
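The GSGTrack entry above mentions replacing a pixel-wise loss with an object silhouette loss that is less sensitive to pose noise during tracking. The abstract does not state the exact formulation, so the snippet below is a hedged sketch of one common choice, a soft-IoU silhouette loss in PyTorch; the function name and the soft-IoU form are illustrative assumptions rather than GSGTrack's actual loss.
```python
import torch


def silhouette_loss(rendered_mask: torch.Tensor,
                    observed_mask: torch.Tensor,
                    eps: float = 1e-6) -> torch.Tensor:
    """1 - soft IoU between a differentiably rendered object mask in [0, 1]
    and a binary observed silhouette, averaged over the batch.

    Both tensors have shape (B, H, W); the result is a scalar suitable for
    backpropagation into the pose parameters that produced the rendering.
    """
    intersection = (rendered_mask * observed_mask).sum(dim=(-2, -1))
    union = (rendered_mask + observed_mask
             - rendered_mask * observed_mask).sum(dim=(-2, -1))
    return (1.0 - intersection / (union + eps)).mean()
```
Because the IoU aggregates over the whole mask, small pose errors change the score smoothly instead of flipping individual pixel terms, which is the robustness property the GSGTrack summary alludes to.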