GeoCalib: Learning Single-image Calibration with Geometric Optimization
- URL: http://arxiv.org/abs/2409.06704v2
- Date: Thu, 17 Oct 2024 07:14:12 GMT
- Title: GeoCalib: Learning Single-image Calibration with Geometric Optimization
- Authors: Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger, Marc Pollefeys
- Abstract summary: From a single image, visual cues can help deduce intrinsic and extrinsic camera parameters like the focal length and the gravity direction.
Current approaches to this problem are based on either classical geometry with lines and vanishing points or on deep neural networks trained end-to-end.
We introduce GeoCalib, a deep neural network that leverages universal rules of 3D geometry through an optimization process.
- Score: 89.84142934465685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: From a single image, visual cues can help deduce intrinsic and extrinsic camera parameters like the focal length and the gravity direction. This single-image calibration can benefit various downstream applications like image editing and 3D mapping. Current approaches to this problem are based on either classical geometry with lines and vanishing points or on deep neural networks trained end-to-end. The learned approaches are more robust but struggle to generalize to new environments and are less accurate than their classical counterparts. We hypothesize that they lack the constraints that 3D geometry provides. In this work, we introduce GeoCalib, a deep neural network that leverages universal rules of 3D geometry through an optimization process. GeoCalib is trained end-to-end to estimate camera parameters and learns to find useful visual cues from the data. Experiments on various benchmarks show that GeoCalib is more robust and more accurate than existing classical and learned approaches. Its internal optimization estimates uncertainties, which help flag failure cases and benefit downstream applications like visual localization. The code and trained models are publicly available at https://github.com/cvg/GeoCalib.
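The abstract describes GeoCalib as combining learned visual cues with an optimization over camera parameters. The following is a minimal, hypothetical sketch of that idea in Python, assuming a network has already predicted a dense 2D up-vector field (the image-space projection of the direction opposite to gravity): focal length, roll, and pitch are then recovered by nonlinear least squares. The parameterization, residual, and the `predicted_up`/`uv` inputs are illustrative assumptions, not GeoCalib's actual interface.
```python
import numpy as np
from scipy.optimize import least_squares


def gravity_from_roll_pitch(roll, pitch):
    """Unit gravity direction in the camera frame (x right, y down, z forward)."""
    return np.array([
        -np.sin(roll) * np.cos(pitch),
        np.cos(roll) * np.cos(pitch),
        np.sin(pitch),
    ])


def modeled_up_field(params, uv, cx, cy):
    """Image-space up-vectors implied by focal length f and gravity (roll, pitch)."""
    f, roll, pitch = params
    up3 = -gravity_from_roll_pitch(roll, pitch)   # 3D "up" in the camera frame
    # Back-project pixels to viewing rays at unit depth.
    x = (uv[:, 0] - cx) / f
    y = (uv[:, 1] - cy) / f
    # Directional derivative of the pinhole projection along up3 at depth 1.
    du = f * (up3[0] - x * up3[2])
    dv = f * (up3[1] - y * up3[2])
    vec = np.stack([du, dv], axis=-1)
    return vec / (np.linalg.norm(vec, axis=-1, keepdims=True) + 1e-8)


def residuals(params, uv, predicted_up, cx, cy):
    return (modeled_up_field(params, uv, cx, cy) - predicted_up).ravel()


def calibrate(predicted_up, uv, width, height):
    """Fit (f, roll, pitch) to a predicted (N, 2) up-vector field at pixels uv (N, 2)."""
    cx, cy = width / 2.0, height / 2.0
    x0 = np.array([0.8 * max(width, height), 0.0, 0.0])   # rough initial guess
    bounds = ([1.0, -np.pi, -np.pi / 2], [1e5, np.pi, np.pi / 2])
    sol = least_squares(residuals, x0, args=(uv, predicted_up, cx, cy), bounds=bounds)
    f, roll, pitch = sol.x
    return f, roll, pitch
```
In GeoCalib itself, per the abstract, the optimization is embedded in the network and trained end-to-end, so the model also learns which visual cues to trust and can report uncertainties; the sketch above only mirrors the inference-time geometric fitting.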
Related papers
- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views [93.6881532277553]
We present FLARE, a feed-forward model designed to infer high-quality camera poses and 3D geometry from uncalibrated sparse-view images.
Our solution features a cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.
arXiv Detail & Related papers (2025-02-17T18:54:05Z)
- GeomGS: LiDAR-Guided Geometry-Aware Gaussian Splatting for Robot Localization [20.26969580492428]
We propose a novel 3DGS method called Geometry-Aware Gaussian Splatting (GeomGS).
Our GeomGS demonstrates state-of-the-art geometric and localization performance across several benchmarks, while also improving photometric performance.
arXiv Detail & Related papers (2025-01-23T06:43:38Z)
- GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos [18.90495041083675]
We introduce GSGTrack, a novel RGB-based pose tracking framework.
We propose an object silhouette loss to address the issue of pixel-wise losses being overly sensitive to pose noise during tracking (a hedged sketch of such a loss appears after this list).
Experiments on the OnePose and HO3D datasets demonstrate the effectiveness of GSGTrack in both 6DoF pose tracking and object reconstruction.
arXiv Detail & Related papers (2024-12-03T08:38:44Z)
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach that predicts high-quality assets with 512k Gaussians from 21 input images using only 11 GB of GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not exploit explicit geometric relationships between 3D structures and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- NeuroGF: A Neural Representation for Fast Geodesic Distance and Path Queries [77.04220651098723]
This paper presents the first attempt to represent geodesics on 3D mesh models using neural implicit functions.
Specifically, we introduce neural geodesic fields (NeuroGFs), which are learned to represent the all-pairs geodesics of a given mesh.
NeuroGFs exhibit exceptional performance in solving single-source all-destination (SSAD) and point-to-point geodesic queries.
arXiv Detail & Related papers (2023-06-01T13:32:21Z)
- GeoNeRF: Generalizing NeRF with Geometry Priors [2.578242050187029]
We present GeoNeRF, a generalizable photorealistic novel view synthesis method based on neural radiance fields.
Our approach consists of two main stages: a geometry reasoner and a synthesis stage.
Experiments show that GeoNeRF outperforms state-of-the-art generalizable neural rendering models on various synthetic and real datasets.
arXiv Detail & Related papers (2021-11-26T15:15:37Z)
- Back to the Feature: Learning Robust Camera Localization from Pixels to Pose [114.89389528198738]
We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model.
Given coarse pose priors, the system can localize in large environments and can also improve the accuracy of sparse feature matching.
arXiv Detail & Related papers (2021-03-16T17:40:12Z)
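The GSGTrack entry above mentions replacing a pixel-wise loss with an object silhouette loss that is less sensitive to pose noise during tracking. The abstract does not state the exact formulation, so the snippet below is a hedged sketch of one common choice, a soft-IoU silhouette loss in PyTorch; the function name and the soft-IoU form are illustrative assumptions rather than GSGTrack's actual loss.
```python
import torch


def silhouette_loss(rendered_mask: torch.Tensor,
                    observed_mask: torch.Tensor,
                    eps: float = 1e-6) -> torch.Tensor:
    """1 - soft IoU between a differentiably rendered object mask in [0, 1]
    and a binary observed silhouette, averaged over the batch.

    Both tensors have shape (B, H, W); the result is a scalar suitable for
    backpropagation into the pose parameters that produced the rendering.
    """
    intersection = (rendered_mask * observed_mask).sum(dim=(-2, -1))
    union = (rendered_mask + observed_mask
             - rendered_mask * observed_mask).sum(dim=(-2, -1))
    return (1.0 - intersection / (union + eps)).mean()
```
Because the IoU aggregates over the whole mask, small pose errors change the score smoothly instead of flipping individual pixel terms, which is the robustness property the GSGTrack summary alludes to.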