Camera Calibration through Geometric Constraints from Rotation and
Projection Matrices
- URL: http://arxiv.org/abs/2402.08437v2
- Date: Tue, 20 Feb 2024 12:25:58 GMT
- Title: Camera Calibration through Geometric Constraints from Rotation and
Projection Matrices
- Authors: Muhammad Waleed, Abdul Rauf, Murtaza Taj
- Abstract summary: We propose a novel constraints-based loss for estimating the intrinsic and extrinsic parameters of a camera.
Our methodology is a hybrid approach that employs the learning power of a neural network to estimate the desired parameters.
Our proposed approach demonstrates improvements across all parameters when compared to the state-of-the-art (SOTA) benchmarks.
- Score: 4.100632594106989
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The process of camera calibration involves estimating the intrinsic and
extrinsic parameters, which are essential for accurately performing tasks such
as 3D reconstruction, object tracking and augmented reality. In this work, we
propose a novel constraints-based loss for estimating the intrinsic (focal
length $(f_x, f_y)$ and principal point $(p_x, p_y)$) and extrinsic (baseline
$(b)$, disparity $(d)$, translation $(t_x, t_y, t_z)$, and rotation,
specifically pitch $(\theta_p)$) camera parameters. Our novel constraints are
based on geometric properties inherent in the camera model, including the
anatomy of the projection matrix (vanishing points, image of the world origin,
axis planes) and the orthonormality of the rotation matrix. Thus, we propose a
novel Unsupervised Geometric Constraint Loss (UGCL) via a
multitask learning framework. Our methodology is a hybrid approach that employs
the learning power of a neural network to estimate the desired parameters along
with the underlying mathematical properties inherent in the camera projection
matrix. This distinctive approach not only enhances the interpretability of the
model but also facilitates a more informed learning process. Additionally, we
introduce a new CVGL Camera Calibration dataset, featuring over 900
configurations of camera parameters, incorporating 63,600 image pairs that
closely mirror real-world conditions. By training and testing on both synthetic
and real-world datasets, our proposed approach demonstrates improvements across
all parameters when compared to the state-of-the-art (SOTA) benchmarks. The
code and the updated dataset can be found here:
https://github.com/CVLABLUMS/CVGL-Camera-Calibration
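The geometric properties the abstract appeals to can be made concrete with a small pinhole-camera sketch. The snippet below is a generic illustration under standard pinhole assumptions, not the authors' implementation; the function names and the simple squared-error penalty are our own:

```python
import numpy as np

def orthonormality_loss(R: np.ndarray) -> float:
    """Squared-error penalty that is zero iff R is orthonormal (R R^T = I).
    A constraint of this kind can be imposed on a predicted rotation."""
    return float(np.sum((R @ R.T - np.eye(3)) ** 2))

def projection_matrix(K: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Assemble the 3x4 pinhole projection matrix P = K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def anatomy(P: np.ndarray):
    """Read off the 'anatomy' of P: its first three columns are the images of
    the points at infinity along the world X, Y, Z axes (the vanishing points),
    and its last column is the image of the world origin."""
    vps = P[:, :3] / P[2, :3]   # normalize homogeneous image coordinates
    origin = P[:, 3] / P[2, 3]  # assumes the origin is off the principal plane
    return vps, origin
```

A constraint loss in this spirit would compare such derived quantities (vanishing points, image of the world origin) computed from the network's predicted parameters against those implied by the observed geometry, alongside the orthonormality penalty on the predicted rotation.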
Related papers
- Gravity-aligned Rotation Averaging with Circular Regression [53.81374943525774]
We introduce a principled approach that integrates gravity direction into the rotation averaging phase of global pipelines.
We achieve state-of-the-art accuracy on four large-scale datasets.
arXiv Detail & Related papers (2024-10-16T17:37:43Z)
- Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation [74.28509379811084]
Metric3D v2 is a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image.
We propose solutions for both metric depth estimation and surface normal estimation.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2024-03-22T02:30:46Z)
- Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image [85.91935485902708]
We show that the key to a zero-shot single-view metric depth model lies in the combination of large-scale data training and resolving the metric ambiguity from various camera models.
We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problems and can be effortlessly plugged into existing monocular models.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2023-07-20T16:14:23Z)
- DIME-Net: Neural Network-Based Dynamic Intrinsic Parameter Rectification for Cameras with Optical Image Stabilization System [16.390775530663618]
We propose a novel neural network-based approach that rectifies dynamically changing camera intrinsic parameters in real time for pose estimation and 3D reconstruction.
We name the proposed Dynamic Intrinsic pose estimation network as DIME-Net and have it implemented and tested on three different mobile devices.
In all cases, DIME-Net reduces reprojection error by at least 64%, indicating that our design is successful.
arXiv Detail & Related papers (2023-03-20T17:45:12Z)
- Multi-task Learning for Camera Calibration [3.274290296343038]
We present a unique method for predicting intrinsic (principal point offset and focal length) and extrinsic (baseline, pitch, and translation) properties from a pair of images.
By reconstructing the 3D points using a camera model neural network and then using the reconstruction loss to recover the camera specifications, this innovative camera projection loss (CPL) method allows the desired parameters to be estimated.
arXiv Detail & Related papers (2022-11-22T17:39:31Z)
- Self-Supervised Camera Self-Calibration from Video [34.35533943247917]
We propose a learning algorithm to regress per-sequence calibration parameters using an efficient family of general camera models.
Our procedure achieves self-calibration results with sub-pixel reprojection error, outperforming other learning-based methods.
arXiv Detail & Related papers (2021-12-06T19:42:05Z)
- Camera Calibration through Camera Projection Loss [4.36572039512405]
We propose a novel method to predict intrinsic (focal length and principal point offset) parameters using an image pair.
Unlike existing methods, we propose a new representation that incorporates the camera model equations as a neural network in a multi-task learning framework.
Our proposed approach achieves better performance with respect to both deep learning-based and traditional methods on 7 out of 10 parameters evaluated.
arXiv Detail & Related papers (2021-10-07T14:03:10Z)
- Self-Calibrating Neural Radiance Fields [68.64327335620708]
We jointly learn the geometry of the scene and the accurate camera parameters without any calibration objects.
Our camera model consists of a pinhole model, a fourth order radial distortion, and a generic noise model that can learn arbitrary non-linear camera distortions.
arXiv Detail & Related papers (2021-08-31T13:34:28Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- FLEX: Parameter-free Multi-view 3D Human Motion Reconstruction [70.09086274139504]
Multi-view algorithms strongly depend on camera parameters, in particular, the relative positions among the cameras.
We introduce FLEX, an end-to-end parameter-free multi-view model.
We demonstrate results on the Human3.6M and KTH Multi-view Football II datasets.
arXiv Detail & Related papers (2021-05-05T09:08:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all listed content) and is not responsible for any consequences arising from its use.