Camera Calibration through Geometric Constraints from Rotation and
Projection Matrices
- URL: http://arxiv.org/abs/2402.08437v2
- Date: Tue, 20 Feb 2024 12:25:58 GMT
- Title: Camera Calibration through Geometric Constraints from Rotation and
Projection Matrices
- Authors: Muhammad Waleed, Abdul Rauf, Murtaza Taj
- Abstract summary: We propose a novel constraints-based loss for estimating the intrinsic and extrinsic parameters of a camera.
Our methodology is a hybrid approach that employs the learning power of a neural network to estimate the desired parameters.
Our proposed approach demonstrates improvements across all parameters when compared to the state-of-the-art (SOTA) benchmarks.
- Score: 4.100632594106989
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The process of camera calibration involves estimating the intrinsic and
extrinsic parameters, which are essential for accurately performing tasks such
as 3D reconstruction, object tracking and augmented reality. In this work, we
propose a novel constraints-based loss for estimating the intrinsic (focal
length $(f_x, f_y)$ and principal point $(p_x, p_y)$) and extrinsic (baseline
$(b)$, disparity $(d)$, translation $(t_x, t_y, t_z)$, and rotation,
specifically pitch $(\theta_p)$) camera parameters. Our novel constraints are
based on geometric properties inherent in the camera model, including the
anatomy of the projection matrix (vanishing points, image of the world origin,
axis planes) and the orthonormality of the rotation matrix. Thus, we propose a
novel Unsupervised Geometric Constraint Loss (UGCL) via a
multitask learning framework. Our methodology is a hybrid approach that employs
the learning power of a neural network to estimate the desired parameters along
with the underlying mathematical properties inherent in the camera projection
matrix. This distinctive approach not only enhances the interpretability of the
model but also facilitates a more informed learning process. Additionally, we
introduce a new CVGL Camera Calibration dataset, featuring over 900
configurations of camera parameters, incorporating 63,600 image pairs that
closely mirror real-world conditions. By training and testing on both synthetic
and real-world datasets, our proposed approach demonstrates improvements across
all parameters when compared to the state-of-the-art (SOTA) benchmarks. The
code and the updated dataset can be found here:
https://github.com/CVLABLUMS/CVGL-Camera-Calibration
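The geometric properties the abstract appeals to can be made concrete with a small pinhole-camera sketch. The snippet below is a generic illustration under standard pinhole assumptions, not the authors' implementation; the function names and the simple squared-error penalty are our own:

```python
import numpy as np

def orthonormality_loss(R: np.ndarray) -> float:
    """Squared-error penalty that is zero iff R is orthonormal (R R^T = I).
    A constraint of this kind can be imposed on a predicted rotation."""
    return float(np.sum((R @ R.T - np.eye(3)) ** 2))

def projection_matrix(K: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Assemble the 3x4 pinhole projection matrix P = K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def anatomy(P: np.ndarray):
    """Read off the 'anatomy' of P: its first three columns are the images of
    the points at infinity along the world X, Y, Z axes (the vanishing points),
    and its last column is the image of the world origin."""
    vps = P[:, :3] / P[2, :3]   # normalize homogeneous image coordinates
    origin = P[:, 3] / P[2, 3]  # assumes the origin is off the principal plane
    return vps, origin
```

A constraint loss in this spirit would compare such derived quantities (vanishing points, image of the world origin) computed from the network's predicted parameters against those implied by the observed geometry, alongside the orthonormality penalty on the predicted rotation.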
Related papers
- Gravity-aligned Rotation Averaging with Circular Regression [53.81374943525774]
We introduce a principled approach that integrates gravity direction into the rotation averaging phase of global pipelines.
We achieve state-of-the-art accuracy on four large-scale datasets.
arXiv Detail & Related papers (2024-10-16T17:37:43Z)
- Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation [74.28509379811084]
Metric3D v2 is a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image.
We propose solutions for both metric depth estimation and surface normal estimation.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2024-03-22T02:30:46Z)
- Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image [85.91935485902708]
We show that the key to a zero-shot single-view metric depth model lies in the combination of large-scale data training and resolving the metric ambiguity from various camera models.
We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problems and can be effortlessly plugged into existing monocular models.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2023-07-20T16:14:23Z)
- DIME-Net: Neural Network-Based Dynamic Intrinsic Parameter Rectification for Cameras with Optical Image Stabilization System [16.390775530663618]
We propose a novel neural network-based approach that rectifies dynamically changing camera intrinsic parameters in real time for pose estimation and 3D reconstruction.
We name the proposed Dynamic Intrinsic pose estimation network as DIME-Net and have it implemented and tested on three different mobile devices.
In all cases, DIME-Net reduces reprojection error by at least 64%, indicating that our design is successful.
arXiv Detail & Related papers (2023-03-20T17:45:12Z)
- Multi-task Learning for Camera Calibration [3.274290296343038]
We present a unique method for predicting intrinsic (principal point offset and focal length) and extrinsic (baseline, pitch, and translation) properties from a pair of images.
By reconstructing the 3D points using a camera model neural network and then using the reconstruction loss to recover the camera specifications, this innovative camera projection loss (CPL) method allows the desired parameters to be estimated.
arXiv Detail & Related papers (2022-11-22T17:39:31Z)
- Self-Supervised Camera Self-Calibration from Video [34.35533943247917]
We propose a learning algorithm to regress per-sequence calibration parameters using an efficient family of general camera models.
Our procedure achieves self-calibration results with sub-pixel reprojection error, outperforming other learning-based methods.
arXiv Detail & Related papers (2021-12-06T19:42:05Z)
- Camera Calibration through Camera Projection Loss [4.36572039512405]
We propose a novel method to predict intrinsic (focal length and principal point offset) parameters using an image pair.
Unlike existing methods, we propose a new representation that incorporates the camera model equations as a neural network in a multi-task learning framework.
Our proposed approach achieves better performance with respect to both deep learning-based and traditional methods on 7 out of 10 parameters evaluated.
arXiv Detail & Related papers (2021-10-07T14:03:10Z)
- Self-Calibrating Neural Radiance Fields [68.64327335620708]
We jointly learn the geometry of the scene and the accurate camera parameters without any calibration objects.
Our camera model consists of a pinhole model, a fourth order radial distortion, and a generic noise model that can learn arbitrary non-linear camera distortions.
arXiv Detail & Related papers (2021-08-31T13:34:28Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- FLEX: Parameter-free Multi-view 3D Human Motion Reconstruction [70.09086274139504]
Multi-view algorithms strongly depend on camera parameters, in particular, the relative positions among the cameras.
We introduce FLEX, an end-to-end parameter-free multi-view model.
We demonstrate results on the Human3.6M and KTH Multi-view Football II datasets.
arXiv Detail & Related papers (2021-05-05T09:08:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all listed content) and is not responsible for any consequences arising from its use.