Multi-task Learning for Camera Calibration
- URL: http://arxiv.org/abs/2211.12432v1
- Date: Tue, 22 Nov 2022 17:39:31 GMT
- Title: Multi-task Learning for Camera Calibration
- Authors: Talha Hanif Butt, Murtaza Taj
- Abstract summary: We present a novel method for predicting intrinsic (principal point offset and focal length) and extrinsic (baseline, pitch, and translation) parameters from a pair of images.
By reconstructing the 3D points using a camera model neural network and then using the reconstruction loss to recover the camera specifications, this camera projection loss (CPL) allows the desired parameters to be estimated.
- Score: 3.274290296343038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Camera calibration is essential for a number of tasks, such as 3D
reconstruction, robotic interfaces, and autonomous driving. In this study, we
present a novel method for predicting intrinsic (principal point offset and
focal length) and extrinsic (baseline, pitch, and translation) parameters
from a pair of images. In contrast to existing methods, which learn an
end-to-end solution, we propose a new representation that incorporates the
camera model equations as a neural network in a multi-task learning
framework. By reconstructing the 3D points using a camera model neural
network and then using the reconstruction loss to recover the camera
specifications, this camera projection loss (CPL) allows the desired
parameters to be estimated. To the best of our knowledge, ours is the first
approach that embeds the mathematical camera model equations in a multi-task
learning framework to jointly estimate both the extrinsic and intrinsic
parameters. Additionally, we provide a new dataset, the CVGL Camera
Calibration Dataset [1], collected using the CARLA Simulator [2]. We show
that our proposed strategy outperforms both conventional methods and deep
learning-based methods on 8 out of 10 parameters evaluated on both real and
synthetic data. Our
code and generated dataset are available at
https://github.com/thanif/Camera-Calibration-through-Camera-Projection-Loss.
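The abstract does not spell out the network or the exact camera model equations, so the following is a minimal PyTorch sketch of the CPL idea as described: a regressor predicts calibration parameters from an image pair, differentiable camera model equations (here, plain stereo triangulation) reconstruct 3D points from those predictions, and the loss is taken on the reconstruction rather than on the raw parameters. All layer sizes, the parameter grouping, and the function names are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the camera projection loss (CPL): supervise the
# parameter regressor through the reconstruction it induces.
import torch
import torch.nn as nn

class CalibNet(nn.Module):
    """Toy backbone regressing calibration parameters from an image pair."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Multi-task in spirit: one output per parameter (fx, fy, cx, cy,
        # baseline, pitch, tx, ty, tz), collapsed into one head for brevity.
        self.head = nn.Linear(64, 9)

    def forward(self, left, right):
        return self.head(self.encoder(torch.cat([left, right], dim=1)))

def reconstruct_points(params, u, v, disparity):
    """Camera model equations as differentiable ops (stereo triangulation)."""
    fx, fy, cx, cy, baseline = (params[:, i:i + 1] for i in range(5))
    z = fx * baseline / disparity   # depth from disparity
    x = (u - cx) * z / fx           # back-project pixel column u
    y = (v - cy) * z / fy           # back-project pixel row v
    return torch.stack([x, y, z], dim=-1)

def camera_projection_loss(pred_params, gt_params, u, v, disparity):
    """Compare 3D reconstructions instead of raw parameter values."""
    return nn.functional.l1_loss(
        reconstruct_points(pred_params, u, v, disparity),
        reconstruct_points(gt_params, u, v, disparity),
    )
```

The design point this illustrates is that errors in each predicted parameter are weighted by their actual geometric effect on the reconstructed scene, rather than treated as independent regression targets.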
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- Camera Calibration through Geometric Constraints from Rotation and Projection Matrices [4.100632594106989]
We propose a novel constraints-based loss for estimating the intrinsic and extrinsic parameters of a camera.
Our methodology is a hybrid approach that employs the learning power of a neural network to estimate the desired parameters.
Our proposed approach demonstrates improvements across all parameters when compared to the state-of-the-art (SOTA) benchmarks.
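The summary leaves the loss unspecified; below is a hedged sketch of one plausible form such a constraints-based objective could take, combining an L1 penalty on the composed projection matrix P = K[R|t] with a geodesic penalty on the rotation. The weighting and the specific constraint terms are assumptions for illustration, not the paper's formulation.

```python
# A plausible (assumed) constraints-based loss over rotation and
# projection matrices; illustrative only.
import torch

def rotation_geodesic(R_pred, R_gt):
    """Angle of the relative rotation R_pred^T R_gt, in radians."""
    trace = (R_pred.transpose(-1, -2) @ R_gt).diagonal(dim1=-2, dim2=-1).sum(-1)
    return torch.acos(((trace - 1.0) / 2.0).clamp(-1.0, 1.0))

def constraints_loss(K, R_pred, t_pred, R_gt, t_gt, w=0.1):
    # Compose full projection matrices P = K [R | t], shape (B, 3, 4).
    P_pred = K @ torch.cat([R_pred, t_pred], dim=-1)
    P_gt = K @ torch.cat([R_gt, t_gt], dim=-1)
    return (torch.nn.functional.l1_loss(P_pred, P_gt)
            + w * rotation_geodesic(R_pred, R_gt).mean())
```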
arXiv Detail & Related papers (2024-02-13T13:07:34Z)
- Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image [85.91935485902708]
We show that the key to a zero-shot single-view metric depth model lies in the combination of large-scale data training and resolving the metric ambiguity from various camera models.
We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problems and can be effortlessly plugged into existing monocular models.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
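The summary only names the canonical camera space transformation; the snippet below is a hedged sketch of the core intuition, that for a fixed image size apparent depth trades off against focal length, so mapping all training data to a single canonical focal length removes the metric ambiguity. F_CANON and the function names are assumed for illustration; the paper's module may transform images or labels differently.

```python
# Hedged sketch of canonicalizing depth across cameras with different
# focal lengths; constants are illustrative assumptions.
import numpy as np

F_CANON = 1000.0  # assumed canonical focal length, in pixels

def to_canonical(depth_m, f_pixels):
    """Scale metric depth labels into the canonical camera space."""
    return depth_m * (F_CANON / f_pixels)

def from_canonical(depth_canon, f_pixels):
    """De-canonicalize a predicted depth map back to metric scale."""
    return depth_canon * (f_pixels / F_CANON)

# Round trip: a 10 m point seen by an f = 500 px camera.
assert np.isclose(from_canonical(to_canonical(10.0, 500.0), 500.0), 10.0)
```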
arXiv Detail & Related papers (2023-07-20T16:14:23Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
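The summary does not describe the model; the sketch below illustrates one common data-driven formulation of probabilistic relative rotation, an energy-based scorer that assigns unnormalized log-scores to sampled candidate rotations and normalizes them into a distribution with a softmax. The feature dimensions, scorer architecture, and sampling scheme are placeholder assumptions.

```python
# Hedged sketch: score candidate relative rotations for an image pair
# and read the scores as a distribution over SO(3).
import torch
import torch.nn as nn

class RotationScorer(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * feat_dim + 9, 256),
                                 nn.ReLU(), nn.Linear(256, 1))

    def forward(self, feat1, feat2, rotations):
        # rotations: (N, 3, 3) candidate relative rotations.
        n = rotations.shape[0]
        pair = torch.cat([feat1, feat2]).expand(n, -1)
        x = torch.cat([pair, rotations.reshape(n, 9)], dim=1)
        return self.mlp(x).squeeze(-1)  # unnormalized log-scores, (N,)

scorer = RotationScorer()
feat1, feat2 = torch.randn(128), torch.randn(128)  # per-image features
cands = torch.linalg.qr(torch.randn(4096, 3, 3)).Q  # random orthogonal 3x3
cands[cands.det() < 0] *= -1.0  # flip reflections so samples lie in SO(3)
probs = torch.softmax(scorer(feat1, feat2, cands), dim=0)
best = cands[probs.argmax()]    # mode of the predicted distribution
```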
arXiv Detail & Related papers (2022-08-11T17:59:59Z)
- Self-Supervised Camera Self-Calibration from Video [34.35533943247917]
We propose a learning algorithm to regress per-sequence calibration parameters using an efficient family of general camera models.
Our procedure achieves self-calibration results with sub-pixel reprojection error, outperforming other learning-based methods.
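The summary states the mechanism at a high level; below is a hedged, self-contained sketch of the underlying mechanic, treating calibration parameters as learnable tensors and minimizing a differentiable reprojection error by gradient descent. The paper self-supervises from raw video with a family of general camera models; the synthetic pinhole correspondences here merely stand in for that supervision signal.

```python
# Hedged sketch: recover pinhole intrinsics by gradient descent on a
# differentiable reprojection error; synthetic data stands in for video.
import torch

torch.manual_seed(0)
pts3d = torch.rand(200, 3) + torch.tensor([0.0, 0.0, 2.0])  # points in front

def project(pts, fx, fy, cx, cy):
    """Differentiable pinhole projection of (N, 3) camera-frame points."""
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    return torch.stack([u, v], dim=1)

target = project(pts3d, 500.0, 500.0, 320.0, 240.0)  # "observed" pixels

# Start from a deliberately wrong guess and optimize the intrinsics.
params = torch.tensor([400.0, 450.0, 300.0, 200.0], requires_grad=True)
opt = torch.optim.Adam([params], lr=1.0)
for _ in range(2000):
    opt.zero_grad()
    loss = (project(pts3d, *params) - target).abs().mean()
    loss.backward()
    opt.step()
print(params.detach())  # approaches [500, 500, 320, 240]
```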
arXiv Detail & Related papers (2021-12-06T19:42:05Z)
- Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion [8.877834897951578]
We propose a generic camera model that has the potential to address various types of distortion.
Our proposed method outperformed conventional methods on two largescale datasets and images captured by off-the-shelf fisheye cameras.
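As a rough illustration of why a single generic model is attractive: classic fisheye designs differ mainly in how they map the incidence angle to image radius, so one parametric curve can cover several of them. The one-parameter polynomial blend below is an assumption for illustration; the paper's generic model is likely parameterized differently.

```python
# Hedged illustration: fixed fisheye projection functions versus one
# generic parametric curve; the generic form here is assumed.
import numpy as np

def fisheye_radius(theta, f, model="equidistant"):
    """Image radius for common fixed fisheye projection functions."""
    return {
        "perspective": f * np.tan(theta),
        "equidistant": f * theta,
        "equisolid": 2 * f * np.sin(theta / 2),
        "orthographic": f * np.sin(theta),
        "stereographic": 2 * f * np.tan(theta / 2),
    }[model]

def generic_radius(theta, f, k):
    """Illustrative one-parameter generic curve: k bends the mapping
    away from the equidistant baseline toward other designs."""
    return f * (theta + k * theta**3)

theta = np.linspace(0, np.pi / 3, 5)
print(fisheye_radius(theta, f=300.0, model="equisolid"))
print(generic_radius(theta, f=300.0, k=-0.05))
```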
arXiv Detail & Related papers (2021-11-25T05:58:23Z)
- Learning Eye-in-Hand Camera Calibration from a Single Image [7.262048441360133]
Eye-in-hand camera calibration is a fundamental and long-studied problem in robotics.
We present a study on using learning-based methods for solving this problem online from a single RGB image.
arXiv Detail & Related papers (2021-11-01T20:17:31Z)
- Camera Calibration through Camera Projection Loss [4.36572039512405]
We propose a novel method to predict intrinsic (focal length and principal point offset) parameters using an image pair.
Unlike existing methods, we propose a new representation that incorporates camera model equations as a neural network in a multi-task learning framework.
Our proposed approach achieves better performance than both deep learning-based and traditional methods on 7 out of 10 evaluated parameters.
arXiv Detail & Related papers (2021-10-07T14:03:10Z)
- Self-Calibrating Neural Radiance Fields [68.64327335620708]
We jointly learn the geometry of the scene and the accurate camera parameters without any calibration objects.
Our camera model consists of a pinhole model, a fourth order radial distortion, and a generic noise model that can learn arbitrary non-linear camera distortions.
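The summarized camera model is concrete enough to sketch: a pinhole projection whose normalized coordinates are scaled by a fourth-order radial distortion polynomial. The snippet below shows that projection; the paper's additional generic noise model is omitted for brevity, and the function name is illustrative.

```python
# Hedged sketch of a pinhole + fourth-order radial distortion projection,
# with all six parameters treatable as learnable tensors.
import torch

def distorted_project(pts, fx, fy, cx, cy, k1, k2):
    """Project (N, 3) camera-frame points with radial distortion."""
    x = pts[:, 0] / pts[:, 2]
    y = pts[:, 1] / pts[:, 2]
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2   # terms up to fourth order in r
    u = fx * (x * scale) + cx
    v = fy * (y * scale) + cy
    return torch.stack([u, v], dim=1)

pts = torch.tensor([[0.5, -0.2, 2.0], [0.1, 0.3, 4.0]])
print(distorted_project(pts, 500.0, 500.0, 320.0, 240.0, k1=-0.2, k2=0.05))
```

Because the projection is differentiable in all six parameters, they can be optimized jointly with the scene geometry from a photometric loss, with no calibration target.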
arXiv Detail & Related papers (2021-08-31T13:34:28Z)
- Infrastructure-based Multi-Camera Calibration using Radial Projections [117.22654577367246]
Pattern-based calibration techniques can be used to calibrate the intrinsics of the cameras individually.
Infrastructure-based calibration techniques are able to estimate the extrinsics using 3D maps pre-built via SLAM or Structure-from-Motion.
We propose to fully calibrate a multi-camera system from scratch using an infrastructure-based approach.
arXiv Detail & Related papers (2020-07-30T09:21:04Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.