Learning Eye-in-Hand Camera Calibration from a Single Image
- URL: http://arxiv.org/abs/2111.01245v2
- Date: Wed, 3 Nov 2021 20:10:18 GMT
- Title: Learning Eye-in-Hand Camera Calibration from a Single Image
- Authors: Eugene Valassakis, Kamil Dreczkowski, Edward Johns
- Abstract summary: Eye-in-hand camera calibration is a fundamental and long-studied problem in robotics.
We present a study on using learning-based methods for solving this problem online from a single RGB image.
- Score: 7.262048441360133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Eye-in-hand camera calibration is a fundamental and long-studied problem in
robotics. We present a study on using learning-based methods for solving this
problem online from a single RGB image, whilst training our models with
entirely synthetic data. We study three main approaches: one direct regression
model that directly predicts the extrinsic matrix from an image, one sparse
correspondence model that regresses 2D keypoints and then uses PnP, and one
dense correspondence model that uses regressed depth and segmentation maps to
enable ICP pose estimation. In our experiments, we benchmark these methods
against each other and against well-established classical methods, to find the
surprising result that direct regression outperforms other approaches, and we
perform noise-sensitivity analysis to gain further insights into these results.
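To make the sparse-correspondence approach above concrete, here is a minimal sketch: a network (not shown) regresses 2D keypoints of the gripper in the image, and PnP recovers the camera-to-gripper extrinsics. The keypoint layout, the known intrinsics, and the use of OpenCV's solvePnP are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
import cv2  # OpenCV, assumed available


def extrinsics_from_keypoints(image_points, model_points, camera_matrix):
    """Recover a 4x4 extrinsic matrix from 2D-3D keypoint correspondences via PnP.

    image_points: (N, 2) pixel coordinates regressed by a keypoint network (hypothetical).
    model_points: (N, 3) keypoint positions in the gripper frame (hypothetical).
    camera_matrix: (3, 3) known camera intrinsics.
    """
    dist_coeffs = np.zeros(5)  # assume an undistorted image
    ok, rvec, tvec = cv2.solvePnP(
        model_points.astype(np.float64),
        image_points.astype(np.float64),
        camera_matrix.astype(np.float64),
        dist_coeffs,
        flags=cv2.SOLVEPNP_EPNP,
    )
    if not ok:
        raise RuntimeError("PnP failed on these correspondences")
    R, _ = cv2.Rodrigues(rvec)       # axis-angle -> rotation matrix
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = tvec.ravel()
    return T                         # pose of the gripper frame in the camera frame
```

The direct regression variant would instead predict the extrinsic matrix with a pose head, and the dense correspondence variant would replace PnP with ICP on regressed depth and segmentation maps, as described in the abstract.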
Related papers
- Personalized 3D Human Pose and Shape Refinement [19.082329060985455]
Regression-based methods have dominated the field of 3D human pose and shape estimation.
We propose to construct dense correspondences between initial human model estimates and the corresponding images.
We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
arXiv Detail & Related papers (2024-03-18T10:13:53Z) - Multi-task Learning for Camera Calibration [3.274290296343038]
We present a unique method for predicting intrinsic (principal point offset and focal length) and extrinsic (baseline, pitch, and translation) properties from a pair of images.
By reconstructing 3D points using a camera-model neural network and then using the reconstruction loss to recover the camera specifications, this camera projection loss (CPL) method allows the desired parameters to be estimated (a rough sketch of the idea appears after this list).
arXiv Detail & Related papers (2022-11-22T17:39:31Z) - Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z) - Poseur: Direct Human Pose Regression with Transformers [119.79232258661995]
We propose a direct, regression-based approach to 2D human pose estimation from single images.
Our framework is end-to-end differentiable, and naturally learns to exploit the dependencies between keypoints.
Ours is the first regression-based approach to perform favorably compared to the best heatmap-based pose estimation methods.
arXiv Detail & Related papers (2022-01-19T04:31:57Z) - Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training.
We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
arXiv Detail & Related papers (2021-12-08T10:05:32Z) - Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning [23.200130129530653]
Existing 3D human pose estimation algorithms trained on distortion-free datasets suffer a performance drop when applied to new scenarios with a specific camera distortion.
We propose a simple yet effective model for 3D human pose estimation in video that can quickly adapt to any distortion environment.
arXiv Detail & Related papers (2021-11-30T01:35:04Z) - Camera Calibration through Camera Projection Loss [4.36572039512405]
We propose a novel method to predict intrinsic (focal length and principal point offset) parameters using an image pair.
Unlike existing methods, we propose a new representation that incorporates camera model equations as a neural network in a multi-task learning framework.
Our proposed approach achieves better performance with respect to both deep learning-based and traditional methods on 7 out of 10 parameters evaluated.
arXiv Detail & Related papers (2021-10-07T14:03:10Z) - Reassessing the Limitations of CNN Methods for Camera Pose Regression [27.86655424544118]
We propose a model that can directly regress the camera pose from images with significantly higher accuracy than existing methods of the same class.
We first analyse why regression methods are still behind the state-of-the-art, and we bridge the performance gap with our new approach.
arXiv Detail & Related papers (2021-08-16T17:55:26Z) - Wide-angle Image Rectification: A Survey [86.36118799330802]
Wide-angle images contain distortions that violate the assumptions underlying pinhole camera models.
Image rectification, which aims to correct these distortions, can solve these problems.
We present a detailed description and discussion of the camera models used in different approaches.
Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z) - Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z) - Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
arXiv Detail & Related papers (2020-07-29T21:41:31Z)
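As referenced in the multi-task calibration entry above, here is a rough, hypothetical sketch of the camera-projection-loss idea: predicted intrinsics and extrinsics are pushed through a differentiable pinhole projection and supervised via the reprojected points rather than the raw parameters. Function names, shapes, and the L1 penalty are assumptions for illustration, not the authors' CPL implementation.

```python
import torch


def camera_projection_loss(points_3d, points_2d_gt, fx, fy, cx, cy, R, t):
    """Reprojection-style loss on predicted pinhole camera parameters.

    points_3d:    (N, 3) reference 3D points in the world frame.
    points_2d_gt: (N, 2) observed pixel coordinates of those points.
    fx, fy, cx, cy: predicted focal lengths and principal point (tensors).
    R: (3, 3) predicted rotation, t: (3,) predicted translation.
    """
    cam = points_3d @ R.T + t                      # world -> camera frame
    u = fx * cam[:, 0] / cam[:, 2] + cx            # perspective projection
    v = fy * cam[:, 1] / cam[:, 2] + cy
    pred_2d = torch.stack([u, v], dim=1)
    return torch.mean(torch.abs(pred_2d - points_2d_gt))  # L1 reprojection error
```

Because the loss is expressed in pixel space, errors in the predicted intrinsics and extrinsics are penalized through their geometric effect rather than parameter by parameter.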
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.