3D-Aware Ellipse Prediction for Object-Based Camera Pose Estimation
- URL: http://arxiv.org/abs/2105.11494v1
- Date: Mon, 24 May 2021 18:40:18 GMT
- Title: 3D-Aware Ellipse Prediction for Object-Based Camera Pose Estimation
- Authors: Matthieu Zins, Gilles Simon, Marie-Odile Berger
- Abstract summary: We propose a method for coarse camera pose computation which is robust to viewing conditions.
It exploits the ability of deep learning techniques to reliably detect objects regardless of viewing conditions.
- Score: 3.103806775802078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a method for coarse camera pose computation which
is robust to viewing conditions and does not require a detailed model of the
scene. This method meets the growing need for easy deployment of robotics or
augmented reality applications in any environment, especially those for which
neither an accurate 3D model nor a large amount of ground-truth data is
available. It exploits the ability of deep learning techniques to reliably
detect objects regardless of viewing conditions. Previous works have also shown
that abstracting the geometry of a scene of objects as an ellipsoid cloud
allows the camera pose to be computed accurately enough for various application
needs. Though promising, these approaches use ellipses fitted to the detection
bounding boxes as approximations of the imaged objects. In this paper, we go
one step further and propose a learning-based method which detects improved
elliptic approximations of objects that are coherent with the 3D ellipsoids
under perspective projection. Experiments show that our method significantly
increases the accuracy of the computed pose and makes it more robust to the
variability of the detection-box boundaries. This is achieved with very little
effort in terms of training data acquisition -- a few hundred calibrated images
of which only three need manual object annotation. Code and models are released
at https://github.com/zinsmatt/3D-Aware-Ellipses-for-Visual-Localization.
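The key geometric relation behind this family of approaches is the classical projection of a quadric by a pinhole camera: an ellipsoid written in dual form as a 4x4 matrix Q* projects through a 3x4 camera matrix P to the dual conic C* = P Q* P^T, i.e. an ellipse in the image. The sketch below illustrates this relation with NumPy; the helper names, the example intrinsics, and the axis-aligned ellipsoid are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def ellipsoid_dual_quadric(center, axes):
    """Dual quadric Q* of an axis-aligned ellipsoid (rotation omitted for brevity)."""
    Q_star = np.diag([axes[0]**2, axes[1]**2, axes[2]**2, -1.0])
    T = np.eye(4)
    T[:3, 3] = center               # dual quadrics transform as Q* -> H Q* H^T
    return T @ Q_star @ T.T

def project_to_dual_conic(P, Q_star):
    """Project a dual quadric to a dual conic: C* = P Q* P^T."""
    return P @ Q_star @ P.T

# Illustrative pinhole camera (assumed intrinsics) viewing an object 4 m away.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [4.0]])])
P = K @ Rt

Q_star = ellipsoid_dual_quadric(center=[0.0, 0.0, 0.0], axes=[0.5, 0.3, 0.2])
C_star = project_to_dual_conic(P, Q_star)

# The ellipse center can be read off the dual conic directly.
u, v = C_star[0, 2] / C_star[2, 2], C_star[1, 2] / C_star[2, 2]
print(f"projected ellipse center: ({u:.1f}, {v:.1f})")  # (320.0, 240.0) here
```

With the ellipsoid placed on the optical axis, the recovered ellipse center coincides with the principal point, which serves as a quick sanity check.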
Related papers
- LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
arXiv Detail & Related papers (2023-11-14T14:27:53Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z)
- 6D Object Pose Estimation from Approximate 3D Models for Orbital Robotics [19.64111218032901]
We present a novel technique to estimate the 6D pose of objects from single images.
We employ a dense 2D-to-3D correspondence predictor that regresses 3D model coordinates for every pixel; the pose is then recovered from these correspondences with a PnP solver (see the sketch after this list).
Our method achieves state-of-the-art performance on the SPEED+ dataset and has won the SPEC2021 post-mortem competition.
arXiv Detail & Related papers (2023-03-23T13:18:05Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects at test time.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- Object-Based Visual Camera Pose Estimation From Ellipsoidal Model and 3D-Aware Ellipse Prediction [2.016317500787292]
We propose a method for initial camera pose estimation from just a single image.
It exploits the ability of deep learning techniques to reliably detect objects regardless of viewing conditions.
Experiments prove that the accuracy of the computed pose significantly increases thanks to our method.
arXiv Detail & Related papers (2022-03-09T10:00:52Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods, and is robust in different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- 3D Object Detection and Pose Estimation of Unseen Objects in Color Images with Local Surface Embeddings [35.769234123059086]
We present an approach for detecting and estimating the 3D poses of objects in images that requires only an untextured CAD model.
Our approach combines Deep Learning and 3D geometry: It relies on an embedding of local 3D geometry to match the CAD models to the input images.
We show that we can use Mask-RCNN in a class-agnostic way to detect the new objects without retraining and thus drastically limit the number of possible correspondences.
arXiv Detail & Related papers (2020-10-08T15:57:06Z)
- Shape and Viewpoint without Keypoints [63.26977130704171]
We present a learning framework that learns to recover the 3D shape, pose and texture from a single image.
We train on an image collection without any ground-truth 3D shape, multi-view, camera viewpoint, or keypoint supervision.
We obtain state-of-the-art camera prediction results and show that we can learn to predict diverse shapes and textures across objects.
arXiv Detail & Related papers (2020-07-21T17:58:28Z)
- Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
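As a complement to the dense-correspondence entry above, here is a minimal sketch of the generic last step it relies on: recovering the 6D pose from 2D-3D matches with OpenCV's RANSAC-based PnP solver. The intrinsics, the ground-truth pose, and the correspondences are synthetic stand-ins used only to make the example self-contained; none of this is a cited paper's code.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical intrinsics and a ground-truth pose, used only to synthesize data.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
rvec_gt = np.array([0.1, -0.2, 0.05])   # axis-angle rotation
tvec_gt = np.array([0.0, 0.0, 3.0])

# Stand-in for a network's per-pixel 3D model coordinates: random object points
# projected into the image with mild pixel noise.
pts_3d = rng.uniform(-0.5, 0.5, size=(200, 3))
pts_2d, _ = cv2.projectPoints(pts_3d, rvec_gt, tvec_gt, K, None)
pts_2d = pts_2d.reshape(-1, 2) + rng.normal(0.0, 0.5, size=(200, 2))

# RANSAC PnP: robustly fit the pose to the noisy correspondences.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts_3d.astype(np.float32), pts_2d.astype(np.float32), K, None,
    reprojectionError=3.0)
print(ok, rvec.ravel(), tvec.ravel())   # should be close to rvec_gt / tvec_gt
```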
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.