SilhoNet-Fisheye: Adaptation of A ROI Based Object Pose Estimation
Network to Monocular Fisheye Images
- URL: http://arxiv.org/abs/2002.12415v1
- Date: Thu, 27 Feb 2020 19:57:33 GMT
- Title: SilhoNet-Fisheye: Adaptation of A ROI Based Object Pose Estimation
Network to Monocular Fisheye Images
- Authors: Gideon Billings, Matthew Johnson-Roberson
- Abstract summary: We present a novel framework for adapting an ROI-based 6D object pose estimation method to work on full fisheye images.
We also contribute a fisheye image dataset, called UWHandles, with 6D object pose and 2D bounding box annotations.
- Score: 15.573003283204958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been much recent interest in deep learning methods for monocular
image based object pose estimation. While object pose estimation is an
important problem for autonomous robot interaction with the physical world, and
the application space for monocular-based methods is expansive, there has been
little work on applying these methods with fisheye imaging systems. Also,
little exists in the way of annotated fisheye image datasets on which these
methods can be developed and tested. The research landscape is even more sparse
for object detection methods applied in the underwater domain, fisheye image
based or otherwise. In this work, we present a novel framework for adapting an
ROI-based 6D object pose estimation method to work on full fisheye images. The
method incorporates the gnomonic projection of regions of interest from an
intermediate spherical image representation to correct for the fisheye
distortions. Further, we contribute a fisheye image dataset, called UWHandles,
collected in natural underwater environments, with 6D object pose and 2D
bounding box annotations.
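The gnomonic projection at the heart of the ROI correction maps points on a sphere onto a plane tangent at the ROI center, so that straight lines in the scene stay straight in the extracted crop. A minimal sketch of the forward projection is shown below; this is an illustration of the standard gnomonic formulas, not the authors' implementation, and the tangent point (lat0, lon0) standing in for an ROI center is an assumption.

```python
import math

def gnomonic(lat, lon, lat0, lon0):
    """Project a point (lat, lon) on the unit sphere onto the plane
    tangent at (lat0, lon0) using the gnomonic projection.
    Angles are in radians; (lat0, lon0) plays the role of an ROI center."""
    cos_dlon = math.cos(lon - lon0)
    # Cosine of the angular distance from the tangent point.
    c = math.sin(lat0) * math.sin(lat) + math.cos(lat0) * math.cos(lat) * cos_dlon
    x = math.cos(lat) * math.sin(lon - lon0) / c
    y = (math.cos(lat0) * math.sin(lat)
         - math.sin(lat0) * math.cos(lat) * cos_dlon) / c
    return x, y

# The tangent point itself maps to the plane origin.
print(gnomonic(0.3, 0.5, 0.3, 0.5))  # → (0.0, 0.0)
```

Sampling a planar ROI from a spherical image amounts to evaluating the inverse of this mapping over the crop's pixel grid.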
Related papers
- FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera [8.502741852406904]
We present FisheyeDepth, a self-supervised depth estimation model tailored for fisheye cameras.
We incorporate a fisheye camera model into the projection and reprojection stages during training to handle image distortions.
We also incorporate real-scale pose information into the geometric projection between consecutive frames, replacing the poses estimated by the conventional pose network.
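A generic fisheye projection of the kind referenced above can be sketched with the common equidistant mapping r = f·θ. This is an illustrative assumption; the summary does not state which fisheye model FisheyeDepth actually uses, and the focal length and principal point below are made-up values.

```python
import math

def fisheye_project(X, Y, Z, f, cx, cy):
    """Project a 3D camera-frame point with an equidistant fisheye model.
    The pixel radius grows linearly with the angle from the optical axis."""
    theta = math.atan2(math.hypot(X, Y), Z)  # angle from the optical axis
    phi = math.atan2(Y, X)                   # azimuth around the axis
    r = f * theta                            # equidistant mapping r = f * theta
    return cx + r * math.cos(phi), cy + r * math.sin(phi)

# A point on the optical axis projects to the principal point.
print(fisheye_project(0.0, 0.0, 1.0, 300.0, 320.0, 240.0))  # → (320.0, 240.0)
```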
arXiv Detail & Related papers (2024-09-23T14:31:42Z)
- RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation [88.54817424560056]
We propose a distortion vector map (DVM) that measures the degree and direction of local distortion.
By learning the DVM, the model can independently identify local distortions at each pixel without relying on global distortion patterns.
In the pre-training stage, it predicts the distortion vector map and perceives the local distortion features of each pixel.
In the fine-tuning stage, it predicts a pixel-wise flow map for deviated fisheye image rectification.
arXiv Detail & Related papers (2024-06-27T06:38:56Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
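The projective relation underlying geometry-guided depth can be stated in one line: under a pinhole model, an object of physical height H at depth z spans h = f·H/z pixels, so depth can be recovered as z = f·H/h. The sketch below illustrates only this textbook relation, not the paper's network; the numbers are made up.

```python
def depth_from_height(f_pixels, object_height_m, bbox_height_px):
    """Recover depth from a pinhole projection: h = f * H / z  =>  z = f * H / h.
    f_pixels: focal length in pixels; object_height_m: physical height in meters;
    bbox_height_px: observed 2D bounding-box height in pixels."""
    return f_pixels * object_height_m / bbox_height_px

# Illustrative values: f = 700 px, a 1.5 m object spanning 70 px.
print(depth_from_height(700.0, 1.5, 70.0))  # → 15.0 (meters)
```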
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- DONet: Learning Category-Level 6D Object Pose and Size Estimation from Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z)
- FisheyeSuperPoint: Keypoint Detection and Description Network for Fisheye Images [2.187613144178315]
Keypoint detection and description is a commonly used building block in computer vision systems.
SuperPoint is a self-supervised keypoint detector and descriptor that has achieved state-of-the-art results on homography estimation.
We introduce a fisheye adaptation pipeline to enable training on undistorted fisheye images.
arXiv Detail & Related papers (2021-02-27T11:26:34Z)
- Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications [57.87136703404356]
Dense Object Nets (DONs) by Florence, Manuelli and Tedrake introduced dense object descriptors as a novel visual object representation for the robotics community.
In this paper we show that given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs.
We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves the pick-and-place performance in industry-relevant tasks.
arXiv Detail & Related papers (2021-02-16T11:40:12Z)
- Neural Ray Surfaces for Self-Supervised Learning of Depth and Ego-motion [51.19260542887099]
We show that self-supervision can be used to learn accurate depth and ego-motion estimation without prior knowledge of the camera model.
Inspired by the geometric model of Grossberg and Nayar, we introduce Neural Ray Surfaces (NRS), convolutional networks that represent pixel-wise projection rays.
We demonstrate the use of NRS for self-supervised learning of visual odometry and depth estimation from raw videos obtained using a wide variety of camera systems.
arXiv Detail & Related papers (2020-08-15T02:29:13Z)
- SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving [37.50089104051591]
State-of-the-art self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity.
This paper introduces a novel multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhole camera images.
arXiv Detail & Related papers (2020-08-10T10:52:47Z)
- DeepURL: Deep Pose Estimation Framework for Underwater Relative Localization [21.096166727043077]
We propose a real-time deep learning approach for determining the 6D relative pose of Autonomous Underwater Vehicles (AUV) from a single image.
An image-to-image translation network is employed to bridge the gap between the rendered and the real images, producing synthetic images for training.
arXiv Detail & Related papers (2020-03-11T21:11:05Z)
- 3D Object Detection from a Single Fisheye Image Without a Single Fisheye Training Image [7.86363825307044]
We show how to use existing monocular 3D object detection models, trained only on rectilinear images, to detect 3D objects in images from fisheye cameras.
We outperform the only existing method for monocular 3D object detection in panoramas on a benchmark of synthetic data.
arXiv Detail & Related papers (2020-03-08T11:03:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.