DeepURL: Deep Pose Estimation Framework for Underwater Relative Localization
- URL: http://arxiv.org/abs/2003.05523v4
- Date: Thu, 21 Jan 2021 18:10:20 GMT
- Title: DeepURL: Deep Pose Estimation Framework for Underwater Relative Localization
- Authors: Bharat Joshi, Md Modasshir, Travis Manderson, Hunter Damron, Marios Xanthidis, Alberto Quattrini Li, Ioannis Rekleitis, Gregory Dudek
- Abstract summary: We propose a real-time deep learning approach for determining the 6D relative pose of Autonomous Underwater Vehicles (AUVs) from a single image. An image-to-image translation network is employed to bridge the gap between the rendered and the real images, producing synthetic images for training.
- Score: 21.096166727043077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a real-time deep learning approach for determining the 6D relative pose of Autonomous Underwater Vehicles (AUVs) from a single image. A team of autonomous robots localizing themselves in a communication-constrained underwater environment is essential for many applications such as underwater exploration, mapping, multi-robot convoying, and other multi-robot tasks. Due to the profound difficulty of collecting ground-truth images with accurate 6D poses underwater, this work uses rendered images from the Unreal Game Engine simulation for training. An image-to-image translation network is employed to bridge the gap between the rendered and the real images, producing synthetic images for training. The proposed method predicts the 6D pose of an AUV from a single image as 2D image keypoints representing the 8 corners of the AUV's 3D model, and the 6D pose in camera coordinates is then determined using RANSAC-based PnP. Experimental results in real-world underwater environments (swimming pool and ocean) with different cameras demonstrate that the proposed technique outperforms state-of-the-art methods in both translation and orientation error. The code is publicly available.
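To make the second stage concrete, below is a minimal sketch of recovering a 6D pose from the 8 predicted corner keypoints via RANSAC-based PnP. It uses OpenCV's solvePnPRansac; whether the authors use OpenCV, as well as the box dimensions, camera intrinsics, and the synthesized "predicted" keypoints, are illustrative assumptions, not values from the paper.

```python
import numpy as np
import cv2

# 8 corners of the AUV's 3D bounding box in the object frame (meters);
# the half-extents here are assumed, the real ones come from the AUV's 3D model.
dx, dy, dz = 0.30, 0.20, 0.15
object_points = np.array(
    [(sx * dx, sy * dy, sz * dz)
     for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
    dtype=np.float64,
)

# Assumed pinhole intrinsics of the underwater camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Stand-in for the network output: project the corners under a known pose
# and add noise, as if they were the predicted 2D keypoints.
rvec_true = np.array([0.10, -0.20, 0.05])   # axis-angle rotation
tvec_true = np.array([0.20, -0.10, 3.00])   # translation (meters)
projected, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, None)
image_points = projected.reshape(-1, 2) + np.random.normal(0.0, 1.0, (8, 2))

# RANSAC-based PnP rejects outlier keypoints and returns the 6D pose
# of the AUV in camera coordinates.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, None,
    reprojectionError=8.0, flags=cv2.SOLVEPNP_EPNP)
if ok:
    R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix, object -> camera
    print("translation (m):", tvec.ravel())
    print("inlier count:", 0 if inliers is None else len(inliers))
```

Because only 8 correspondences exist per detection, RANSAC here mainly guards against a few badly localized corners rather than large outlier sets; the recovered rvec/tvec give the AUV's pose relative to the observing camera.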
Related papers
- FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation [65.01601309903971]
  We introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs).
  Our framework relies solely on the 3D model and RGB images, alleviating the need for real pose annotations or other-modality data such as depth.
  We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements over state-of-the-art methods.
  arXiv Detail & Related papers (2024-09-25T03:54:01Z)
- Model-Based Underwater 6D Pose Estimation from RGB [1.9160624126555885]
  We propose an approach that leverages 2D object detection to reliably compute 6D pose estimates in different underwater scenarios.
  All objects and scenes are made available in an open-source dataset that includes annotations for object detection and pose estimation.
  arXiv Detail & Related papers (2023-02-14T04:27:03Z)
- Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning [70.75369367311897]
  3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
  An adversarial generator takes the garment warped by the 3D-aware flow and the image of the target person as inputs to synthesize the photo-realistic try-on result.
  arXiv Detail & Related papers (2022-11-25T12:16:21Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
  We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
  We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme in which a model produces a first guess of the solution.
  Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
  arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic Applications [0.6299766708197883]
  The proposed pipeline can efficiently generate large amounts of photo-realistic RGBD images for the object of interest.
  We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny with the 6D pose estimation algorithm PVN3D.
  The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on the LineMod dataset.
  arXiv Detail & Related papers (2022-08-30T14:17:15Z)
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
  Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
  We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
  arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation [121.02948087956955]
  For some applications, such as those in space or deep under water, acquiring real images, even unannotated ones, is virtually impossible.
  We propose a method that can be trained solely on synthetic images, or optionally using a few additional real images.
  It performs on par with methods that require annotated real images for training when using none, and outperforms them considerably when using as few as twenty real images.
  arXiv Detail & Related papers (2022-03-18T10:20:21Z)
- Where to drive: free space detection with one fisheye camera [1.7499351967216341]
  We propose to use synthetic training data based on Unity3D.
  A five-pass algorithm is used to create a virtual fisheye camera.
  The results indicate that synthetic fisheye images can be used in a deep learning context.
  arXiv Detail & Related papers (2020-11-11T14:36:45Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
  We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
  We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera viewpoints.
  Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
  arXiv Detail & Related papers (2020-04-05T12:52:29Z)
- SilhoNet-Fisheye: Adaptation of A ROI Based Object Pose Estimation Network to Monocular Fisheye Images [15.573003283204958]
  We present a novel framework for adapting an ROI-based 6D object pose estimation method to work on full fisheye images.
  We also contribute a fisheye image dataset, called UWHandles, with 6D object pose and 2D bounding box annotations.
  arXiv Detail & Related papers (2020-02-27T19:57:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.