DeepURL: Deep Pose Estimation Framework for Underwater Relative Localization
- URL: http://arxiv.org/abs/2003.05523v4
- Date: Thu, 21 Jan 2021 18:10:20 GMT
- Title: DeepURL: Deep Pose Estimation Framework for Underwater Relative Localization
- Authors: Bharat Joshi, Md Modasshir, Travis Manderson, Hunter Damron, Marios Xanthidis, Alberto Quattrini Li, Ioannis Rekleitis, Gregory Dudek
- Abstract summary: We propose a real-time deep learning approach for determining the 6D relative pose of Autonomous Underwater Vehicles (AUVs) from a single image. An image-to-image translation network is employed to bridge the gap between the rendered and the real images, producing synthetic images for training.
- Score: 21.096166727043077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a real-time deep learning approach for determining the 6D relative pose of Autonomous Underwater Vehicles (AUVs) from a single image. A team of autonomous robots localizing themselves in a communication-constrained underwater environment is essential for many applications such as underwater exploration, mapping, multi-robot convoying, and other multi-robot tasks. Due to the profound difficulty of collecting ground-truth images with accurate 6D poses underwater, this work uses rendered images from the Unreal Game Engine simulation for training. An image-to-image translation network is employed to bridge the gap between the rendered and the real images, producing synthetic images for training. The proposed method predicts the 6D pose of an AUV from a single image as 2D image keypoints representing the 8 corners of the AUV's 3D model, and the 6D pose in camera coordinates is then determined using RANSAC-based PnP. Experimental results in real-world underwater environments (swimming pool and ocean) with different cameras demonstrate that the proposed technique outperforms state-of-the-art methods in both translation and orientation error. The code is publicly available.
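To make the second stage concrete, below is a minimal sketch of recovering a 6D pose from the 8 predicted corner keypoints via RANSAC-based PnP. It uses OpenCV's solvePnPRansac; whether the authors use OpenCV, as well as the box dimensions, camera intrinsics, and the synthesized "predicted" keypoints, are illustrative assumptions, not values from the paper.

```python
import numpy as np
import cv2

# 8 corners of the AUV's 3D bounding box in the object frame (meters);
# the half-extents here are assumed, the real ones come from the AUV's 3D model.
dx, dy, dz = 0.30, 0.20, 0.15
object_points = np.array(
    [(sx * dx, sy * dy, sz * dz)
     for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
    dtype=np.float64,
)

# Assumed pinhole intrinsics of the underwater camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Stand-in for the network output: project the corners under a known pose
# and add noise, as if they were the predicted 2D keypoints.
rvec_true = np.array([0.10, -0.20, 0.05])   # axis-angle rotation
tvec_true = np.array([0.20, -0.10, 3.00])   # translation (meters)
projected, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, None)
image_points = projected.reshape(-1, 2) + np.random.normal(0.0, 1.0, (8, 2))

# RANSAC-based PnP rejects outlier keypoints and returns the 6D pose
# of the AUV in camera coordinates.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, None,
    reprojectionError=8.0, flags=cv2.SOLVEPNP_EPNP)
if ok:
    R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix, object -> camera
    print("translation (m):", tvec.ravel())
    print("inlier count:", 0 if inliers is None else len(inliers))
```

Because only 8 correspondences exist per detection, RANSAC here mainly guards against a few badly localized corners rather than large outlier sets; the recovered rvec/tvec give the AUV's pose relative to the observing camera.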
Related papers
- FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation [65.01601309903971]
  We introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs).
  Our framework relies solely on the 3D model and RGB images, alleviating the need for real pose annotations or other-modality data such as depth.
  We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements over state-of-the-art methods.
  arXiv Detail & Related papers (2024-09-25T03:54:01Z)
- Model-Based Underwater 6D Pose Estimation from RGB [1.9160624126555885]
  We propose an approach that leverages 2D object detection to reliably compute 6D pose estimates in different underwater scenarios.
  All objects and scenes are made available in an open-source dataset that includes annotations for object detection and pose estimation.
  arXiv Detail & Related papers (2023-02-14T04:27:03Z)
- Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning [70.75369367311897]
  3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
  An adversarial generator takes the garment warped by the 3D-aware flow and the image of the target person as inputs to synthesize the photo-realistic try-on result.
  arXiv Detail & Related papers (2022-11-25T12:16:21Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
  We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
  We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme in which a model produces a first guess of the solution.
  Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
  arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic Applications [0.6299766708197883]
  The proposed pipeline can efficiently generate large amounts of photo-realistic RGBD images for the object of interest.
  We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny with the 6D pose estimation algorithm PVN3D.
  The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on the LineMod dataset.
  arXiv Detail & Related papers (2022-08-30T14:17:15Z)
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
  Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
  We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
  arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation [121.02948087956955]
  For some applications, such as those in space or deep under water, acquiring real images, even unannotated ones, is virtually impossible.
  We propose a method that can be trained solely on synthetic images, or optionally using a few additional real images.
  It performs on par with methods that require annotated real images for training when using none, and outperforms them considerably when using as few as twenty real images.
  arXiv Detail & Related papers (2022-03-18T10:20:21Z)
- Where to drive: free space detection with one fisheye camera [1.7499351967216341]
  We propose to use synthetic training data based on Unity3D.
  A five-pass algorithm is used to create a virtual fisheye camera.
  The results indicate that synthetic fisheye images can be used in a deep learning context.
  arXiv Detail & Related papers (2020-11-11T14:36:45Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
  We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
  We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera viewpoints.
  Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
  arXiv Detail & Related papers (2020-04-05T12:52:29Z)
- SilhoNet-Fisheye: Adaptation of A ROI Based Object Pose Estimation Network to Monocular Fisheye Images [15.573003283204958]
  We present a novel framework for adapting an ROI-based 6D object pose estimation method to work on full fisheye images.
  We also contribute a fisheye image dataset, called UWHandles, with 6D object pose and 2D bounding box annotations.
  arXiv Detail & Related papers (2020-02-27T19:57:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.