CoordiNet: uncertainty-aware pose regressor for reliable vehicle
localization
- URL: http://arxiv.org/abs/2103.10796v1
- Date: Fri, 19 Mar 2021 13:32:40 GMT
- Title: CoordiNet: uncertainty-aware pose regressor for reliable vehicle
localization
- Authors: Arthur Moreau, Nathan Piasco, Dzmitry Tsishkou, Bogdan Stanciulescu,
Arnaud de La Fortelle
- Abstract summary: We investigate visual-based camera localization with neural networks for robotics and autonomous vehicles applications.
Our solution is a CNN-based algorithm which predicts camera pose directly from a single image.
We show that our proposal is a reliable alternative, achieving 29cm median error in a 1.9km loop in a busy urban area.
- Score: 3.4386226615580107
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we investigate visual-based camera localization with neural
networks for robotics and autonomous vehicles applications. Our solution is a
CNN-based algorithm which predicts camera pose (3D translation and 3D rotation)
directly from a single image. It also provides an uncertainty estimate of the
pose. Pose and uncertainty are learned together with a single loss function.
Furthermore, we propose a new fully convolutional architecture, named
CoordiNet, designed to embed some of the scene geometry.
Our framework outperforms comparable methods on the largest available
benchmark, the Oxford RobotCar dataset, with an average error of 8 meters where
previous best was 19 meters. We have also investigated the performance of our
method on large scenes for real time (18 fps) vehicle localization. In this
setup, structure-based methods require a large database, and we show that our
proposal is a reliable alternative, achieving 29cm median error in a 1.9km loop
in a busy urban area.
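The abstract states that pose and uncertainty are learned together with a single loss function. The paper's exact formulation is not given here, but a common way to do this is a heteroscedastic loss in which each pose residual is weighted by a predicted variance, with a log-variance penalty so the network cannot inflate uncertainty arbitrarily. The sketch below is an illustrative assumption, not CoordiNet's actual loss; the function name and per-component log-variance parameterization are hypothetical:

```python
import math

def uncertainty_pose_loss(pred_pose, true_pose, log_sigma2):
    """Heteroscedastic regression loss (illustrative sketch).

    pred_pose, true_pose: 6-vectors (3D translation + 3D rotation).
    log_sigma2: predicted per-component log-variances.

    Each squared residual is down-weighted by exp(-log_sigma2), so the
    network can report high uncertainty on hard frames; the additive
    log_sigma2 term penalizes unbounded uncertainty.
    """
    loss = 0.0
    for p, t, s in zip(pred_pose, true_pose, log_sigma2):
        loss += (p - t) ** 2 * math.exp(-s) + s
    return loss

# A confident, accurate prediction incurs zero loss:
# uncertainty_pose_loss([0.0] * 6, [0.0] * 6, [0.0] * 6) -> 0.0
```

With this form, raising the predicted uncertainty lowers the loss only when the residual is genuinely large, which is what lets a single objective train both the pose head and the uncertainty head.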
Related papers
- CabiNet: Scaling Neural Collision Detection for Object Rearrangement
with Procedural Scene Generation [54.68738348071891]
We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments.
We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture.
CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation.
arXiv Detail & Related papers (2023-04-18T21:09:55Z)
- Collaboration Helps Camera Overtake LiDAR in 3D Detection [49.58433319402405]
Camera-only 3D detection provides a simple solution for localizing objects in 3D space compared to LiDAR-based detection systems.
Our proposed collaborative camera-only 3D detection (CoCa3D) enables agents to share complementary information with each other through communication.
Results show that CoCa3D improves previous SOTA performances by 44.21% on DAIR-V2X, 30.60% on OPV2V+, 12.59% on CoPerception-UAVs+ for AP@70.
arXiv Detail & Related papers (2023-03-23T03:50:41Z)
- Fast and Lightweight Scene Regressor for Camera Relocalization [1.6708069984516967]
Estimating the camera pose directly with respect to pre-built 3D models can be prohibitively expensive for several applications.
This study proposes a simple scene regression method that requires only a multi-layer perceptron network for mapping scene coordinates.
The proposed approach uses sparse descriptors to regress the scene coordinates, instead of a dense RGB image.
arXiv Detail & Related papers (2022-12-04T14:41:20Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction [37.81077373162092]
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision.
We present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses.
arXiv Detail & Related papers (2022-05-16T15:39:27Z)
- Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
arXiv Detail & Related papers (2022-04-10T19:16:58Z)
- Monocular Camera Localization for Automated Vehicles Using Image Retrieval [8.594652891734288]
We address the problem of finding the current position and heading angle of an autonomous vehicle in real-time using a single camera.
Compared to methods which require LiDARs and high definition (HD) 3D maps in real-time, the proposed approach is easily scalable and computationally efficient.
arXiv Detail & Related papers (2021-09-13T20:12:42Z)
- Robust 2D/3D Vehicle Parsing in CVIS [54.825777404511605]
We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS).
Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters.
In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation.
arXiv Detail & Related papers (2021-03-11T03:35:05Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.