Related papers: Markov Localisation using Heatmap Regression and Deep Convolutional Odometry

URL: http://arxiv.org/abs/2106.00371v1
Date: Tue, 1 Jun 2021 10:28:49 GMT
Title: Markov Localisation using Heatmap Regression and Deep Convolutional Odometry
Authors: Oscar Mendez, Simon Hadfield, Richard Bowden
Abstract summary: We present a novel CNN-based localisation approach that can leverage modern deep learning hardware. We create a hybrid CNN that can perform image-based localisation and odometry-based likelihood propagation within a single neural network.
Score: 59.33322623437816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the context of self-driving vehicles there is strong competition between approaches based on visual localisation and LiDAR. While LiDAR provides important depth information, it is sparse in resolution and expensive. On the other hand, cameras are low-cost and recent developments in deep learning mean they can provide high localisation performance. However, several fundamental problems remain, particularly in the domain of uncertainty, where learning-based approaches can be notoriously over-confident. Markov, or grid-based, localisation was an early solution to the localisation problem but fell out of favour due to its computational complexity. Representing the likelihood field as a grid (or volume) means there is a trade-off between accuracy and memory size. Furthermore, it is necessary to perform expensive convolutions across the entire likelihood volume. Despite the benefit of simultaneously maintaining a likelihood for all possible locations, grid-based approaches were superseded by more efficient particle filters and Monte Carlo Localisation (MCL). However, MCL introduces its own problems, e.g. particle deprivation. Recent advances in deep learning hardware allow large likelihood volumes to be stored directly on the GPU, along with the hardware necessary to efficiently perform GPU-bound 3D convolutions, and this obviates many of the disadvantages of grid-based methods. In this work, we present a novel CNN-based localisation approach that can leverage modern deep learning hardware. By implementing a grid-based Markov localisation approach directly on the GPU, we create a hybrid CNN that can perform image-based localisation and odometry-based likelihood propagation within a single neural network. The resulting approach is capable of outperforming direct pose regression methods as well as state-of-the-art localisation systems.
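
Illustrative sketch (not the authors' implementation): the abstract describes grid-based Markov localisation as maintaining a likelihood for every location and propagating it with convolutions. The NumPy example below shows that generic predict/update cycle on a small 2D grid. The grid size, the observation heatmap, the motion kernel, the wrap-around boundary and the function names `predict` and `update` are all assumptions made for illustration; the paper itself works with likelihood volumes and 3D convolutions on the GPU inside a CNN.

```python
import numpy as np


def predict(belief, kernel):
    """Motion update: shift-and-sum convolution of the belief with a motion kernel.

    Kernel entry (i, j) is the assumed probability of moving by
    (i - kh // 2, j - kw // 2) cells. Boundaries wrap around (an assumption).
    """
    out = np.zeros_like(belief)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            shift = (i - kh // 2, j - kw // 2)
            out += kernel[i, j] * np.roll(belief, shift, axis=(0, 1))
    return out


def update(belief, likelihood):
    """Measurement update: multiply by the observation likelihood, then renormalise."""
    posterior = belief * likelihood
    return posterior / posterior.sum()


# Uniform prior over a hypothetical 5x5 grid of candidate positions.
belief = np.full((5, 5), 1.0 / 25)

# Hypothetical observation heatmap peaked at cell (2, 1).
likelihood = np.full((5, 5), 0.1)
likelihood[2, 1] = 1.0
belief = update(belief, likelihood)

# Hypothetical odometry: move one cell along the column axis with
# probability 0.8, otherwise stay in place.
kernel = np.zeros((3, 3))
kernel[1, 2] = 0.8
kernel[1, 1] = 0.2
belief = predict(belief, kernel)

# Most likely cell after one update and one prediction step.
peak = np.unravel_index(np.argmax(belief), belief.shape)
```

Because the kernel sums to one, the prediction step preserves the total probability mass, and the whole grid is updated at every step. That dense update is the cost the abstract attributes to grid-based methods.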

Related papers

PixelCAM: Pixel Class Activation Mapping for Histology Image Classification and ROI Localization [7.869923456842283]
Weakly supervised object localization (WSOL) methods allow training models to classify images and localize ROIs. Standard WSOL methods rely on class activation mapping (CAM) methods to produce spatial localization maps according to a single- or two-step strategy. We propose PixelCAM, a cost-effective foreground/background pixel-wise classifier in the pixel-feature space that allows for spatial object localization.
arXiv Detail & Related papers (2025-03-31T14:18:01Z)
BEVDiffLoc: End-to-End LiDAR Global Localization in BEV View based on Diffusion Model [8.720833232645155]
Bird's-Eye-View (BEV) image is one of the most widely adopted data representations in autonomous driving. We propose BEVDiffLoc, a novel framework that formulates LiDAR localization as a conditional generation of poses.
arXiv Detail & Related papers (2025-03-14T13:17:43Z)
TeD-Loc: Text Distillation for Weakly Supervised Object Localization [13.412674368913747]
TeD-Loc is an approach that distills knowledge from CLIP text embeddings into the model backbone and produces patch-level localization. It improves Top-1 LOC accuracy over state-of-the-art models by about 5% on both CUB and ILSVRC datasets.
arXiv Detail & Related papers (2025-01-22T04:36:17Z)
MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps [8.373285397029884]
Traditional localization approaches rely on high-definition (HD) maps, which consist of precisely annotated landmarks. We propose a novel transformer-based neural re-localization method, inspired by image registration. Our method significantly outperforms the current state-of-the-art OrienterNet on both the nuScenes and Argoverse datasets.
arXiv Detail & Related papers (2024-07-11T14:51:18Z)
GLACE: Global Local Accelerated Coordinate Encoding [66.87005863868181]
Scene coordinate regression methods are effective in small-scale scenes but face significant challenges in large-scale scenes. We propose GLACE, which integrates pre-trained global and local encodings and enables SCR to scale to large scenes with only a single small-sized network. Our method achieves state-of-the-art results on large-scale scenes with a low-map-size model.
arXiv Detail & Related papers (2024-06-06T17:59:50Z)
Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network. It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification. Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders [52.66195794216989]
We propose Point Feature Enhancement Masked Autoencoders (Point-FEMAE) to learn compact 3D representations. Point-FEMAE consists of a global branch and a local branch to capture latent semantic features. Our method significantly improves the pre-training efficiency compared to cross-modal alternatives.
arXiv Detail & Related papers (2023-12-17T14:17:05Z)
Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration [20.322494442959762]
Weakly Supervised Object Localization (WSOL) has attracted much attention because of its low annotation cost in real applications. We introduce a simple yet effective Spatial Calibration Module (SCM) for accurate WSOL, incorporating semantic similarities of patch tokens and their spatial relationships into a unified diffusion model. SCM is designed as an external module of Transformer, and can be removed during inference to reduce the computation cost.
arXiv Detail & Related papers (2022-07-21T12:37:15Z)
Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
Focus on Local: Detecting Lane Marker from Bottom Up via Key Point [10.617793053931964]
We propose a novel lane marker detection solution, FOLOLane, that focuses on modeling local patterns and achieving prediction of global structures. Specifically, the CNN models low-complexity local patterns with two separate heads: the first predicts the existence of key points, and the second refines the location of key points in the local range and correlates key points of the same lane line.
arXiv Detail & Related papers (2021-05-28T08:59:14Z)
Zero-Shot Multi-View Indoor Localization via Graph Location Networks [66.05980368549928]
Indoor localization is a fundamental problem in location-based applications. We propose a novel neural-network-based architecture, Graph Location Networks (GLN), to perform infrastructure-free, multi-view image-based indoor localization. GLN makes location predictions based on robust location representations extracted from images through message-passing networks. We introduce a novel zero-shot indoor localization setting and tackle it by extending the proposed GLN to a dedicated zero-shot version.
arXiv Detail & Related papers (2020-08-06T07:36:55Z)
CMRNet++: Map and Camera Agnostic Monocular Visual Localization in LiDAR Maps [10.578312278413199]
CMRNet++ is a more robust model that generalizes to new places effectively and is also independent of the camera parameters. We demonstrate the ability of a deep learning approach to accurately localize without any retraining or fine-tuning in a completely new environment.
arXiv Detail & Related papers (2020-04-20T10:10:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.