SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic
Understanding
- URL: http://arxiv.org/abs/2306.05407v2
- Date: Wed, 1 Nov 2023 17:59:40 GMT
- Title: SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic
Understanding
- Authors: Paul-Edouard Sarlin, Eduard Trulls, Marc Pollefeys, Jan Hosang, Simon
Lynen
- Abstract summary: We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images.
We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images.
SNAP can resolve the location of challenging image queries beyond the reach of traditional methods.
- Score: 57.108301842535894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic 2D maps are commonly used by humans and machines for navigation
purposes, whether it's walking or driving. However, these maps have
limitations: they lack detail, often contain inaccuracies, and are difficult to
create and maintain, especially in an automated fashion. Can we use raw imagery
to automatically create better maps that can be easily interpreted by both
humans and machines? We introduce SNAP, a deep network that learns rich neural
2D maps from ground-level and overhead images. We train our model to align
neural maps estimated from different inputs, supervised only with camera poses
over tens of millions of StreetView images. SNAP can resolve the location of
challenging image queries beyond the reach of traditional methods,
outperforming the state of the art in localization by a large margin. Moreover,
our neural maps encode not only geometry and appearance but also high-level
semantics, discovered without explicit supervision. This enables effective
pre-training for data-efficient semantic scene understanding, with the
potential to unlock cost-efficient creation of more detailed maps.
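The pose-supervised alignment objective described above can be sketched as follows. This is a minimal illustration under strong simplifying assumptions, not the authors' implementation: a translation-only pose, neural maps as plain NumPy feature grids, and a simple squared-error loss in place of the paper's training objective. The names `crop_at_pose` and `alignment_loss` are hypothetical.

```python
import numpy as np

def crop_at_pose(aerial_map, x, y, size):
    """Crop a size x size window from the aerial neural map centred at (x, y).

    aerial_map: (H, W, C) feature grid estimated from overhead imagery.
    """
    h = size // 2
    return aerial_map[y - h:y + h, x - h:x + h]

def alignment_loss(ground_map, aerial_map, pose):
    """Pose-supervised alignment: the neural map estimated from ground-level
    images should agree with the aerial neural map at the known camera
    location. Here the pose is translation-only for simplicity; the full
    method also handles orientation and learns both encoders jointly.
    """
    x, y = pose
    patch = crop_at_pose(aerial_map, x, y, ground_map.shape[0])
    return float(np.mean((ground_map - patch) ** 2))
```

With encoders trained this way, the loss is minimised exactly when the two maps agree at the ground-truth pose, which is the self-supervision signal the abstract refers to: no labels beyond camera poses are needed.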
Related papers
- DeepAerialMapper: Deep Learning-based Semi-automatic HD Map Creation for Highly Automated Vehicles [0.0]
We introduce a semi-automatic method for creating HD maps from high-resolution aerial imagery.
Our method involves training neural networks to semantically segment aerial images into classes relevant to HD maps.
Exporting the map to the Lanelet2 format allows easy extension for different use cases.
arXiv Detail & Related papers (2024-10-01T15:05:05Z)
- MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report [6.598847563245353]
We found that most existing algorithms construct Bird's Eye View features from multi-perspective images.
These algorithms perform poorly at the far end of roads and struggle when the primary subject in the image is occluded.
In this competition, we not only used multi-perspective images as input but also incorporated SD maps to address this issue.
arXiv Detail & Related papers (2024-06-14T15:31:45Z)
- Semantic Map-based Generation of Navigation Instructions [9.197756644049862]
We propose a new approach to navigation instruction generation by framing the problem as an image captioning task.
Conventional approaches employ a sequence of panorama images to generate navigation instructions.
We present a benchmark dataset for instruction generation using semantic maps, propose an initial model, and ask human subjects to manually assess the quality of generated instructions.
arXiv Detail & Related papers (2024-03-28T17:27:44Z)
- Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- OrienterNet: Visual Localization in 2D Public Maps with Neural Matching [21.673020132276573]
OrienterNet is the first deep neural network that can localize an image with sub-meter accuracy using the same 2D semantic maps that humans use.
OrienterNet estimates the location and orientation of a query image by matching a neural Bird's-Eye View with open and globally available maps from OpenStreetMap.
To enable this, we introduce a large crowd-sourced dataset of images captured across 12 cities from the diverse viewpoints of cars, bikes, and pedestrians.
arXiv Detail & Related papers (2023-04-04T17:59:03Z)
- Semantic Image Alignment for Vehicle Localization [111.59616433224662]
We present a novel approach to vehicle localization in dense semantic maps using semantic segmentation from a monocular camera.
In contrast to existing visual localization approaches, the system does not require additional keypoint features, handcrafted localization landmark extractors or expensive LiDAR sensors.
arXiv Detail & Related papers (2021-10-08T14:40:15Z)
- Canonical Saliency Maps: Decoding Deep Face Models [47.036036069156104]
We present 'Canonical Saliency Maps', a new method that highlights relevant facial areas by projecting saliency maps onto a canonical face model.
Our results show the usefulness of the proposed canonical saliency maps, which can be used on any deep face model regardless of the architecture.
arXiv Detail & Related papers (2021-05-04T09:42:56Z)
- MP3: A Unified Model to Map, Perceive, Predict and Plan [84.07678019017644]
MP3 is an end-to-end approach to mapless driving where the input is raw sensor data and a high-level command.
We show that our approach is significantly safer, more comfortable, and can follow commands better than the baselines in challenging long-term closed-loop simulations.
arXiv Detail & Related papers (2021-01-18T00:09:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.