iSimLoc: Visual Global Localization for Previously Unseen Environments
with Simulated Images
- URL: http://arxiv.org/abs/2209.06376v1
- Date: Wed, 14 Sep 2022 02:40:50 GMT
- Title: iSimLoc: Visual Global Localization for Previously Unseen Environments
with Simulated Images
- Authors: Peng Yin, Ivan Cisneros, Ji Zhang, Howie Choset, and Sebastian Scherer
- Abstract summary: This paper presents iSimLoc, a consistent hierarchical global re-localization approach.
Place features of iSimLoc can be utilized to search target images under changing appearances and viewpoints.
We evaluate our method on one dataset with appearance variations and one dataset that focuses on demonstrating large-scale matching over a long flight.
- Score: 21.43167626240771
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual cameras are attractive sensors for beyond visual line of sight
(B-VLOS) drone operation, since they are small in size, weight, power, and cost,
and can provide a redundant sensing modality when GPS fails. However, state-of-the-art
visual localization algorithms are unable to match visual data whose appearance
differs significantly due to changes in illumination or viewpoint. This
paper presents iSimLoc, a condition/viewpoint consistent hierarchical global
re-localization approach. The place features of iSimLoc can be utilized to
search for target images under changing appearances and viewpoints. Additionally,
our hierarchical global re-localization module refines its estimate in a
coarse-to-fine manner, allowing iSimLoc to perform fast and accurate pose estimation. We evaluate
our method on one dataset with appearance variations and one dataset that
focuses on demonstrating large-scale matching over a long flight in complicated
environments. On our two datasets, iSimLoc achieves 88.7% and 83.8%
successful retrieval rates with 1.5 s inference time, compared to 45.8% and
39.7% using the next best method. These results demonstrate robust localization
in a range of environments.
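The coarse-to-fine hierarchical retrieval that the abstract describes can be sketched as a two-stage search: shortlist candidates with cheap coarse descriptors, then re-rank the shortlist with more expensive fine descriptors. This is a minimal illustrative sketch, not the paper's actual pipeline; the descriptor dimensions and shortlist size are assumptions.

```python
import numpy as np

def coarse_to_fine_retrieve(query_coarse, query_fine, db_coarse, db_fine, k=10):
    """Two-stage retrieval: shortlist with cheap coarse descriptors over the
    full database, then re-rank only the shortlist with fine descriptors."""
    # Stage 1: cosine similarity on coarse descriptors over the whole database.
    sims = db_coarse @ query_coarse / (
        np.linalg.norm(db_coarse, axis=1) * np.linalg.norm(query_coarse) + 1e-12)
    shortlist = np.argsort(-sims)[:k]
    # Stage 2: re-rank the k candidates with the higher-dimensional fine descriptors.
    fine_sims = db_fine[shortlist] @ query_fine / (
        np.linalg.norm(db_fine[shortlist], axis=1) * np.linalg.norm(query_fine) + 1e-12)
    return shortlist[np.argsort(-fine_sims)]
```

The design point is the cost split: the coarse pass touches every database entry but uses low-dimensional features, while the fine pass is expensive per comparison but only runs on the k-element shortlist.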
Related papers
- Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
By inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
- AIR-HLoc: Adaptive Retrieved Images Selection for Efficient Visual Localisation [8.789742514363777]
State-of-the-art hierarchical localisation pipelines (HLoc) employ image retrieval (IR) to establish 2D-3D correspondences.
This paper investigates the relationship between global and local descriptors.
We propose an adaptive strategy that adjusts $k$ based on the similarity between the query's global descriptor and those in the database.
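The adaptive strategy summarized above can be sketched as follows: when the query's best global-descriptor similarity is high, a small $k$ suffices; when it is low, more candidates are retrieved. The thresholds and the linear interpolation rule here are illustrative assumptions, not AIR-HLoc's actual policy.

```python
import numpy as np

def adaptive_k(query_desc, db_descs, k_min=3, k_max=20, tau=0.7):
    """Choose how many database images to retrieve from the best cosine
    similarity between the query's global descriptor and the database."""
    sims = db_descs @ query_desc / (
        np.linalg.norm(db_descs, axis=1) * np.linalg.norm(query_desc) + 1e-12)
    best = float(sims.max())
    # Map similarity in [tau, 1] to confidence in [0, 1], clamped outside.
    conf = min(max((best - tau) / (1.0 - tau), 0.0), 1.0)
    # Interpolate k from k_max (low confidence) down to k_min (high confidence).
    k = int(round(k_max - conf * (k_max - k_min)))
    order = np.argsort(-sims)[:k]
    return k, order
```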
arXiv Detail & Related papers (2024-03-27T06:17:21Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
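A CLIP-style alignment between image embeddings and GPS-location embeddings, as the summary describes, is typically trained with a symmetric contrastive (InfoNCE) loss. This sketch shows the standard form of that loss; it is an assumption that GeoCLIP uses exactly this formulation, and the temperature value is illustrative.

```python
import numpy as np

def clip_style_alignment_loss(img_emb, gps_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning image embeddings with embeddings of
    their corresponding GPS coordinates. Rows of the two matrices are paired:
    row i of img_emb matches row i of gps_emb."""
    # L2-normalize both sets of embeddings.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    gps = gps_emb / np.linalg.norm(gps_emb, axis=1, keepdims=True)
    logits = img @ gps.T / temperature          # (N, N) similarity matrix
    labels = np.arange(len(img))                # matching pairs on the diagonal

    def xent(lg):
        # Numerically stable cross-entropy with the diagonal as the target class.
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image->GPS and GPS->image directions.
    return (xent(logits) + xent(logits.T)) / 2
```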
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference [66.45326873274908]
We propose a novel method, LoLep, which regresses Locally-Learned planes from a single RGB image to represent scenes accurately.
Compared to MINE, our approach has an LPIPS reduction of 4.8%-9.0% and an RV reduction of 73.9%-83.5%.
arXiv Detail & Related papers (2023-07-23T03:38:55Z)
- Cross-View Visual Geo-Localization for Outdoor Augmented Reality [11.214903134756888]
We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database.
We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation.
Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance.
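The triplet ranking loss mentioned above, in its standard (unmodified) form, pulls the matching aerial embedding toward the ground-view anchor and pushes a non-matching one at least a margin farther away. This is a sketch of the standard loss, not the paper's modified joint location-and-orientation variant; the margin value is an illustrative assumption.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet ranking loss on embedding vectors: zero once the
    negative is at least `margin` farther from the anchor than the positive."""
    d_pos = np.linalg.norm(anchor - positive)   # anchor-to-match distance
    d_neg = np.linalg.norm(anchor - negative)   # anchor-to-non-match distance
    return max(d_pos - d_neg + margin, 0.0)
```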
arXiv Detail & Related papers (2023-03-28T01:58:03Z)
- 4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions [54.59279160621111]
We present a novel visual SLAM and long-term localization benchmark for autonomous driving in challenging conditions based on the large-scale 4Seasons dataset.
The proposed benchmark provides drastic appearance variations caused by seasonal changes and diverse weather and illumination conditions.
We introduce a new unified benchmark for jointly evaluating visual odometry, global place recognition, and map-based visual localization performance.
arXiv Detail & Related papers (2022-12-31T13:52:36Z)
- Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
arXiv Detail & Related papers (2022-04-10T19:16:58Z)
- CrossLoc: Scalable Aerial Localization Assisted by Multimodal Synthetic Data [2.554905387213586]
We present a visual localization system that learns to estimate camera poses in the real world with the help of synthetic data.
To mitigate the data scarcity issue, we introduce TOPO-DataGen, a versatile synthetic data generation tool.
We also introduce CrossLoc, a cross-modal visual representation learning approach to pose estimation.
arXiv Detail & Related papers (2021-12-16T18:05:48Z)
- Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization [30.203072945001136]
In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.
A novel gradient-weighted similarity activation mapping loss (Grad-SAM) is then incorporated for finer localization with high accuracy.
Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMUSeasons dataset.
Our performance is on par with or even outperforms the state-of-the-art image-based localization baselines in medium or high precision.
arXiv Detail & Related papers (2020-09-16T14:43:22Z)
- Learning Condition Invariant Features for Retrieval-Based Localization from 1M Images [85.81073893916414]
We develop a novel method for learning more accurate and better generalizing localization features.
On the challenging Oxford RobotCar night condition, our method outperforms the well-known triplet loss by 24.4% in localization accuracy within 5m.
arXiv Detail & Related papers (2020-08-27T14:46:22Z)
- Robust Image Retrieval-based Visual Localization using Kapture [10.249293519246478]
We present a versatile pipeline for visual localization that facilitates the use of different local and global features.
We evaluate our methods on eight public datasets, where they rank first on many and near the top on all of them.
To foster future research, we release code, models, and all datasets used in this paper in the kapture format open source under a permissive BSD license.
arXiv Detail & Related papers (2020-07-27T21:10:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.