Visual and Object Geo-localization: A Comprehensive Survey
- URL: http://arxiv.org/abs/2112.15202v2
- Date: Wed, 11 Oct 2023 18:00:22 GMT
- Title: Visual and Object Geo-localization: A Comprehensive Survey
- Authors: Daniel Wilson, Xiaohan Zhang, Waqas Sultani, Safwan Wshah
- Abstract summary: Geo-localization refers to the process of determining where on earth some 'entity' is located.
This paper provides a comprehensive survey of geo-localization involving images, which involves either determining from where an image has been captured (Image geo-localization) or geo-locating objects within an image (Object geo-localization).
We will provide an in-depth study, including a summary of popular algorithms, a description of proposed datasets, and an analysis of performance results to illustrate the current state of each field.
- Score: 11.120155713865918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The concept of geo-localization refers to the process of determining where on
earth some 'entity' is located, typically using Global Positioning System (GPS)
coordinates. The entity of interest may be an image, sequence of images, a
video, satellite image, or even objects visible within the image. As massive
datasets of GPS tagged media have rapidly become available due to smartphones
and the internet, and deep learning has risen to enhance the performance
capabilities of machine learning models, the fields of visual and object
geo-localization have emerged due to their significant impact on a wide range of
applications such as augmented reality, robotics, self-driving vehicles, road
maintenance, and 3D reconstruction. This paper provides a comprehensive survey
of geo-localization involving images, which involves either determining from
where an image has been captured (Image geo-localization) or geo-locating
objects within an image (Object geo-localization). We will provide an in-depth
study, including a summary of popular algorithms, a description of proposed
datasets, and an analysis of performance results to illustrate the current
state of each field.
Related papers
- Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
Through inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
- G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models [40.69217368870192]
We propose a novel framework for worldwide geolocalization based on Retrieval-Augmented Generation (RAG).
G3 consists of three steps, i.e., Geo-alignment, Geo-diversification, and Geo-verification.
Experiments on two well-established datasets verify the superiority of G3 compared to other state-of-the-art methods.
arXiv Detail & Related papers (2024-05-23T15:37:06Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- G^3: Geolocation via Guidebook Grounding [92.46774241823562]
We study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation.
We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations.
Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy.
arXiv Detail & Related papers (2022-11-28T16:34:40Z)
- A General Purpose Neural Architecture for Geospatial Systems [142.43454584836812]
We present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias.
We envision how such a model may facilitate cooperation between members of the community.
arXiv Detail & Related papers (2022-11-04T09:58:57Z)
- Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method is able to achieve the fine-grained location of a query image, up to pixel size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z)
- Context Aware Object Geotagging [2.4674307340652297]
We propose an approach to improve asset geolocation from street view imagery using Structure from Motion.
The predicted object geolocation is further refined by imposing contextual geographic information extracted from OpenStreetMap.
arXiv Detail & Related papers (2021-08-13T16:16:20Z)