G^3: Geolocation via Guidebook Grounding
- URL: http://arxiv.org/abs/2211.15521v1
- Date: Mon, 28 Nov 2022 16:34:40 GMT
- Title: G^3: Geolocation via Guidebook Grounding
- Authors: Grace Luo, Giscard Biamby, Trevor Darrell, Daniel Fried, Anna Rohrbach
- Abstract summary: We study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation.
We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations.
Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We demonstrate how language can improve geolocation: the task of predicting
the location where an image was taken. Here we study explicit knowledge from
human-written guidebooks that describe the salient and class-discriminative
visual features humans use for geolocation. We propose the task of Geolocation
via Guidebook Grounding that uses a dataset of StreetView images from a diverse
set of locations and an associated textual guidebook for GeoGuessr, a popular
interactive geolocation game. Our approach predicts a country for each image by
attending over the clues automatically extracted from the guidebook.
Supervising attention with country-level pseudo labels achieves the best
performance. Our approach substantially outperforms a state-of-the-art
image-only geolocation method, with an improvement of over 5% in Top-1
accuracy. Our dataset and code can be found at
https://github.com/g-luo/geolocation_via_guidebook_grounding.
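The abstract's two mechanisms, attending over guidebook clues and supervising that attention with country-level pseudo labels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is at the repository above). Every name here (`attend_over_clues`, `attention_supervision_loss`, the embedding size, the random embeddings) is hypothetical. The sketch also assumes each clue carries exactly one country pseudo label and that the attention loss is the negative log of the attention mass on clues matching the image's country.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_over_clues(image_emb, clue_embs):
    """Scaled dot-product attention of one image embedding over clue embeddings.

    Returns the attention weights (one per clue) and the weighted clue summary.
    """
    scores = clue_embs @ image_emb / np.sqrt(image_emb.shape[0])
    weights = softmax(scores)
    context = weights @ clue_embs
    return weights, context

def attention_supervision_loss(weights, clue_countries, true_country, eps=1e-9):
    """Assumed loss: -log of attention mass on clues pseudo-labeled with the true country."""
    mask = clue_countries == true_country
    if not mask.any():  # no clue mentions this country: nothing to supervise
        return 0.0
    return float(-np.log(weights[mask].sum() + eps))

# Toy data (random stand-ins for learned image and text embeddings).
rng = np.random.default_rng(0)
num_clues, dim, num_countries = 6, 8, 3
clue_embs = rng.normal(size=(num_clues, dim))
clue_countries = np.array([0, 0, 1, 1, 2, 2])  # hypothetical pseudo labels, one per clue
image_emb = rng.normal(size=dim)
classifier = rng.normal(size=(2 * dim, num_countries))  # hypothetical linear head

weights, context = attend_over_clues(image_emb, clue_embs)
# Predict a country from the image embedding concatenated with the clue summary.
country_probs = softmax(np.concatenate([image_emb, context]) @ classifier)
attn_loss = attention_supervision_loss(weights, clue_countries, true_country=1)
```

In training, a loss of this kind would be added to the usual country classification loss, so that the model is rewarded for attending to clues that mention the correct country.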
Related papers
- G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models [40.69217368870192]
We propose a novel framework for worldwide geolocalization based on Retrieval-Augmented Generation (RAG).
G3 consists of three steps, i.e., Geo-alignment, Geo-diversification, and Geo-verification.
Experiments on two well-established datasets verify the superiority of G3 compared to other state-of-the-art methods.
arXiv Detail & Related papers (2024-05-23T15:37:06Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates.
We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
- PIGEON: Predicting Image Geolocations [44.99833362998488]
We present a new geolocalization system that combines semantic geocell creation, multi-task contrastive pretraining, and a novel loss function.
PIGEOTTO is the first image geolocalization model that effectively generalizes to unseen places.
arXiv Detail & Related papers (2023-07-11T23:36:49Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes [53.53712888703834]
We introduce an end-to-end transformer-based architecture that exploits the relationship between different geographic levels.
We achieve state-of-the-art street-level accuracy on 4 standard geo-localization datasets.
arXiv Detail & Related papers (2023-03-07T21:47:58Z)
- Leveraging Selective Prediction for Reliable Image Geolocation [6.453278464902654]
We define the task of image localizability, i.e. suitability of an image for geolocation.
We propose a selective prediction methodology to address the task.
By abstaining from predicting non-localizable images, we improve geolocation accuracy from 27.8% to 70.5% at the city-scale.
arXiv Detail & Related papers (2021-11-23T15:46:12Z)
- Hierarchical Attention Fusion for Geo-Localization [7.544917072241684]
We introduce a hierarchical attention fusion network using multi-scale features for geo-localization.
We extract the hierarchical feature maps from a convolutional neural network (CNN) and organically fuse the extracted features for image representations.
Our training is self-supervised using adaptive weights to control the attention of feature emphasis from each hierarchical level.
arXiv Detail & Related papers (2021-02-18T07:07:03Z)