MGeo: Multi-Modal Geographic Pre-Training Method
- URL: http://arxiv.org/abs/2301.04283v2
- Date: Wed, 24 May 2023 04:20:27 GMT
- Title: MGeo: Multi-Modal Geographic Pre-Training Method
- Authors: Ruixue Ding, Boli Chen, Pengjun Xie, Fei Huang, Xin Li, Qiang Zhang,
Yao Xu
- Abstract summary: We propose a novel query-POI matching method, the Multi-modal Geographic language model (MGeo).
MGeo represents GC as a new modality and is able to fully extract multi-modal correlations for accurate query-POI matching.
Our proposed multi-modal pre-training method can significantly improve the query-POI matching capability of generic PTMs.
- Score: 49.78466122982627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a core task in location-based services (LBS) (e.g., navigation maps),
query and point of interest (POI) matching connects users' intent with
real-world geographic information. Recently, pre-trained models (PTMs) have
made advancements in many natural language processing (NLP) tasks. Generic
text-based PTMs do not have enough geographic knowledge for query-POI matching.
To overcome this limitation, related literature attempts to employ
domain-adaptive pre-training based on geo-related corpus. However, a query
generally contains mentions of multiple geographic objects, such as nearby
roads and regions of interest (ROIs). The geographic context (GC), i.e., these
diverse geographic objects and their relationships, is therefore pivotal to
retrieving the most relevant POI. Single-modal PTMs can barely make use of the
important GC and therefore have limited performance. In this work, we propose a
novel query-POI matching method Multi-modal Geographic language model (MGeo),
which comprises a geographic encoder and a multi-modal interaction module. MGeo
represents GC as a new modality and is able to fully extract multi-modal
correlations for accurate query-POI matching. Besides, there is no publicly
available benchmark for this topic. In order to facilitate further research, we
build a new open-source large-scale benchmark Geographic TExtual Similarity
(GeoTES). The POIs come from an open-source geographic information system
(GIS). The queries are manually generated by annotators to prevent privacy
issues. Compared with several strong baselines, the extensive experiment
results and detailed ablation analyses on GeoTES demonstrate that our proposed
multi-modal pre-training method can significantly improve the query-POI
matching capability of generic PTMs, even when the queries' GC is not provided.
Our code and dataset are publicly available at
https://github.com/PhantomGrapes/MGeo.
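As a toy illustration of why geographic context (GC) can help separate candidates whose text is identical, the sketch below blends a text-overlap score with a GC-overlap score. It is not MGeo's geographic encoder or multi-modal interaction module: the `POI` class, the `nearby` sets, the Jaccard scoring, and the `gc_weight` parameter are all hypothetical stand-ins chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class POI:
    poi_id: str
    name: str
    # Hypothetical stand-in for geographic context: names of nearby roads/ROIs.
    nearby: set[str] = field(default_factory=set)


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(query: str, query_gc: set[str], poi: POI, gc_weight: float = 0.5) -> float:
    """Blend text overlap with geographic-context overlap (illustrative only)."""
    text_sim = jaccard(set(query.lower().split()), set(poi.name.lower().split()))
    gc_sim = jaccard(query_gc, poi.nearby)
    return (1 - gc_weight) * text_sim + gc_weight * gc_sim


def rank(query: str, query_gc: set[str], pois: list[POI], gc_weight: float = 0.5) -> list[POI]:
    """Return candidate POIs sorted from most to least relevant."""
    return sorted(pois, key=lambda p: score(query, query_gc, p, gc_weight), reverse=True)


# Two made-up POIs with the same name but different surroundings.
pois = [
    POI("a", "Sunrise Cafe", {"Lake Road", "Central Park"}),
    POI("b", "Sunrise Cafe", {"Harbor Street", "Old Town"}),
]

ranked = rank("sunrise cafe", {"Harbor Street"}, pois)
print([p.poi_id for p in ranked])
```

With text alone the two candidates tie; once the query's nearby road is taken into account, candidate `b` ranks first. A learned model would replace both hand-written similarity functions with encoders, which this sketch does not attempt.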
Related papers
- Geo-FuB: A Method for Constructing an Operator-Function Knowledge Base for Geospatial Code Generation Tasks Using Large Language Models [0.5242869847419834]
This study introduces a framework to construct such a knowledge base, leveraging geospatial script semantics.
An example knowledge base, Geo-FuB, built from 154,075 Google Earth Engine scripts, is available on GitHub.
arXiv Detail & Related papers (2024-10-28T12:50:27Z)
- Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
By inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates.
We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
- GeoGPT: Understanding and Processing Geospatial Tasks through An Autonomous GPT [6.618846295332767]
Decision-makers in GIS need to combine a series of spatial algorithms and operations to solve geospatial tasks.
We develop a new framework called GeoGPT that can conduct geospatial data collection, processing, and analysis in an autonomous manner.
arXiv Detail & Related papers (2023-07-16T03:03:59Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- Semantically-Enriched Search Engine for Geoportals: A Case Study with ArcGIS Online [7.005838154484841]
We propose a semantically-enriched search engine for geoportals using Lucene-based techniques.
A benchmark dataset is constructed to evaluate the proposed framework.
Our evaluation results show that the proposed semantic query expansion framework is very effective in capturing a user's search intention.
arXiv Detail & Related papers (2020-03-14T06:16:30Z)