SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity
Representation
- URL: http://arxiv.org/abs/2210.12213v1
- Date: Fri, 21 Oct 2022 19:42:32 GMT
- Title: SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity
Representation
- Authors: Zekun Li, Jina Kim, Yao-Yi Chiang, Muhao Chen
- Abstract summary: SpaBERT provides a general-purpose geo-entity representation based on neighboring entities in geospatial data.
SpaBERT is pretrained with masked language modeling and masked entity prediction tasks.
We apply SpaBERT to two downstream tasks: geo-entity typing and geo-entity linking.
- Score: 25.52363878314735
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Named geographic entities (geo-entities for short) are the building blocks of
many geographic datasets. Characterizing geo-entities is integral to various
application domains, such as geo-intelligence and map comprehension, while a
key challenge is to capture the spatial-varying context of an entity. We
hypothesize that we shall know the characteristics of a geo-entity by its
surrounding entities, similar to knowing word meanings by their linguistic
context. Accordingly, we propose a novel spatial language model, SpaBERT, which
provides a general-purpose geo-entity representation based on neighboring
entities in geospatial data. SpaBERT extends BERT to capture linearized spatial
context, while incorporating a spatial coordinate embedding mechanism to
preserve spatial relations of entities in the 2-dimensional space. SpaBERT is
pretrained with masked language modeling and masked entity prediction tasks to
learn spatial dependencies. We apply SpaBERT to two downstream tasks:
geo-entity typing and geo-entity linking. Compared with the existing language
models that do not use spatial context, SpaBERT shows significant performance
improvement on both tasks. We also analyze the entity representation from
SpaBERT in various settings and the effect of spatial coordinate embedding.
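The abstract describes two ideas that a small example can make concrete: turning a geo-entity's neighborhood into a linear sequence, and hiding a whole entity name for masked entity prediction. The sketch below is illustrative only and is not the authors' code. It assumes neighbors are ordered by Euclidean distance to the central entity and that spatial position is expressed as x/y offsets from that entity; the abstract states neither detail. The function names, the `xy` field, and the sample places are all hypothetical.

```python
import math

def linearize_neighbors(pivot, neighbors):
    """Build a pseudo-sentence: the pivot entity first, then neighbors nearest-first.

    Assumption: ordering by Euclidean distance and using offsets relative to the
    pivot are illustrative choices, not details confirmed by the abstract.
    """
    px, py = pivot["xy"]
    ranked = sorted(
        neighbors,
        key=lambda e: math.hypot(e["xy"][0] - px, e["xy"][1] - py),
    )
    sequence = [pivot] + ranked
    names = [e["name"] for e in sequence]
    # One (dx, dy) pair per entity; a model could embed these alongside the tokens.
    offsets = [(e["xy"][0] - px, e["xy"][1] - py) for e in sequence]
    return names, offsets

def mask_entity(names, index, mask_token="[MASK]"):
    """Replace every word of one entity name with the mask token."""
    masked = list(names)
    masked[index] = " ".join(mask_token for _ in names[index].split())
    return masked

# Hypothetical toy data: coordinates are in arbitrary planar units.
pivot = {"name": "Union Station", "xy": (0.0, 0.0)}
neighbors = [
    {"name": "City Hall", "xy": (3.0, 4.0)},
    {"name": "Main Street", "xy": (1.0, 0.0)},
    {"name": "River Park", "xy": (0.0, -2.0)},
]

names, offsets = linearize_neighbors(pivot, neighbors)
print(names)
print(offsets)
print(mask_entity(names, 2))
# → ['Union Station', 'Main Street', '[MASK] [MASK]', 'City Hall']
```

Masking the full name rather than a single token forces a model to recover the entity from its spatial neighbors alone, which is the intuition the abstract gives for learning spatial dependencies.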
Related papers
- GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding [0.32885740436059047]
GeoReasoner is a language model capable of reasoning on geospatially grounded natural language.
It first leverages Large Language Models to generate a comprehensive location description based on linguistic inferences and distance information.
It also encodes direction and distance information into spatial embedding via treating them as pseudo-sentences.
arXiv Detail & Related papers (2024-08-21T06:35:21Z)
- A systematic review of geospatial location embedding approaches in large language models: A path to spatial AI systems [0.0]
Geospatial Location Embedding (GLE) helps a Large Language Model (LLM) assimilate and analyze spatial data.
GLEs signal the need for a Spatial Foundation/Language Model (SLM) that embeds spatial knowing within the model architecture.
arXiv Detail & Related papers (2024-01-12T12:43:33Z)
- Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching [60.645802236700035]
Navigating drones through natural language commands remains challenging due to the dearth of accessible multi-modal datasets.
We introduce GeoText-1652, a new natural language-guided geo-localization benchmark.
This dataset is systematically constructed through an interactive human-computer process.
arXiv Detail & Related papers (2023-11-21T17:52:30Z)
- GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding [45.36562604939258]
This paper introduces GeoLM, a language model that enhances the understanding of geo-entities in natural language.
We demonstrate that GeoLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing.
arXiv Detail & Related papers (2023-10-23T01:20:01Z)
- Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates.
We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- SIRI: Spatial Relation Induced Network For Spatial Description Resolution [64.38872296406211]
We propose a novel relationship induced (SIRI) network for language-guided localization.
We show that our method is around 24% better than the state-of-the-art method in terms of accuracy, measured by an 80-pixel radius.
Our method also generalizes well on our proposed extended dataset collected using the same settings as Touchdown.
arXiv Detail & Related papers (2020-10-27T14:04:05Z)
- Understanding Spatial Relations through Multiple Modalities [78.07328342973611]
Spatial relations between objects can be either explicit (expressed as spatial prepositions) or implicit (expressed by spatial verbs such as moving, walking, and shifting).
We introduce the task of inferring implicit and explicit spatial relations between two entities in an image.
We design a model that uses both textual and visual information to predict the spatial relations, making use of both positional and size information of objects and image embeddings.
arXiv Detail & Related papers (2020-07-19T01:35:08Z)
- SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting [9.949690056661218]
We propose a location-aware KG embedding model called SE-KGE.
It encodes spatial information such as point coordinates or bounding boxes of geographic entities into the KG embedding space.
We also construct a geographic knowledge graph as well as a set of geographic query-answer pairs called DBGeo to evaluate the performance of SE-KGE.
arXiv Detail & Related papers (2020-04-25T17:46:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.