GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World
Scale
- URL: http://arxiv.org/abs/2108.13092v1
- Date: Mon, 30 Aug 2021 10:00:34 GMT
- Title: GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World
Scale
- Authors: Nicolas Tempelmeier, Simon Gottschalk, Elena Demidova
- Abstract summary: This paper presents GeoVectors - a unique, comprehensive world-scale linked corpus of OSM entity embeddings.
The GeoVectors corpus captures semantic and geographic dimensions of OSM entities and makes them accessible to machine learning algorithms.
We provide a SPARQL endpoint - a semantic interface that offers direct access to semantic and latent representations of geographic entities in OSM.
- Score: 1.933681537640272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: OpenStreetMap (OSM) is currently the richest publicly available information
source on geographic entities (e.g., buildings and roads) worldwide. However,
using OSM entities in machine learning models and other applications is
challenging due to the large scale of OSM, the extreme heterogeneity of entity
annotations, and a lack of a well-defined ontology to describe entity semantics
and properties. This paper presents GeoVectors - a unique, comprehensive
world-scale linked open corpus of OSM entity embeddings covering the entire OSM
dataset and providing latent representations of over 980 million geographic
entities in 180 countries. The GeoVectors corpus captures semantic and
geographic dimensions of OSM entities and makes these entities directly
accessible to machine learning algorithms and semantic applications. We create
a semantic description of the GeoVectors corpus, including identity links to
the Wikidata and DBpedia knowledge graphs to supply context information.
Furthermore, we provide a SPARQL endpoint - a semantic interface that offers
direct access to the semantic and latent representations of geographic entities
in OSM.
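As a minimal sketch of how the SPARQL endpoint mentioned in the abstract might be used, the Python below builds a query for entities that carry an identity link to a given Wikidata item, and encodes it as a GET request URL. Everything specific here is an assumption and does not come from the paper: the endpoint URL is a placeholder, the use of owl:sameAs as the identity-link predicate is a guess, and the function names are hypothetical. Only the `query` request parameter is taken from the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the listing does not give the real address.
ENDPOINT = "https://example.org/geovectors/sparql"


def build_identity_link_query(wikidata_id: str, limit: int = 10) -> str:
    """Return a SPARQL query for entities linked to a Wikidata item.

    Assumes identity links are expressed with owl:sameAs; the corpus may
    use a different predicate.
    """
    # Accept only well-formed Wikidata item ids such as "Q64".
    if not (wikidata_id.startswith("Q") and wikidata_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {wikidata_id!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "SELECT ?entity WHERE {\n"
        f"  ?entity owl:sameAs wd:{wikidata_id} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a GET URL using the standard `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Print the request URL instead of sending it, since the endpoint is a placeholder.
    print(build_request_url(build_identity_link_query("Q64", limit=5)))
```

The sketch stops at building the URL so that it runs without network access; sending the request and parsing the result would depend on the real endpoint and its vocabulary.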
Related papers
- Where on Earth Do Users Say They Are?: Geo-Entity Linking for Noisy Multilingual User Input [2.516307239032451]
We present a method which represents real-world locations as averaged embeddings from labeled user-input location names.
We show that our approach improves geo-entity linking on a global and multilingual social media dataset.
arXiv Detail & Related papers (2024-04-29T15:18:33Z)
- GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding [45.36562604939258]
This paper introduces GeoLM, a language model that enhances the understanding of geo-entities in natural language.
We demonstrate that GeoLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing.
arXiv Detail & Related papers (2023-10-23T01:20:01Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation [25.52363878314735]
SpaBERT provides a general-purpose geo-entity representation based on neighboring entities in geospatial data.
SpaBERT is pretrained with masked language modeling and masked entity prediction tasks.
We apply SpaBERT to two downstream tasks: geo-entity typing and geo-entity linking.
arXiv Detail & Related papers (2022-10-21T19:42:32Z)
- MobIE: A German Dataset for Named Entity Recognition, Entity Linking and Relation Extraction in the Mobility Domain [76.21775236904185]
The dataset consists of 3,232 social media texts and traffic reports with 91K tokens, and contains 20.5K annotated entities.
A subset of the dataset is human-annotated with seven mobility-related, n-ary relation types.
To the best of our knowledge, this is the first German-language dataset that combines annotations for NER, EL and RE.
arXiv Detail & Related papers (2021-08-16T08:21:50Z)
- Towards Neural Schema Alignment for OpenStreetMap and Knowledge Graphs [0.966840768820136]
OpenStreetMap (OSM) is one of the richest openly available sources of volunteered geographic information.
Knowledge graphs can potentially provide valuable semantic information to enrich OSM entities.
This paper tackles the alignment of OSM tags with the corresponding knowledge graph classes holistically by jointly considering the schema and instance layers.
We propose a novel neural architecture that capitalizes upon a shared latent space for tag-to-class alignment created using linked entities in OSM and knowledge graphs.
arXiv Detail & Related papers (2021-07-28T10:40:35Z)
- Linking OpenStreetMap with Knowledge Graphs -- Link Discovery for Schema-Agnostic Volunteered Geographic Information [3.04585143845864]
We propose OSM2KG - a novel link discovery approach to predict identity links between OSM nodes and geographic entities in a knowledge graph.
The core of the OSM2KG approach is a novel latent, compact representation of OSM nodes that captures semantic node similarity in an embedding.
Our experiments conducted on several OSM datasets, as well as the Wikidata and DBpedia knowledge graphs, demonstrate that OSM2KG can reliably discover identity links.
arXiv Detail & Related papers (2020-11-06T19:03:41Z)
- OpenStreetMap: Challenges and Opportunities in Machine Learning and Remote Sensing [66.23463054467653]
We present a review of recent methods based on machine learning to improve and use OpenStreetMap data.
We believe that OSM can change the way we interpret remote sensing data and that the synergy with machine learning can scale participatory map making.
arXiv Detail & Related papers (2020-07-13T09:58:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.