HeGeL: A Novel Dataset for Geo-Location from Hebrew Text
- URL: http://arxiv.org/abs/2307.00509v1
- Date: Sun, 2 Jul 2023 08:09:10 GMT
- Title: HeGeL: A Novel Dataset for Geo-Location from Hebrew Text
- Authors: Tzuf Paz-Argaman, Tal Bauman, Itai Mondshine, Itzhak Omer, Sagi Dalyot, Reut Tsarfaty
- Abstract summary: We present the Hebrew Geo-Location (HeGeL) corpus, designed to collect literal place descriptions and analyze lingual geospatial reasoning.
We crowdsourced 5,649 literal Hebrew place descriptions of various place types in three cities in Israel.
- Score: 5.109028790494419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of textual geolocation - retrieving the coordinates of a place based
on a free-form language description - calls for not only grounding but also
natural language understanding and geospatial reasoning. Even though there are
quite a few datasets in English used for geolocation, they are currently based
on open-source data (Wikipedia and Twitter), where the location of the
described place is mostly implicit, such that the location retrieval resolution
is limited. Furthermore, there are no datasets available for addressing the
problem of textual geolocation in morphologically rich and resource-poor
languages, such as Hebrew. In this paper, we present the Hebrew Geo-Location
(HeGeL) corpus, designed to collect literal place descriptions and analyze
lingual geospatial reasoning. We crowdsourced 5,649 literal Hebrew place
descriptions of various place types in three cities in Israel. Qualitative and
empirical analyses show that the data exhibits abundant use of geospatial
reasoning and requires a novel environmental representation.
Related papers
- GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding [0.32885740436059047]
GeoReasoner is a language model capable of reasoning on geospatially grounded natural language.
It first leverages Large Language Models to generate a comprehensive location description based on linguistic inferences and distance information.
It also encodes direction and distance information into spatial embeddings by treating them as pseudo-sentences.
arXiv Detail & Related papers (2024-08-21T06:35:21Z)
- GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding [45.36562604939258]
This paper introduces GeoLM, a language model that enhances the understanding of geo-entities in natural language.
We demonstrate that GeoLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing.
arXiv Detail & Related papers (2023-10-23T01:20:01Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates.
We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- G^3: Geolocation via Guidebook Grounding [92.46774241823562]
We study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation.
We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations.
Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy.
arXiv Detail & Related papers (2022-11-28T16:34:40Z)
- Geographic Adaptation of Pretrained Language Models [29.81557992080902]
We introduce geoadaptation, an intermediate training step that couples language modeling with geolocation prediction in a multi-task learning setup.
We show that the effectiveness of geoadaptation stems from its ability to geographically retrofit the representation space of the pretrained language models.
arXiv Detail & Related papers (2022-03-16T11:55:00Z)
- Regressing Location on Text for Probabilistic Geocoding [0.0]
We present an end-to-end probabilistic model for geocoding text data.
We compare the model-based solution, called ELECTRo-map, to the current state-of-the-art open source system for geocoding texts for event data.
arXiv Detail & Related papers (2021-06-30T20:04:55Z)
- From Topic Networks to Distributed Cognitive Maps: Zipfian Topic Universes in the Area of Volunteered Geographic Information [59.0235296929395]
We investigate how language encodes and networks geographic information on the aboutness level of texts.
Our study shows a Zipfian organization of the thematic universe in which geographical places are located in online communication.
Places, whether close to each other or not, are located in neighboring places that span similar subnetworks in the topic universe.
arXiv Detail & Related papers (2020-02-04T18:31:25Z)