Do Language Models Know the Way to Rome?
- URL: http://arxiv.org/abs/2109.07971v1
- Date: Thu, 16 Sep 2021 13:28:16 GMT
- Title: Do Language Models Know the Way to Rome?
- Authors: Bastien Liétard, Mostafa Abdou, and Anders Søgaard
- Abstract summary: We exploit the fact that in geography, ground truths are available beyond local relations.
We find that language models generally encode limited geographic information, with larger models performing best.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The global geometry of language models is important for a range of
applications, but language model probes tend to evaluate rather local
relations, for which ground truths are easily obtained. In this paper we
exploit the fact that in geography, ground truths are available beyond local
relations. In a series of experiments, we evaluate the extent to which language
model representations of city and country names are isomorphic to real-world
geography, e.g., if you tell a language model where Paris and Berlin are, does
it know the way to Rome? We find that language models generally encode limited
geographic information, but with larger models performing the best, suggesting
that geographic knowledge can be induced from higher-order co-occurrence
statistics.
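The probing idea described in the abstract can be illustrated with a minimal sketch: fit a least-squares linear probe from city representations to latitude/longitude on a few cities, then check whether a held-out city's coordinates are recoverable. The "embeddings" below are synthetic stand-ins in which coordinates are linearly encoded plus noise; they are illustrative assumptions, not the paper's actual data, models, or probe.

```python
import numpy as np

# Synthetic stand-ins for language model embeddings: latitude/longitude
# linearly encoded into 16 dimensions, plus a little noise.
rng = np.random.default_rng(0)

coords = np.array([
    [48.86, 2.35],   # Paris  (lat, lon)
    [52.52, 13.40],  # Berlin
    [40.42, -3.70],  # Madrid
    [51.51, -0.13],  # London
    [41.90, 12.50],  # Rome (held out)
])
proj = rng.normal(size=(2, 16))                      # unknown linear encoding
emb = coords @ proj + 0.01 * rng.normal(size=(5, 16))

# Fit a linear probe on four cities: least-squares map embedding -> coords.
# rcond discards noise-level singular directions of the underdetermined system.
X_train, y_train = emb[:4], coords[:4]
W, *_ = np.linalg.lstsq(X_train, y_train, rcond=1e-2)

# "If you tell the model where Paris and Berlin are, does it know the way
# to Rome?" -- predict the held-out city's coordinates.
lat, lon = emb[4] @ W
print(f"predicted Rome: ({lat:.1f}, {lon:.1f})")
```

If the geometry is linearly recoverable, the prediction lands near Rome's true coordinates (41.9, 12.5); with real LM embeddings, the probe's held-out error is exactly what measures how much geographic structure the representations encode.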
Related papers
- GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding
GeoReasoner is a language model capable of reasoning on geospatially grounded natural language.
It first leverages Large Language Models to generate a comprehensive location description based on linguistic inferences and distance information.
It also encodes direction and distance information into spatial embeddings by treating them as pseudo-sentences.
arXiv Detail & Related papers (2024-08-21T06:35:21Z) - Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations
This study focuses on biases related to geographical knowledge.
We explore the connection between geography and language models by highlighting their tendency to misrepresent spatial information.
arXiv Detail & Related papers (2024-04-26T13:22:28Z) - On the Scaling Laws of Geographical Representation in Language Models [0.11510009152620666]
We show that geographical knowledge is observable even for tiny models, and that it scales consistently as we increase the model size.
Notably, we observe that larger language models cannot mitigate the geographical bias that is inherent to the training data.
arXiv Detail & Related papers (2024-02-29T18:04:11Z) - Geographical Erasure in Language Generation [13.219867587151986]
We study and operationalise a form of geographical erasure, wherein language models underpredict certain countries.
We discover that erasure strongly correlates with low frequencies of country mentions in the training corpus.
We mitigate erasure by finetuning using a custom objective.
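The reported correlation between erasure and mention frequency can be sketched as a rank-correlation check. The country counts and model probabilities below are invented for illustration (they are not the paper's data), and Spearman's rho is computed by rank-transforming and taking a Pearson correlation.

```python
import numpy as np

# Hypothetical corpus mention counts and model-predicted probabilities
# for five countries (made-up numbers, monotone by construction).
counts = np.array([9.2e6, 3.1e6, 8.0e5, 1.2e5, 4.0e4])  # corpus mentions
probs = np.array([0.30, 0.18, 0.05, 0.01, 0.004])       # P(country | prompt)

def rank(x):
    # Ranks 0..n-1 via double argsort (no ties in this toy data).
    return np.argsort(np.argsort(x)).astype(float)

# Spearman's rho = Pearson correlation of the rank-transformed values.
rho = np.corrcoef(rank(counts), rank(probs))[0, 1]
print(f"Spearman rho = {rho:.2f}")  # 1.00: this toy data is perfectly monotone
```

A strongly positive rho on real data is what "erasure correlates with low mention frequency" amounts to: rarely mentioned countries are systematically underpredicted.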
arXiv Detail & Related papers (2023-10-23T10:26:14Z) - Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning
Navigation in unfamiliar environments presents a major challenge for robots.
We use language models to bias exploration of novel real-world environments.
We evaluate LFG in challenging real-world environments and simulated benchmarks.
arXiv Detail & Related papers (2023-10-16T06:21:06Z) - Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z) - GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z) - Measuring Geographic Performance Disparities of Offensive Language Classifiers
We ask two questions: "Do language, dialect, and topical content vary across geographical regions?" and "If there are differences across regions, do they impact model performance?"
We find that current models do not generalize across locations. Likewise, we show that while offensive language models produce false positives on African American English, model performance is not correlated with each city's minority population proportions.
arXiv Detail & Related papers (2022-09-15T15:08:18Z) - GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models
We introduce a framework for geo-diverse commonsense probing on multilingual pretrained language models (mPLMs).
We benchmark 11 standard mPLMs which include variants of mBERT, XLM, mT5, and XGLM on GeoMLAMA dataset.
We find that 1) larger mPLM variants do not necessarily store geo-diverse concepts better than their smaller variants; 2) mPLMs are not intrinsically biased towards knowledge from Western countries; and 3) a language may better probe knowledge about a non-native country than its native country.
arXiv Detail & Related papers (2022-05-24T17:54:50Z) - Towards Zero-shot Language Modeling
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.