Do Language Models Know the Way to Rome?
- URL: http://arxiv.org/abs/2109.07971v1
- Date: Thu, 16 Sep 2021 13:28:16 GMT
- Title: Do Language Models Know the Way to Rome?
- Authors: Bastien Liétard, Mostafa Abdou, and Anders Søgaard
- Abstract summary: We exploit the fact that in geography, ground truths are available beyond local relations.
We find that language models generally encode limited geographic information, with larger models performing best.
- Score: 4.344337854565144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The global geometry of language models is important for a range of
applications, but language model probes tend to evaluate rather local
relations, for which ground truths are easily obtained. In this paper we
exploit the fact that in geography, ground truths are available beyond local
relations. In a series of experiments, we evaluate the extent to which language
model representations of city and country names are isomorphic to real-world
geography, e.g., if you tell a language model where Paris and Berlin are, does
it know the way to Rome? We find that language models generally encode limited
geographic information, but with larger models performing the best, suggesting
that geographic knowledge can be induced from higher-order co-occurrence
statistics.
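One way to make the question of whether representations are "isomorphic to real-world geography" concrete is a linear probe: fit a map from name embeddings to coordinates on some places, then check how well it locates a held-out place. The following is a minimal sketch under stated assumptions, not the authors' experimental setup. The embeddings are synthetic (a noisy linear image of the coordinates), and `add_bias`, `fit_linear_probe`, `predict_coords`, and `toy_embed` are hypothetical helper names; a real experiment would replace `toy_embed` with vectors taken from a language model.

```python
import numpy as np

def add_bias(x):
    """Append a constant column so the probe can learn an offset."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

def fit_linear_probe(embeddings, coords):
    """Least-squares linear map from embedding space to (lat, lon)."""
    weights, *_ = np.linalg.lstsq(add_bias(embeddings), coords, rcond=None)
    return weights

def predict_coords(weights, embeddings):
    """Apply a fitted probe to new embeddings."""
    return add_bias(embeddings) @ weights

# ASSUMPTION: synthetic stand-in for language model representations.
rng = np.random.default_rng(0)
projection = rng.normal(size=(2, 8))

def toy_embed(coords, noise=0.01):
    """Map (lat, lon) pairs to 8-dimensional toy 'embeddings' with small noise."""
    coords = np.atleast_2d(coords)
    return coords @ projection + noise * rng.normal(size=(coords.shape[0], 8))

# Fifty synthetic training locations in a roughly European bounding box.
train_coords = np.column_stack(
    [rng.uniform(35, 60, 50), rng.uniform(-10, 30, 50)]
)
weights = fit_linear_probe(toy_embed(train_coords), train_coords)

# Held-out query: approximate latitude and longitude of Rome, in degrees.
rome = np.array([41.90, 12.50])
predicted = predict_coords(weights, toy_embed(rome))[0]
error_deg = float(np.linalg.norm(predicted - rome))
print(predicted, error_deg)
```

Because the toy embeddings are linear in the coordinates by construction, the probe recovers the held-out location almost exactly. With real model representations, the held-out error is the quantity of interest.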
Related papers
- Richer Output for Richer Countries: Uncovering Geographical Disparities in Generated Stories and Travel Recommendations [9.505918815853644]
We examine the impact of large language models for two common scenarios that require geographical knowledge: (a) travel recommendations and (b) geo-anchored story generation.
Specifically, we study four popular language models and, across about 100K travel requests and 200K story generations, observe that travel recommendations corresponding to poorer countries are less unique and contain fewer location references.
arXiv Detail & Related papers (2024-11-11T19:25:25Z)
- Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations [2.825324306665133]
This study focuses on biases related to geographical knowledge.
We explore the connection between geography and language models by highlighting their tendency to misrepresent spatial information.
arXiv Detail & Related papers (2024-04-26T13:22:28Z)
- On the Scaling Laws of Geographical Representation in Language Models [0.11510009152620666]
We show that geographical knowledge is observable even for tiny models, and that it scales consistently as we increase the model size.
Notably, we observe that larger language models cannot mitigate the geographical bias that is inherent to the training data.
arXiv Detail & Related papers (2024-02-29T18:04:11Z)
- Geographical Erasure in Language Generation [13.219867587151986]
We study and operationalise a form of geographical erasure, wherein language models underpredict certain countries.
We discover that erasure strongly correlates with low frequencies of country mentions in the training corpus.
We mitigate erasure by finetuning using a custom objective.
arXiv Detail & Related papers (2023-10-23T10:26:14Z)
- Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning [73.0990339667978]
Navigation in unfamiliar environments presents a major challenge for robots.
We use language models to bias exploration of novel real-world environments.
We evaluate LFG in challenging real-world environments and simulated benchmarks.
arXiv Detail & Related papers (2023-10-16T06:21:06Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- Measuring Geographic Performance Disparities of Offensive Language Classifiers [12.545108947857802]
We ask two questions: "Does language, dialect, and topical content vary across geographical regions?" and "If there are differences across the regions, do they impact model performance?"
We find that current models do not generalize across locations. Likewise, we show that while offensive language models produce false positives on African American English, model performance is not correlated with each city's minority population proportions.
arXiv Detail & Related papers (2022-09-15T15:08:18Z)
- GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models [68.50584946761813]
We introduce a framework for geo-diverse commonsense probing on multilingual pre-trained language models (mPLMs).
We benchmark 11 standard mPLMs, which include variants of mBERT, XLM, mT5, and XGLM, on the GeoMLAMA dataset.
We find that 1) larger mPLM variants do not necessarily store geo-diverse concepts better than their smaller variants; 2) mPLMs are not intrinsically biased towards knowledge from Western countries; and 3) a language may better probe knowledge about a non-native country than its native country.
arXiv Detail & Related papers (2022-05-24T17:54:50Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.