Learning thresholds lead to stable language coexistence
- URL: http://arxiv.org/abs/2406.14522v2
- Date: Mon, 25 Nov 2024 12:58:08 GMT
- Title: Learning thresholds lead to stable language coexistence
- Authors: Mikhail V. Tamm, Els Heinsalu, Stefano Scialla, Marco Patriarca
- Abstract summary: We introduce a language competition model that incorporates the effects of memory and learning in the language shift dynamics.
On a coarse-grained time scale, the effects of memory and learning can be expressed as thresholds on the speaker fractions of the competing languages.
- Abstract: We introduce a language competition model that is based on the Abrams-Strogatz model and incorporates the effects of memory and learning in the language shift dynamics. On a coarse-grained time scale, the effects of memory and learning can be expressed as thresholds on the speaker fractions of the competing languages. In its simplest form, the resulting model is exactly solvable. Besides consensus on one of the two languages, the model describes additional equilibrium states that are not present in the Abrams-Strogatz model: a stable dynamical coexistence of the two languages and a frozen state coinciding with the initial state. We show numerically that these results are preserved for threshold functions of a more general shape. The comparison of the model predictions with historical datasets demonstrates that, while the Abrams-Strogatz model fails to describe some relevant language competition situations, the proposed model provides a good fit.
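The abstract specifies the model only at this level of detail, so the snippet below is a minimal sketch of what a thresholded Abrams-Strogatz flow could look like, assuming Heaviside-style gating of each switching term; the function name, parameters (s, a, x_c, y_c), and their values are illustrative assumptions, not the paper's exact equations.

```python
from scipy.integrate import solve_ivp

# Minimal sketch (not the paper's exact equations): an Abrams-Strogatz-type
# flow in which each switching term is gated by a learning threshold on the
# attracting language's speaker fraction. All parameter values are illustrative.
def thresholded_abrams_strogatz(t, state, s=0.55, a=1.31, x_c=0.2, y_c=0.2):
    x = state[0]          # fraction of speakers of language X
    y = 1.0 - x           # fraction of speakers of language Y
    # Y -> X switching is active only once X's share exceeds x_c;
    # X -> Y switching is active only once Y's share exceeds y_c.
    yx_rate = s * x**a * (x > x_c)
    xy_rate = (1.0 - s) * y**a * (y > y_c)
    return [y * yx_rate - x * xy_rate]

sol = solve_ivp(thresholded_abrams_strogatz, (0.0, 500.0), [0.35], max_step=0.5)
print("long-time fraction of X speakers:", sol.y[0, -1])
```

In this toy gating, thresholds with x_c + y_c > 1 leave a window of initial fractions in which both switching terms vanish and the state stays frozen, loosely mirroring the frozen equilibrium mentioned in the abstract, while small thresholds recover the usual Abrams-Strogatz consensus behavior.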
Related papers
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
- Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property.
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
arXiv Detail & Related papers (2022-05-26T21:11:51Z)
- An Application of Pseudo-Log-Likelihoods to Natural Language Scoring [5.382454613390483]
A language model with relatively few parameters and training steps can outperform much larger models on a recent large dataset.
We produce absolute state-of-the-art results for commonsense reasoning in binary-choice tasks.
We argue that robustness of the smaller model ought to be understood in terms of compositionality.
arXiv Detail & Related papers (2022-01-23T22:00:54Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Multi-timescale Representation Learning in LSTM Language Models [69.98840820213937]
Language models must capture statistical dependencies between words at timescales ranging from very short to very long.
We derived a theory for how the memory gating mechanism in long short-term memory language models can capture power law decay.
Experiments showed that LSTM language models trained on natural English text learn to approximate this theoretical distribution.
arXiv Detail & Related papers (2020-09-27T02:13:38Z)
- Labeling Explicit Discourse Relations using Pre-trained Language Models [0.0]
State-of-the-art models achieve an F-score slightly above 45% using hand-crafted features.
We find that pre-trained language models, when fine-tuned, are powerful enough to replace the linguistic features.
This is the first time a model outperforms knowledge-intensive models without employing any linguistic features.
arXiv Detail & Related papers (2020-06-21T17:18:01Z)
- Overestimation of Syntactic Representation in Neural Language Models [16.765097098482286]
One popular method for determining a model's ability to induce syntactic structure trains a model on strings generated according to a template, then tests its ability to distinguish such strings from superficially similar ones with different syntax.
We illustrate a fundamental problem with this approach by reproducing positive results from a recent paper with two non-syntactic baseline language models.
arXiv Detail & Related papers (2020-04-10T15:13:03Z)