Searching for Structure: Investigating Emergent Communication with Large Language Models
- URL: http://arxiv.org/abs/2412.07646v3
- Date: Fri, 13 Dec 2024 12:35:21 GMT
- Title: Searching for Structure: Investigating Emergent Communication with Large Language Models
- Authors: Tom Kouwenhoven, Max Peeperkorn, Tessa Verhoef,
- Abstract summary: We simulate a classical referential game in which Large Language Models learn and use artificial languages.
Our results show that initially unstructured holistic languages are indeed shaped to have some structural properties that allow two LLM agents to communicate successfully.
- Score: 0.10923877073891446
- License:
- Abstract: Human languages have evolved to be structured through repeated language learning and use. These processes introduce biases that operate during language acquisition and shape linguistic systems toward communicative efficiency. In this paper, we investigate whether the same happens if artificial languages are optimised for implicit biases of Large Language Models (LLMs). To this end, we simulate a classical referential game in which LLMs learn and use artificial languages. Our results show that initially unstructured holistic languages are indeed shaped to have some structural properties that allow two LLM agents to communicate successfully. Similar to observations in human experiments, generational transmission increases the learnability of languages, but can at the same time result in non-humanlike degenerate vocabularies. Taken together, this work extends experimental findings, shows that LLMs can be used as tools in simulations of language evolution, and opens possibilities for future human-machine experiments in this field.
Related papers
- Can Language Models Learn Typologically Implausible Languages? [62.823015163987996]
Grammatical features across human languages show intriguing correlations often attributed to learning biases in humans.
We discuss how language models (LMs) allow us to better determine the role of domain-general learning biases in language universals.
We test LMs on an array of highly naturalistic but counterfactual versions of the English (head-initial) and Japanese (head-final) languages.
arXiv Detail & Related papers (2025-02-17T20:40:01Z) - Randomly Sampled Language Reasoning Problems Reveal Limits of LLMs [8.146860674148044]
We attempt to measure models' language understanding capacity while circumventing the risk of dataset recall.
We parameterize large families of language tasks recognized by deterministic finite automata (DFAs)
We find that, even in the strikingly simple setting of 3-state DFAs, LLMs underperform un parameterized ngram models on both language recognition and synthesis tasks.
arXiv Detail & Related papers (2025-01-06T07:57:51Z) - The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model [59.357993924917]
We study the evolution of multilingual capabilities in large language models (LLMs) during the pre-training process.
We propose the Babel Tower Hypothesis, which describes the entire process of LLMs acquiring new language capabilities.
We propose a novel method to construct an optimized pre-training corpus for multilingual code LLMs.
arXiv Detail & Related papers (2024-12-10T08:28:57Z) - Kallini et al. (2024) do not compare impossible languages with constituency-based ones [0.0]
A central goal of linguistic theory is to find a characterization of the notion "possible human language"
Recent large language models (LLMs) in NLP applications arguably raises the possibility that LLMs might be computational devices that meet this goal.
I explain the confound and suggest some ways forward towards constructing a comparison that appropriately tests the underlying issue.
arXiv Detail & Related papers (2024-10-16T06:16:30Z) - Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models [11.423589362950812]
Large language models (LLMs) have demonstrated remarkable performance, particularly in multilingual contexts.
Recent studies suggest that LLMs can transfer skills learned in one language to others, but the internal mechanisms behind this ability remain unclear.
This paper provides insights into the internal workings of LLMs, offering a foundation for future improvements in their cross-lingual capabilities.
arXiv Detail & Related papers (2024-10-15T15:49:15Z) - The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments [57.273662221547056]
In this study, we investigate an unintuitive novel driver of cross-lingual generalisation: language imbalance.
We observe that the existence of a predominant language during training boosts the performance of less frequent languages.
As we extend our analysis to real languages, we find that infrequent languages still benefit from frequent ones, yet whether language imbalance causes cross-lingual generalisation there is not conclusive.
arXiv Detail & Related papers (2024-04-11T17:58:05Z) - LERT: A Linguistically-motivated Pre-trained Language Model [67.65651497173998]
We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original pre-training task.
We carried out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT could bring significant improvements.
arXiv Detail & Related papers (2022-11-10T05:09:16Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Cross-Lingual Ability of Multilingual Masked Language Models: A Study of
Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence.
Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while the composition is more crucial to the success of cross-linguistic transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z) - Learning Music Helps You Read: Using Transfer to Study Linguistic
Structure in Language Models [27.91397366776451]
Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
arXiv Detail & Related papers (2020-04-30T06:24:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.