Linking Emergent and Natural Languages via Corpus Transfer
- URL: http://arxiv.org/abs/2203.13344v1
- Date: Thu, 24 Mar 2022 21:24:54 GMT
- Title: Linking Emergent and Natural Languages via Corpus Transfer
- Authors: Shunyu Yao, Mo Yu, Yang Zhang, Karthik R Narasimhan, Joshua B.
Tenenbaum, Chuang Gan
- Abstract summary: We propose a novel way to establish a link between emergent languages and natural languages via corpus transfer.
Our approach showcases non-trivial transfer benefits for two different tasks -- language modeling and image captioning.
We also introduce a novel metric to predict the transferability of an emergent language by translating emergent messages to natural language captions grounded on the same images.
- Score: 98.98724497178247
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of language emergence aims to understand how human languages are
shaped by perceptual grounding and communicative intent. Computational
approaches to emergent communication (EC) predominantly consider referential
games in limited domains and analyze the learned protocol within the game
framework. As a result, it remains unclear how the emergent languages from
these settings connect to natural languages or provide benefits in real-world
language processing tasks, where statistical models trained on large text
corpora dominate. In this work, we propose a novel way to establish such a link
by corpus transfer, i.e. pretraining on a corpus of emergent language for
downstream natural language tasks, which is in contrast to prior work that
directly transfers speaker and listener parameters. Our approach showcases
non-trivial transfer benefits for two different tasks -- language modeling and
image captioning. For example, in a low-resource setup (modeling 2 million
natural language tokens), pre-training on an emergent language corpus with just
2 million tokens reduces model perplexity by $24.6\%$ on average across ten
natural languages. We also introduce a novel metric to predict the
transferability of an emergent language by translating emergent messages to
natural language captions grounded on the same images. We find that our
translation-based metric highly correlates with the downstream performance on
modeling natural languages (for instance $\rho=0.83$ on Hebrew), while
topographic similarity, a popular metric in previous work, shows surprisingly
low correlation ($\rho=0.003$), hinting that simple properties like attribute
disentanglement from synthetic domains might not capture the full complexities
of natural language. Our findings also indicate potential benefits of moving
language emergence forward with natural language resources and models.
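To make the corpus-transfer recipe concrete, the sketch below pretrains a small language model on a corpus of emergent messages and then adapts it to a low-resource natural-language corpus, comparing held-out perplexity against a from-scratch baseline. This is a minimal illustration in PyTorch under several assumptions: both corpora are already tokenized into a shared id space, and the model size, data sizes, and single-pass schedule are placeholders rather than the paper's actual configuration.

```python
# Minimal corpus-transfer sketch (an illustration, not the authors' exact setup).
import math
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):                        # x: (batch, seq_len) token ids
        h, _ = self.lstm(self.embed(x))
        return self.head(h)                      # (batch, seq_len, vocab) logits

def batches(ids, seq_len=128, batch_size=32):
    # Chop a flat id tensor into next-token (input, target) pairs.
    n = (len(ids) - 1) // seq_len
    x = ids[: n * seq_len].view(n, seq_len)
    y = ids[1 : n * seq_len + 1].reshape(n, seq_len)
    for i in range(0, n, batch_size):
        yield x[i : i + batch_size], y[i : i + batch_size]

def run(model, ids, optimizer=None):
    # One pass over `ids`; trains if an optimizer is given, returns perplexity.
    loss_fn, total, steps = nn.CrossEntropyLoss(), 0.0, 0
    for x, y in batches(ids):
        logits = model(x)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), y.reshape(-1))
        if optimizer is not None:
            optimizer.zero_grad(); loss.backward(); optimizer.step()
        total, steps = total + loss.item(), steps + 1
    return math.exp(total / steps)

# Placeholder random corpora; in practice these would be ~2M-token corpora of
# emergent messages and of the target natural language (train/validation split).
vocab = 10_000
emergent_train = torch.randint(0, vocab, (200_000,))
natural_train = torch.randint(0, vocab, (200_000,))
natural_val = torch.randint(0, vocab, (20_000,))

transfer = TinyLM(vocab)
opt = torch.optim.Adam(transfer.parameters(), lr=1e-3)
run(transfer, emergent_train, opt)               # stage 1: pretrain on emergent corpus
run(transfer, natural_train, opt)                # stage 2: adapt to the natural language
with torch.no_grad():
    ppl_transfer = run(transfer, natural_val)    # held-out perplexity

scratch = TinyLM(vocab)
run(scratch, natural_train, torch.optim.Adam(scratch.parameters(), lr=1e-3))
with torch.no_grad():
    ppl_scratch = run(scratch, natural_val)
print(f"scratch ppl: {ppl_scratch:.1f}  emergent-pretrained ppl: {ppl_transfer:.1f}")
```

To relate a transferability metric to downstream results across several emergent corpora, a rank correlation such as scipy.stats.spearmanr over (metric score, perplexity) pairs yields the Spearman rho values quoted in the abstract.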
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer [4.554080966463776]
Multi-lingual language models (LMs) have been remarkably successful in enabling natural language tasks in low-resource languages.
We try to better understand how such models, specifically mT5, transfer *any* linguistic and semantic knowledge across languages.
A key finding of this work is that similarity of syntax, morphology and phonology is a good predictor of cross-lingual transfer.
arXiv Detail & Related papers (2022-12-04T07:22:21Z)
- Learning an Artificial Language for Knowledge-Sharing in Multilingual Translation [15.32063273544696]
We discretize the latent space of multilingual models by assigning encoder states to entries in a codebook (see the generic quantization sketch after this list).
We validate our approach on large-scale experiments with realistic data volumes and domains.
We also use the learned artificial language to analyze model behavior, and discover that using a similar bridge language increases knowledge-sharing among the remaining languages.
arXiv Detail & Related papers (2022-11-02T17:14:42Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Emergent Communication Pretraining for Few-Shot Machine Translation [66.48990742411033]
We pretrain neural networks via emergent communication from referential games.
Our key assumption is that grounding communication on images, as a crude approximation of real-world environments, inductively biases the model towards learning natural languages.
arXiv Detail & Related papers (2020-11-02T10:57:53Z)
- Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision [110.66085917826648]
We develop a technique that extrapolates multimodal alignments to language-only data by contextually mapping language tokens to their related images.
"vokenization" is trained on relatively small image captioning datasets and we then apply it to generate vokens for large language corpora.
Trained with these contextually generated vokens, our visually-supervised language models show consistent improvements over self-supervised alternatives on multiple pure-language tasks.
arXiv Detail & Related papers (2020-10-14T02:11:51Z)
- Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models [27.91397366776451]
Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
arXiv Detail & Related papers (2020-04-30T06:24:03Z)
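The codebook discretization described under "Learning an Artificial Language for Knowledge-Sharing in Multilingual Translation" above can be read as nearest-neighbour vector quantization of encoder states. The snippet below is a minimal, generic sketch under that reading; the shapes, the Euclidean distance, and the random placeholder codebook are illustrative assumptions, not that paper's implementation.

```python
# Generic vector quantization of encoder states against a codebook
# (illustrative only; shapes and Euclidean distance are assumptions).
import torch

def quantize(states: torch.Tensor, codebook: torch.Tensor):
    """states: (seq_len, dim); codebook: (codebook_size, dim).
    Returns discrete indices and the quantized states codebook[indices]."""
    dists = torch.cdist(states, codebook)     # (seq_len, codebook_size) pairwise distances
    indices = dists.argmin(dim=-1)            # nearest codebook entry per state
    return indices, codebook[indices]

states = torch.randn(12, 512)                 # encoder states for one sentence
codebook = torch.randn(1024, 512)             # learned codebook (random placeholder here)
codes, quantized = quantize(states, codebook)
print(codes.tolist())                         # the sentence as a sequence of discrete codes
```

The discrete code sequence plays the role of the shared "artificial language" through which knowledge can be exchanged across languages.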
This list is automatically generated from the titles and abstracts of the papers on this site.