Learning an Artificial Language for Knowledge-Sharing in Multilingual
Translation
- URL: http://arxiv.org/abs/2211.01292v1
- Date: Wed, 2 Nov 2022 17:14:42 GMT
- Title: Learning an Artificial Language for Knowledge-Sharing in Multilingual
Translation
- Authors: Danni Liu, Jan Niehues
- Abstract summary: We discretize the latent space of multilingual models by assigning encoder states to entries in a codebook.
We validate our approach on large-scale experiments with realistic data volumes and domains.
We also use the learned artificial language to analyze model behavior, and discover that using a similar bridge language increases knowledge-sharing among the remaining languages.
- Score: 15.32063273544696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cornerstone of multilingual neural translation is shared representations
across languages. Given the theoretically infinite representation power of
neural networks, semantically identical sentences are likely represented
differently. While representing sentences in the continuous latent space
ensures expressiveness, it introduces the risk of capturing irrelevant
features which hinders the learning of a common representation. In this work,
we discretize the encoder output latent space of multilingual models by
assigning encoder states to entries in a codebook, which in effect represents
source sentences in a new artificial language. This discretization process not
only offers a new way to interpret the otherwise black-box model
representations, but, more importantly, gives potential for increasing
robustness in unseen testing conditions. We validate our approach on
large-scale experiments with realistic data volumes and domains. When tested in
zero-shot conditions, our approach is competitive with two strong alternatives
from the literature. We also use the learned artificial language to analyze
model behavior, and discover that using a similar bridge language increases
knowledge-sharing among the remaining languages.
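The discretization described in the abstract amounts to a vector-quantization step between encoder and decoder: each continuous encoder state is snapped to its nearest entry in a learned codebook, so a source sentence is re-expressed as a sequence of discrete code indices shared across languages. The following is a minimal PyTorch sketch of such a nearest-neighbour codebook assignment; the module name, codebook size, and the straight-through gradient trick are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn


class CodebookQuantizer(nn.Module):
    """Hypothetical nearest-neighbour quantizer for encoder states.

    Each d-dimensional encoder state is replaced by its closest codebook
    entry, so a source sentence becomes a sequence of discrete code ids
    (an "artificial language").
    """

    def __init__(self, num_codes: int = 512, dim: int = 512):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, enc_states: torch.Tensor):
        # enc_states: (batch, seq_len, dim)
        flat = enc_states.reshape(-1, enc_states.size(-1))      # (B*T, d)
        # Pairwise Euclidean distances between states and codebook entries.
        dists = torch.cdist(flat, self.codebook.weight)         # (B*T, K)
        codes = dists.argmin(dim=-1)                            # discrete ids, (B*T,)
        quantized = self.codebook(codes).view_as(enc_states)    # (B, T, d)
        # Straight-through estimator (an assumption here): gradients flow
        # back to the encoder as if quantization were the identity.
        quantized = enc_states + (quantized - enc_states).detach()
        return quantized, codes.view(enc_states.shape[:-1])
```

In a full system the quantized states, rather than the raw continuous ones, would be passed to the decoder, and the resulting code sequences can be compared across source languages to study knowledge-sharing; auxiliary objectives such as the commitment and codebook losses used in VQ-VAE-style training are omitted from this sketch.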
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment [42.624862172666624]
We propose a simple yet effective cross-lingual alignment framework exploiting pairs of translation sentences.
It aligns the internal sentence representations across different languages via multilingual contrastive learning.
Experimental results show that even with less than 0.1‰ of the pre-training tokens, our alignment framework significantly boosts the cross-lingual abilities of generative language models (a minimal sketch of such a contrastive alignment objective appears after this list).
arXiv Detail & Related papers (2023-11-14T11:24:08Z)
- Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution [4.01799362940916]
We present a setup for training, evaluating and interpreting neural language models that uses artificial, language-like data.
The data is generated using a massive probabilistic grammar that is itself derived from a large natural language corpus.
With access to the underlying true source, our results reveal striking differences in learning dynamics and outcomes between different classes of words.
arXiv Detail & Related papers (2023-10-23T12:03:01Z)
- Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve [78.3500985535601]
We find a surprising connection between multitask learning and robustness to neuron failures.
Our experiments show that bilingual language models retain higher performance under various neuron perturbations.
We provide a theoretical justification for this robustness by mathematically analyzing linear representation learning.
arXiv Detail & Related papers (2022-10-20T22:23:27Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Informative Language Representation Learning for Massively Multilingual Neural Machine Translation [47.19129812325682]
In a multilingual neural machine translation model, an artificial language token is usually used to guide translation into the desired target language.
Recent studies show that prepending language tokens sometimes fails to navigate the multilingual neural machine translation models into the right translation directions.
We propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations that channel translation into the right directions.
arXiv Detail & Related papers (2022-09-04T04:27:17Z)
- Linking Emergent and Natural Languages via Corpus Transfer [98.98724497178247]
We propose a novel way to establish a link between emergent languages and natural languages via corpus transfer.
Our approach showcases non-trivial transfer benefits for two different tasks -- language modeling and image captioning.
We also introduce a novel metric to predict the transferability of an emergent language by translating emergent messages to natural language captions grounded on the same images.
arXiv Detail & Related papers (2022-03-24T21:24:54Z)
- Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [80.446770909975]
Linguistic knowledge is of great benefit to scene text recognition.
How to effectively model linguistic rules in end-to-end deep networks remains a research challenge.
We propose an autonomous, bidirectional and iterative ABINet for scene text recognition.
arXiv Detail & Related papers (2021-03-11T06:47:45Z)
- Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora [63.5286019659504]
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence.
arXiv Detail & Related papers (2020-10-27T22:24:01Z)
- Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks [3.9342247746757435]
It is now established that modern neural language models can be successfully trained on multiple languages simultaneously.
But what kind of knowledge is really shared among languages within these models?
In this paper we dissect different forms of cross-lingual transfer and look for its most determining factors.
We find that exposing our LMs to a related language does not always increase grammatical knowledge in the target language, and that optimal conditions for lexical-semantic transfer may not be optimal for syntactic transfer.
arXiv Detail & Related papers (2020-03-31T09:48:25Z)
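Expanding on the cross-lingual alignment entry above (Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment), which aligns sentence representations across languages via multilingual contrastive learning over translation pairs: the sketch below shows one common form of such an objective, an InfoNCE-style loss in which a sentence and its translation form the positive pair and the other sentences in the batch serve as negatives. The temperature value, the mean-pooled inputs, and the symmetric formulation are assumptions for illustration, not necessarily that paper's exact setup.

```python
import torch
import torch.nn.functional as F


def contrastive_alignment_loss(src_repr: torch.Tensor,
                               tgt_repr: torch.Tensor,
                               temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE-style loss over a batch of translation pairs.

    src_repr, tgt_repr: (batch, dim) sentence representations of parallel
    sentences; row i of src_repr and row i of tgt_repr are translations of
    each other, and the remaining rows act as in-batch negatives.
    """
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.t() / temperature                    # (batch, batch) cosine similarities
    labels = torch.arange(src.size(0), device=src.device)   # positives sit on the diagonal
    # Symmetric loss: source-to-target and target-to-source directions.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```

Minimizing this loss on mean-pooled (or otherwise aggregated) representations of parallel sentences pulls translations together and pushes unrelated sentences apart in the shared space.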