Informative Language Representation Learning for Massively Multilingual Neural Machine Translation
- URL: http://arxiv.org/abs/2209.01530v1
- Date: Sun, 4 Sep 2022 04:27:17 GMT
- Title: Informative Language Representation Learning for Massively Multilingual Neural Machine Translation
- Authors: Renren Jin and Deyi Xiong
- Abstract summary: In a multilingual neural machine translation model, an artificial language token is usually used to guide translation into the desired target language.
Recent studies show that prepending language tokens sometimes fails to steer multilingual neural machine translation models in the right translation directions.
We propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations that channel translation in the right directions.
- Score: 47.19129812325682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a multilingual neural machine translation model that fully shares
parameters across all languages, an artificial language token is usually used
to guide translation into the desired target language. However, recent studies
show that prepending language tokens sometimes fails to steer multilingual
neural machine translation models in the right translation directions,
especially in zero-shot translation. To mitigate this issue, we propose two
methods, language embedding embodiment and language-aware multi-head attention,
to learn informative language representations that channel translation in the
right directions. The former embodies language embeddings at different critical
switching points along the information flow from the source to the target,
aiming to amplify the signals that guide translation direction. The latter
exploits a matrix, instead of a vector, to represent a language in the
continuous space. The matrix is chunked into multiple heads so as to learn
language representations in multiple subspaces. Experimental results on two
datasets for massively multilingual neural machine translation demonstrate that
language-aware multi-head attention benefits both supervised and zero-shot
translation and significantly alleviates the off-target translation issue.
Further linguistic typology prediction experiments show that the matrix-based
language representations learned by our methods are capable of capturing rich
linguistic typology features.
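Since the abstract describes the mechanism concretely, a minimal PyTorch sketch of language-aware multi-head attention may help: each language is represented by a learned matrix rather than a vector, chunked into one slice per attention head. All names (LanguageAwareMultiHeadAttention, lang_matrix, and so on) are hypothetical, and injecting the language signal as a per-head key bias is an assumption, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LanguageAwareMultiHeadAttention(nn.Module):
    """Hypothetical sketch: a language is a matrix, not a vector.

    The language representation has one slice per attention head, so
    each head can learn language information in its own subspace."""

    def __init__(self, d_model: int, num_heads: int, num_languages: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        # Matrix-valued language representation, chunked across heads.
        self.lang_matrix = nn.Embedding(num_languages, d_model)

    def forward(self, x: torch.Tensor, lang_id: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); lang_id: (batch,) target-language ids
        b, t, d = x.shape
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        # One language vector per head: (batch, heads, 1, head_dim).
        lang = self.lang_matrix(lang_id).view(b, self.num_heads, 1, self.head_dim)
        k = k + lang  # assumed injection point: bias the keys per head
        attn = F.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, d)
        return self.out_proj(out)
```

The companion method, language embedding embodiment, would instead inject the (vector) language embedding at several switching points along the source-to-target information flow; it is not sketched here.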
Related papers
- Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages [55.157295899188476]
Neural machine translation systems are hypothesized to map sentences of different languages into a common representation space.
In this work, we test this hypothesis by zero-shot translating from unseen languages.
We demonstrate that this setup enables zero-shot translation from entirely unseen languages.
arXiv Detail & Related papers (2024-08-05T07:58:58Z)
- Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features [18.76505158652759]
We propose to exploit both semantic and linguistic features across multiple languages to enhance multilingual translation.
On the encoder side, we introduce a disentangling learning task that aligns encoder representations by separating semantic features from linguistic ones.
On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features and assist target-language generation.
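As a rough illustration of what such an encoder-side disentangling objective could look like (a hedged sketch; the module name, projections, and losses below are generic stand-ins, not the paper's actual task):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangleHeads(nn.Module):
    """Hypothetical sketch: split pooled encoder states into a semantic
    part, pulled together for parallel sentence pairs, and a linguistic
    part, trained to predict the source language."""

    def __init__(self, d_model: int, num_languages: int):
        super().__init__()
        self.sem = nn.Linear(d_model, d_model // 2)  # semantic projection
        self.lin = nn.Linear(d_model, d_model // 2)  # linguistic projection
        self.lang_clf = nn.Linear(d_model // 2, num_languages)

    def forward(self, h_src, h_tgt, src_lang):
        # h_src, h_tgt: (batch, d_model) pooled encoder states of a parallel pair
        # src_lang: (batch,) source-language ids
        sem_loss = 1 - F.cosine_similarity(self.sem(h_src), self.sem(h_tgt)).mean()
        lin_loss = F.cross_entropy(self.lang_clf(self.lin(h_src)), src_lang)
        return sem_loss + lin_loss
```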
arXiv Detail & Related papers (2024-08-02T17:10:12Z)
- Towards a Deep Understanding of Multilingual End-to-End Speech Translation [52.26739715012842]
We analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.
We derive three major findings from our analysis.
arXiv Detail & Related papers (2023-10-31T13:50:55Z)
- Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios [4.631167282648452]
We tackle the task of automatically discriminating between human and machine translations.
We perform experiments in a multilingual setting, considering multiple languages and multilingual pretrained language models.
arXiv Detail & Related papers (2023-05-31T11:41:24Z)
- The Reality of Multi-Lingual Machine Translation [3.183845608678763]
"The Reality of Multi-Lingual Machine Translation" discusses the benefits and perils of using more than two languages in machine translation systems.
Author: Machine translation is for us a prime example of deep learning applications.
arXiv Detail & Related papers (2022-02-25T16:44:06Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages; this bias is inferred from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
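One generic way to quantify such bias (an illustration only, not necessarily one of the paper's proposed measures) is to compare how strongly target words associate with male- versus female-anchored centroids in an embedding space:

```python
import numpy as np

def direct_bias(emb, male_words, female_words, targets):
    """Toy bias score: average difference in cosine similarity between
    target words (e.g., occupations) and the centroids of male- vs.
    female-associated words. `emb` maps word -> vector; run it per
    language or per embedding model to compare bias across them."""
    def centroid(words):
        return np.mean([emb[w] for w in words], axis=0)
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    m, f = centroid(male_words), centroid(female_words)
    return float(np.mean([cos(emb[t], m) - cos(emb[t], f) for t in targets]))
```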
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
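Singular vector canonical correlation analysis (SVCCA) first reduces each view with an SVD and then measures the canonical correlations between the reduced views. A minimal NumPy sketch, assuming one row per language in each view (e.g., X holding learned language embeddings and Y typological feature vectors; both names are illustrative):

```python
import numpy as np

def svcca(X, Y, keep=0.99):
    """Toy SVCCA: mean canonical correlation between two views.
    X: (n_languages, d1), Y: (n_languages, d2), one row per language."""
    def svd_reduce(M, keep):
        M = M - M.mean(axis=0)                      # center each view
        U, s, _ = np.linalg.svd(M, full_matrices=False)
        k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), keep)) + 1
        return U[:, :k] * s[:k]                     # top singular directions
    Xr, Yr = svd_reduce(X, keep), svd_reduce(Y, keep)
    # CCA via orthonormal bases: the canonical correlations are the
    # singular values of Qx^T Qy.
    Qx, _ = np.linalg.qr(Xr)
    Qy, _ = np.linalg.qr(Yr)
    return float(np.linalg.svd(Qx.T @ Qy, compute_uv=False).mean())
```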
arXiv Detail & Related papers (2020-04-30T16:25:39Z)