Continuous Learning in Neural Machine Translation using Bilingual
Dictionaries
- URL: http://arxiv.org/abs/2102.06558v1
- Date: Fri, 12 Feb 2021 14:46:13 GMT
- Title: Continuous Learning in Neural Machine Translation using Bilingual
Dictionaries
- Authors: Jan Niehues
- Abstract summary: We propose an evaluation framework to assess the ability of neural machine translation to continuously learn new phrases.
By addressing both challenges we are able to improve the ability to translate new, rare words and phrases from 30% to up to 70%.
- Score: 14.058642647656301
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While recent advances in deep learning led to significant improvements in
machine translation, neural machine translation is often still not able to
continuously adapt to the environment. For humans, as well as for machine
translation, bilingual dictionaries are a promising knowledge source to
continuously integrate new knowledge. However, their exploitation poses several
challenges: The system needs to be able to perform one-shot learning as well as
model the morphology of source and target language.
In this work, we proposed an evaluation framework to assess the ability of
neural machine translation to continuously learn new phrases. We integrate
one-shot learning methods for neural machine translation with different word
representations and show that it is important to address both in order to
successfully make use of bilingual dictionaries. By addressing both challenges
we are able to improve the ability to translate new, rare words and phrases
from 30% to up to 70%. The correct lemma is even generated by more than 90%.
Related papers
- Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages [55.157295899188476]
neural machine translation systems learn to map sentences of different languages into a common representation space.
In this work, we test this hypothesis by zero-shot translating from unseen languages.
We demonstrate that this setup enables zero-shot translation from entirely unseen languages.
arXiv Detail & Related papers (2024-08-05T07:58:58Z) - Extending Multilingual Machine Translation through Imitation Learning [60.15671816513614]
Imit-MNMT treats the task as an imitation learning process, which mimicks the behavior of an expert.
We show that our approach significantly improves the translation performance between the new and the original languages.
We also demonstrate that our approach is capable of solving copy and off-target problems.
arXiv Detail & Related papers (2023-11-14T21:04:03Z) - The Best of Both Worlds: Combining Human and Machine Translations for
Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z) - Informative Language Representation Learning for Massively Multilingual
Neural Machine Translation [47.19129812325682]
In a multilingual neural machine translation model, an artificial language token is usually used to guide translation into the desired target language.
Recent studies show that prepending language tokens sometimes fails to navigate the multilingual neural machine translation models into right translation directions.
We propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations to channel translation into right directions.
arXiv Detail & Related papers (2022-09-04T04:27:17Z) - No Language Left Behind: Scaling Human-Centered Machine Translation [69.28110770760506]
We create datasets and models aimed at narrowing the performance gap between low and high-resource languages.
We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks.
Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art.
arXiv Detail & Related papers (2022-07-11T07:33:36Z) - The Reality of Multi-Lingual Machine Translation [3.183845608678763]
"The Reality of Multi-Lingual Machine Translation" discusses the benefits and perils of using more than two languages in machine translation systems.
Author: Machine translation is for us a prime example of deep learning applications.
arXiv Detail & Related papers (2022-02-25T16:44:06Z) - Extremely low-resource machine translation for closely related languages [0.0]
This work focuses on closely related languages from the Uralic language family: from Estonian and Finnish.
We find that multilingual learning and synthetic corpora increase the translation quality in every language pair.
We show that transfer learning and fine-tuning are very effective for doing low-resource machine translation and achieve the best results.
arXiv Detail & Related papers (2021-05-27T11:27:06Z) - Improving Cross-Lingual Reading Comprehension with Self-Training [62.73937175625953]
Current state-of-the-art models even surpass human performance on several benchmarks.
Previous works have revealed the abilities of pre-trained multilingual models for zero-shot cross-lingual reading comprehension.
This paper further utilized unlabeled data to improve the performance.
arXiv Detail & Related papers (2021-05-08T08:04:30Z) - CSTNet: Contrastive Speech Translation Network for Self-Supervised
Speech Representation Learning [11.552745999302905]
More than half of the 7,000 languages in the world are in imminent danger of going extinct.
It is relatively easy to obtain textual translations corresponding to speech.
We construct a convolutional neural network audio encoder capable of extracting linguistic representations from speech.
arXiv Detail & Related papers (2020-06-04T12:21:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.