Revamping Multilingual Agreement Bidirectionally via Switched
Back-translation for Multilingual Neural Machine Translation
- URL: http://arxiv.org/abs/2209.13940v3
- Date: Mon, 15 May 2023 09:26:42 GMT
- Title: Revamping Multilingual Agreement Bidirectionally via Switched
Back-translation for Multilingual Neural Machine Translation
- Authors: Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Wai
Lam
- Abstract summary: Multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT).
We present Bidirectional Multilingual Agreement via Switched Back-translation (BMA-SBT).
It is a novel and universal multilingual agreement framework for fine-tuning pre-trained MNMT models.
- Score: 107.83158521848372
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the fact that multilingual agreement (MA) has shown its importance
for multilingual neural machine translation (MNMT), current methodologies in
the field have two shortcomings: (i) they require parallel data between multiple
language pairs, which is not always realistic, and (ii) they optimize the agreement
in an ambiguous direction, which hampers the translation performance. We
present \textbf{B}idirectional \textbf{M}ultilingual \textbf{A}greement via
\textbf{S}witched \textbf{B}ack-\textbf{t}ranslation (\textbf{BMA-SBT}), a
novel and universal multilingual agreement framework for fine-tuning
pre-trained MNMT models, which (i) removes the need for the aforementioned parallel
data by using a novel method called switched BT that creates synthetic text
written in another source language using the translation target and (ii)
optimizes the agreement bidirectionally with the Kullback-Leibler Divergence
loss. Experiments indicate that BMA-SBT clearly improves the strong baselines
on the task of MNMT with three benchmarks: TED Talks, News, and Europarl.
In-depth analyses indicate that BMA-SBT brings additive improvements to the
conventional BT method.
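A minimal PyTorch sketch of the bidirectional agreement term described above: the same target sequence is decoded once from the original source and once from the switched-back-translated source, and the two predictive distributions are pulled together with a KL divergence applied in both directions. The function name, the 0.5 symmetric weighting, and the masking convention are assumptions for illustration, not details taken from the BMA-SBT paper, and the term would be added on top of the usual cross-entropy losses.

```python
import torch
import torch.nn.functional as F

def bidirectional_agreement_loss(logits_orig, logits_switched, pad_mask):
    """Symmetric (bidirectional) token-level KL between two decoder
    distributions over the same target: one conditioned on the original
    source, one on the switched-back-translated source (assumed form)."""
    log_p = F.log_softmax(logits_orig, dim=-1)      # [batch, tgt_len, vocab]
    log_q = F.log_softmax(logits_switched, dim=-1)  # [batch, tgt_len, vocab]
    p, q = log_p.exp(), log_q.exp()

    kl_pq = (p * (log_p - log_q)).sum(-1)  # KL(p || q) per target position
    kl_qp = (q * (log_q - log_p)).sum(-1)  # KL(q || p) per target position

    agreement = 0.5 * (kl_pq + kl_qp)
    # Average over non-padding target positions only
    return (agreement * pad_mask).sum() / pad_mask.sum()

if __name__ == "__main__":
    batch, tgt_len, vocab = 2, 5, 32000
    logits_a = torch.randn(batch, tgt_len, vocab)   # decoder run on the original source
    logits_b = torch.randn(batch, tgt_len, vocab)   # decoder run on the switched-BT source
    mask = torch.ones(batch, tgt_len)
    print(bidirectional_agreement_loss(logits_a, logits_b, mask).item())
```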
Related papers
- LLaVA-NeuMT: Selective Layer-Neuron Modulation for Efficient Multilingual Multimodal Translation [12.51212639515934]
LLaVA-NeuMT is a novel framework that explicitly models language-specific and language-agnostic representations to mitigate multilingual interference. Our approach consists of a layer selection mechanism that identifies the most informative layers for different language pairs. We conduct extensive experiments on the M3-Multi30K and M3-AmbigCaps datasets, demonstrating that LLaVA-NeuMT, while fine-tuning only 40% of the model parameters, surpasses full fine-tuning approaches.
arXiv Detail & Related papers (2025-07-25T04:23:24Z) - Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda [0.0]
We explore the application of Back translation as a semi-supervised technique to enhance Neural Machine Translation models for the English-Luganda language pair. Our methodology involves developing custom NMT models using both publicly available and web-crawled data, and applying Iterative and Incremental Back translation techniques. The results of our study show significant improvements, with translation performance for the English-Luganda pair exceeding previous benchmarks by more than 10 BLEU score units across all translation directions.
arXiv Detail & Related papers (2025-05-05T08:47:52Z) - Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data [64.4458540273004]
We propose a self-play framework that leverages only monolingual data and the intrinsic multilingual knowledge of Large Language Models (LLMs). Experiments demonstrate that this approach not only matches the performance of models trained on large-scale parallel data but also excels in non-English translation directions.
arXiv Detail & Related papers (2025-04-20T16:20:30Z) - LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation [43.26446958873554]
Recent advancements in large language models (LLMs) have shown promising results in multilingual translation even with limited bilingual supervision.
LANDeRMT is a framework that selectively finetunes LLMs for Machine Translation with diverse translation training data.
arXiv Detail & Related papers (2024-09-29T02:39:42Z) - Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems. Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives. To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z) - A Novel Paradigm Boosting Translation Capabilities of Large Language Models [11.537249547487045]
The paper proposes a novel paradigm consisting of three stages: Secondary Pre-training using Extensive Monolingual Data, Continual Pre-training with Interlinear Text Format Documents, and Leveraging Source-Language Consistent Instruction for Supervised Fine-Tuning.
Experimental results conducted using the Llama2 model, particularly on Chinese-Llama2, demonstrate the improved translation capabilities of LLMs.
arXiv Detail & Related papers (2024-03-18T02:53:49Z) - ACT-MNMT Auto-Constriction Turning for Multilingual Neural Machine
Translation [38.30649186517611]
This work introduces an Auto-Constriction Turning mechanism for Multilingual Neural Machine Translation (ACT-MNMT).
arXiv Detail & Related papers (2024-03-11T14:10:57Z) - VECO 2.0: Cross-lingual Language Model Pre-training with
Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO 2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of parallel pairs and minimize that of non-parallel pairs (a generic sketch of this kind of contrastive objective appears after this list).
Token-to-token alignment is integrated to bridge the gap between synonymous tokens, excavated via a thesaurus dictionary, and the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z) - Beyond Triplet: Leveraging the Most Data for Multimodal Machine
Translation [53.342921374639346]
Multimodal machine translation aims to improve translation quality by incorporating information from other modalities, such as vision.
Previous MMT systems mainly focus on better access and use of visual information and tend to validate their methods on image-related datasets.
This paper establishes new methods and new datasets for MMT.
arXiv Detail & Related papers (2022-12-20T15:02:38Z) - Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z) - Contrastive Learning for Many-to-many Multilingual Neural Machine
Translation [16.59039088482523]
Existing multilingual machine translation approaches mainly focus on English-centric directions.
We aim to build a many-to-many translation system with an emphasis on the quality of non-English language directions.
arXiv Detail & Related papers (2021-05-20T03:59:45Z) - Unsupervised Bitext Mining and Translation via Self-trained Contextual
Embeddings [51.47607125262885]
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
arXiv Detail & Related papers (2020-10-15T14:04:03Z) - Multilingual Denoising Pre-training for Neural Machine Translation [132.66750663226287]
mBART is a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora.
mBART is one of the first methods for pre-training a complete sequence-to-sequence model.
arXiv Detail & Related papers (2020-01-22T18:59:17Z)
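As noted in the VECO 2.0 entry above, its sequence-to-sequence alignment pulls parallel sentence pairs together and pushes non-parallel pairs apart. The sketch below is a generic in-batch contrastive (InfoNCE-style) objective of that kind, assuming pooled sentence embeddings as input; the temperature value, the pooling, and the function name are illustrative assumptions, and VECO 2.0's additional token-to-token thesaurus term is not shown.

```python
import torch
import torch.nn.functional as F

def sequence_contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    """In-batch contrastive alignment: the i-th source sentence embedding
    should score highest against the i-th target embedding and lower
    against every other target in the batch (and vice versa)."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    sim = src @ tgt.t() / temperature              # [batch, batch] scaled cosine similarities
    labels = torch.arange(src.size(0), device=src.device)
    # Symmetric: source-to-target and target-to-source retrieval directions
    return 0.5 * (F.cross_entropy(sim, labels) + F.cross_entropy(sim.t(), labels))

if __name__ == "__main__":
    src = torch.randn(4, 768)   # pooled embeddings of 4 source sentences
    tgt = torch.randn(4, 768)   # pooled embeddings of their 4 translations
    print(sequence_contrastive_loss(src, tgt).item())
```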