Exploiting Domain-Specific Parallel Data on Multilingual Language Models for Low-resource Language Translation
- URL: http://arxiv.org/abs/2412.19522v1
- Date: Fri, 27 Dec 2024 08:25:52 GMT
- Title: Exploiting Domain-Specific Parallel Data on Multilingual Language Models for Low-resource Language Translation
- Authors: Surangika Ranathunga, Shravan Nayak, Shih-Ting Cindy Huang, Yanke Mao, Tong Su, Yun-Hsiang Ray Chan, Songchen Yuan, Anthony Rinaldi, Annie En-Shiun Lee
- Abstract summary: We present an evaluation of the effectiveness of parallel data from auxiliary domains in building domain-specific NMT models.
We explore the impact of domain divergence on NMT model performance.
We recommend several strategies for utilizing auxiliary parallel data in building domain-specific NMT models.
- Score: 0.6467856992131628
- Abstract: Neural Machine Translation (NMT) systems built on multilingual sequence-to-sequence Language Models (msLMs) fail to deliver expected results when the amount of parallel data for a language, as well as the language's representation in the model, is limited. This restricts the capabilities of domain-specific NMT systems for low-resource languages (LRLs). As a solution, parallel data from auxiliary domains can be used either to fine-tune or to further pre-train the msLM. We present an evaluation of the effectiveness of these two techniques in the context of domain-specific LRL-NMT. We also explore the impact of domain divergence on NMT model performance. We recommend several strategies for utilizing auxiliary parallel data in building domain-specific NMT models for LRLs.
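The abstract refers to domain divergence without saying how it is measured. As a rough illustration only, one common proxy is the Jensen-Shannon divergence between the unigram token distributions of two corpora. The sketch below assumes that proxy; it is not confirmed to be the authors' metric, and the function names and toy corpora are hypothetical.

```python
import math
from collections import Counter


def unigram_distribution(sentences):
    """Return relative frequencies of lowercased whitespace tokens."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: c / total for tok, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two token distributions."""
    jsd = 0.0
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, 0.0), q.get(tok, 0.0)
        mi = 0.5 * (pi + qi)  # mixture probability for this token
        if pi > 0:
            jsd += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            jsd += 0.5 * qi * math.log2(qi / mi)
    return jsd


# Hypothetical toy corpora standing in for a target and an auxiliary domain.
target_domain = [
    "the patient received the vaccine",
    "the doctor examined the patient",
]
auxiliary_domain = [
    "the court dismissed the appeal",
    "the judge examined the evidence",
]

divergence = js_divergence(
    unigram_distribution(target_domain),
    unigram_distribution(auxiliary_domain),
)
print(f"JS divergence: {divergence:.3f}")
```

With base-2 logarithms the value lies between 0 (identical token distributions) and 1 (no shared vocabulary), so a larger value suggests an auxiliary domain that is lexically further from the target domain.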
Related papers
- Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning [55.107329995417786]
Large language models (LLMs) have demonstrated impressive general understanding and generation abilities.
We establish a benchmark for multi-domain translation, featuring 25 German$\Leftrightarrow$English and 22 Chinese$\Leftrightarrow$English test sets.
We propose a domain Chain of Thought (CoT) fine-tuning technique that utilizes the intrinsic multi-domain intelligence of LLMs to improve translation performance.
arXiv Detail & Related papers (2024-10-03T16:15:04Z)
- Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation [62.202893186343935]
We explore what it would take to adapt Large Language Models for low-resource languages.
We show that parallel data is critical during both pre-training and Supervised Fine-Tuning (SFT).
Our experiments with three LLMs across two low-resourced language groups reveal consistent trends, underscoring the generalizability of our findings.
arXiv Detail & Related papers (2024-08-23T00:59:38Z)
- Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation [6.583246002638354]
Intermediate-task fine-tuning (ITFT) of pre-trained multilingual sequence-to-sequence (PMSS) models is extremely beneficial for domain-specific NMT.
We quantify the domain-specific results variations using a domain-divergence test, and show that ITFT can mitigate the impact of domain divergence to some extent.
arXiv Detail & Related papers (2023-06-02T09:05:18Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., the Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
Our UMLNMT results in substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Exploiting Multilingualism in Low-resource Neural Machine Translation via Adversarial Learning [3.2258463207097017]
Generative Adversarial Networks (GAN) offer a promising approach for Neural Machine Translation (NMT).
In GAN, similar to bilingual models, multilingual NMT only considers one reference translation for each sentence during model training.
This article proposes a Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach to perform sentence interpolation.
arXiv Detail & Related papers (2023-03-31T12:34:14Z)
- Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques [3.257358540764261]
We present a novel approach that uses a scaled sentence similarity score, especially for related languages, based on a 5-gram KenLM language model.
Our approach yields gains of 2 BLEU points with the multi-domain approach, 3 BLEU points with fine-tuning for NMT, and 2 BLEU points with iterative back-translation.
arXiv Detail & Related papers (2023-03-03T09:07:30Z)
- Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models [106.65127123304842]
Branch-Train-Merge (BTM) is an efficient algorithm for parallel training of large language models (LLMs).
BTM learns a set of independent expert LMs (ELMs), each specialized to a different textual domain.
Experiments show that BTM improves in- and out-of-domain perplexities as compared to GPT-style Transformer LMs.
arXiv Detail & Related papers (2022-08-05T17:46:38Z)
- DaLC: Domain Adaptation Learning Curve Prediction for Neural Machine Translation [10.03007605098947]
Domain Adaptation (DA) of a Neural Machine Translation (NMT) model often relies on a pre-trained general NMT model that is adapted to the new domain on a sample of in-domain parallel data.
We propose a Domain Learning Curve prediction (DaLC) model that predicts prospective DA performance based on in-domain monolingual samples in the source language.
arXiv Detail & Related papers (2022-04-20T06:57:48Z)
- Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation [104.10726545151043]
Multilingual data has been found more beneficial for NMT models that translate from an LRL to a target language than for those that translate into LRLs.
Our experiments show that DecSDE leads to consistent gains of up to 1.8 BLEU on translation from English to four different languages.
arXiv Detail & Related papers (2020-10-04T19:42:40Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)