Investigating Massive Multilingual Pre-Trained Machine Translation
Models for Clinical Domain via Transfer Learning
- URL: http://arxiv.org/abs/2210.06068v2
- Date: Sun, 4 Jun 2023 20:42:19 GMT
- Title: Investigating Massive Multilingual Pre-Trained Machine Translation
Models for Clinical Domain via Transfer Learning
- Authors: Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran
Nenadic
- Abstract summary: This work investigates whether MMPLMs can be applied to clinical domain machine translation (MT) for entirely unseen languages via transfer learning.
Massively multilingual pre-trained language models (MMPLMs) developed in recent years have demonstrated strong capabilities and valuable pre-knowledge for downstream tasks.
- Score: 11.571189144910521
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Massively multilingual pre-trained language models (MMPLMs) developed in recent years have demonstrated strong capabilities and valuable pre-knowledge for downstream tasks. This work investigates whether MMPLMs can be applied to clinical domain machine translation (MT) for entirely unseen languages via transfer learning. We carry out an experimental investigation using Meta-AI's MMPLMs "wmt21-dense-24-wide-en-X and X-en (WMT21fb)", which were pre-trained on 7 language pairs and 14 translation directions: English to Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese, and the opposite directions. We fine-tune these MMPLMs on the English-Spanish language pair, which did not exist at all, either implicitly or explicitly, in their original pre-training corpora. We prepare carefully aligned clinical-domain data for this fine-tuning, which differs from their original mixed-domain knowledge. Our experimental results show that fine-tuning on just 250k well-aligned in-domain EN-ES segments is highly successful across three translation sub-tasks: clinical cases, clinical terms, and ontology concepts. The fine-tuned model achieves evaluation scores very close to those of NLLB, another MMPLM from Meta-AI, which included Spanish as a high-resource language in its pre-training. To the best of our knowledge, this is the first work to successfully apply MMPLMs to clinical domain transfer-learning NMT for languages entirely unseen during pre-training.
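A minimal sketch of what such a transfer-learning run could look like with the Hugging Face transformers library is given below. It is not the authors' recipe: the checkpoint name facebook/wmt21-dense-24-wide-en-x refers to the public Hugging Face release of the WMT21fb en-to-X model (assumed here to load as an M2M100-style model and tokenizer), the corpus file clinical_en_es.csv and its column names are placeholders for the 250k aligned clinical EN-ES segments, and all hyperparameters are illustrative. Because Spanish is absent from the WMT21fb language inventory, the sketch reuses an existing target-language code as a stand-in token; the abstract does not say how the paper itself handles the unseen target language.

```python
# Minimal transfer-learning sketch for WMT21fb -> clinical EN-ES (assumptions noted inline).
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Assumed checkpoint: the public Hugging Face release of the WMT21 dense en->X model.
checkpoint = "facebook/wmt21-dense-24-wide-en-x"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Placeholder corpus: ~250k well-aligned clinical EN-ES segments (the figure from the
# abstract), assumed here to be a CSV file with "en" and "es" columns.
train_data = load_dataset("csv", data_files={"train": "clinical_en_es.csv"})["train"]

# Spanish is not among the WMT21fb target languages, so its language code is not in the
# tokenizer. As a stand-in ONLY (not the paper's documented choice), reuse an existing
# target code so that target-side tokenization runs end to end.
tokenizer.src_lang = "en"
tokenizer.tgt_lang = "de"  # stand-in target token for the new language space

def preprocess(batch):
    # text_target tokenizes the Spanish side in target mode (using the stand-in code).
    return tokenizer(batch["en"], text_target=batch["es"], truncation=True, max_length=256)

tokenized = train_data.map(preprocess, batched=True, remove_columns=train_data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="wmt21fb-clinical-en-es",
    per_device_train_batch_size=4,  # the multi-billion-parameter model needs substantial GPU memory
    learning_rate=5e-5,             # illustrative; the listing gives no hyperparameters
    num_train_epochs=1,
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

At inference time the same stand-in token would be forced as the first decoder token, e.g. model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("de")), so whatever mapping is chosen during fine-tuning has to be kept consistent when translating the clinical test sets.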
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning [6.822926897514793]
Experimental results are reported on three subtasks: 1) clinical case (CC), 2) clinical terminology (CT), and 3) ontological concept (OC).
Our models achieved top-level performances in the ClinSpEn-2022 shared task on English-Spanish clinical domain data.
The transfer learning method works well in our experimental setting, using the WMT21fb model to accommodate a new language space, Spanish.
arXiv Detail & Related papers (2023-12-12T13:26:42Z)
- KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained Language Model [37.69464822182714]
Most biomedical pretrained language models are monolingual and cannot handle the growing cross-lingual requirements.
We propose a model called KBioXLM, which transforms the multilingual pretrained model XLM-R into the biomedical domain using a knowledge-anchored approach.
arXiv Detail & Related papers (2023-11-20T07:02:35Z)
- LERT: A Linguistically-motivated Pre-trained Language Model [67.65651497173998]
We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original pre-training task.
We carried out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT could bring significant improvements.
arXiv Detail & Related papers (2022-11-10T05:09:16Z)
- Can Domains Be Transferred Across Languages in Multi-Domain Multilingual Neural Machine Translation? [52.27798071809941]
This paper investigates whether the domain information can be transferred across languages on the composition of multi-domain and multilingual NMT.
We find that multi-domain multilingual (MDML) NMT can boost zero-shot translation performance by up to +10 BLEU points.
arXiv Detail & Related papers (2022-10-20T23:13:54Z)
- Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that poor translation quality arises when these representations are not sufficiently aligned.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z)
- Complete Multilingual Neural Machine Translation [44.98358050355681]
We study the use of multi-way aligned examples to enrich the original English-centric parallel corpora.
We call MNMT with such a connectivity pattern complete Multilingual Neural Machine Translation (cMNMT).
In combination with a novel training data sampling strategy that is conditioned on the target language only, cMNMT yields competitive translation quality for all language pairs.
arXiv Detail & Related papers (2020-10-20T13:03:48Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource scenarios, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT [129.99918589405675]
We present an effective approach that reuses an LM that is pretrained only on the high-resource language.
The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model.
Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) in English-Macedonian (En-Mk) and English-Albanian (En-Sq).
arXiv Detail & Related papers (2020-09-16T11:37:10Z)
- El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks [0.12183405753834563]
Pre-training large-scale language models (LMs) requires huge amounts of text corpora.
We study the caveats of applying directly translated corpora for fine-tuning LMs for downstream natural language processing tasks.
We show that careful curation along with post-processing leads to improved performance and overall LM robustness.
arXiv Detail & Related papers (2020-07-03T22:22:44Z)