adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource
Languages with Integrated LLM Playgrounds
- URL: http://arxiv.org/abs/2403.02370v1
- Date: Mon, 4 Mar 2024 14:49:18 GMT
- Title: adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource
Languages with Integrated LLM Playgrounds
- Authors: Séamus Lankford, Haithem Afli and Andy Way
- Abstract summary: adaptMLLM is an open-source tool for fine-tuning Multilingual Language Models (MLLMs) for Machine Translation (MT).
It offers a range of metrics for model evaluation and the capability to deploy models as a translation service directly within the application.
The adaptMLLM system demonstrated significant improvements compared with baselines from the LoResMT 2021 Shared Task.
- Score: 2.648836772989769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of Multilingual Language Models (MLLMs) and Large Language Models
has spawned innovation in many areas of natural language processing. Despite
the exciting potential of this technology, its impact on developing
high-quality Machine Translation (MT) outputs for low-resource languages
remains relatively under-explored. Furthermore, an open-source application,
dedicated to both fine-tuning MLLMs and managing the complete MT workflow for
low-resource languages, remains unavailable. We aim to address these
imbalances through the development of adaptMLLM, which streamlines all
processes involved in the fine-tuning of MLLMs for MT. This open-source
application is tailored for developers, translators, and users who are engaged
in MT. An intuitive interface allows for easy customisation of hyperparameters,
and the application offers a range of metrics for model evaluation and the
capability to deploy models as a translation service directly within the
application. We used adaptMLLM, a multilingual tool, to fine-tune models for
two low-resource language pairs: English-Irish (EN$\leftrightarrow$GA) and
English-Marathi (EN$\leftrightarrow$MR). Compared with baselines from the
LoResMT 2021 Shared Task, the adaptMLLM system demonstrated significant
improvements. In the EN$\rightarrow$GA direction, an improvement of 5.2 BLEU
points was observed and an increase of 40.5 BLEU points was recorded in the
GA$\rightarrow$EN direction. Significant improvements in the translation
performance of the EN$\leftrightarrow$MR pair were also observed, notably in the
MR$\rightarrow$EN direction with an increase of 21.3 BLEU points. Finally, a
fine-grained human evaluation of the MLLM output on the EN$\rightarrow$GA pair
was conducted using the Multidimensional Quality Metrics and Scalar Quality
Metrics error taxonomies. The application and models are freely available.
Related papers
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages [2.53740603524637]
Machine translation (MT) models produce excellent multilingual representations, resulting in strong translation performance even for low-resource languages.
In this work, we get the best of both worlds by integrating MT encoders directly into language backbones via sample-efficient self-distillation.
The resulting MT-LLMs preserve the inherent multilingual representational alignment from the MT encoder, allowing lower-resource languages to tap into the rich knowledge embedded in English-centric LLMs.
arXiv Detail & Related papers (2024-06-18T16:00:20Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures [0.0]
The Transformer architecture stands out as the gold standard, especially for high-resource language pairs.
The scarcity of parallel datasets for low-resource languages can hinder machine translation development.
This thesis introduces adaptNMT and adaptMLLM, two open-source applications streamlined for the development, fine-tuning, and deployment of neural machine translation models.
arXiv Detail & Related papers (2024-03-03T18:08:30Z)
- YAYI 2: Multilingual Open-Source Large Language Models [53.92832054643197]
We propose YAYI 2, including both base and chat models, with 30 billion parameters.
YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline.
The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback.
arXiv Detail & Related papers (2023-12-22T17:34:47Z)
- MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks [12.665447518524187]
This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs by comparing them on the same set of multilingual datasets.
Our benchmark comprises 22 datasets covering 83 languages, including low-resource African languages.
We also perform a study on data contamination and find that several models are likely to be contaminated with multilingual evaluation benchmarks.
arXiv Detail & Related papers (2023-11-13T16:45:37Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
- Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
This paper systematically investigates the advantages and challenges of LLMs for MMT.
We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)