adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource
Languages with Integrated LLM Playgrounds
- URL: http://arxiv.org/abs/2403.02370v1
- Date: Mon, 4 Mar 2024 14:49:18 GMT
- Title: adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource
Languages with Integrated LLM Playgrounds
- Authors: Séamus Lankford, Haithem Afli and Andy Way
- Abstract summary: adaptMLLM is an open-source tool for fine-tuning Multilingual Language Models (MLLMs) for Machine Translation (MT).
It offers a range of metrics for model evaluation and the capability to deploy models as a translation service directly within the application.
The adaptMLLM system demonstrated significant improvements compared with baselines from the LoResMT 2021 Shared Task.
- Score: 2.648836772989769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of Multilingual Language Models (MLLMs) and Large Language Models
has spawned innovation in many areas of natural language processing. Despite
the exciting potential of this technology, its impact on developing
high-quality Machine Translation (MT) outputs for low-resource languages
remains relatively under-explored. Furthermore, an open-source application,
dedicated to both fine-tuning MLLMs and managing the complete MT workflow for
low-resource languages, remains unavailable. We aim to address these
imbalances through the development of adaptMLLM, which streamlines all
processes involved in the fine-tuning of MLLMs for MT. This open-source
application is tailored for developers, translators, and users who are engaged
in MT. An intuitive interface allows for easy customisation of hyperparameters,
and the application offers a range of metrics for model evaluation and the
capability to deploy models as a translation service directly within the
application. As a multilingual tool, we used adaptMLLM to fine-tune models for
two low-resource language pairs: English to Irish (EN$\leftrightarrow$GA) and
English to Marathi (EN$\leftrightarrow$MR). Compared with baselines from the
LoResMT 2021 Shared Task, the adaptMLLM system demonstrated significant
improvements. In the EN$\rightarrow$GA direction, an improvement of 5.2 BLEU
points was observed and an increase of 40.5 BLEU points was recorded in the
GA$\rightarrow$EN direction. Significant improvements in the translation
performance of the EN$\leftrightarrow$MR pair were also observed, notably in the
MR$\rightarrow$EN direction with an increase of 21.3 BLEU points. Finally, a
fine-grained human evaluation of the MLLM output on the EN$\rightarrow$GA pair
was conducted using the Multidimensional Quality Metrics and Scalar Quality
Metrics error taxonomies. The application and models are freely available.
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary of a source text in a different target language.
Currently, instruction-tuned large language models (LLMs) excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z)
- LLaVA-KD: A Framework of Distilling Multimodal Large Language Models [70.19607283302712]
We propose a novel framework to transfer knowledge from l-MLLM to s-MLLM.
Specifically, we introduce Multimodal Distillation (MDist) to minimize the divergence between the visual-textual output distributions of l-MLLM and s-MLLM.
We also propose a three-stage training scheme to fully exploit the potential of s-MLLM.
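The summary above gives no implementation details; as a generic sketch of response-level distillation in the spirit of MDist (minimizing the divergence between teacher and student output distributions), the snippet below computes a temperature-scaled KL loss in PyTorch. The tensor shapes and temperature are illustrative assumptions, not values from the paper.

```python
# Generic response-level distillation loss: temperature-scaled KL divergence
# between teacher and student output distributions. Shapes and the temperature
# are illustrative and not taken from LLaVA-KD.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student), computed over the vocabulary dimension."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy usage: 4 token positions over a 32k-entry vocabulary.
student = torch.randn(4, 32000, requires_grad=True)
teacher = torch.randn(4, 32000)
loss = distillation_loss(student, teacher)
loss.backward()
print(float(loss))
```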
arXiv Detail & Related papers (2024-10-21T17:41:28Z)
- X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale [25.257770733168012]
Large language models (LLMs) have achieved remarkable success across various NLP tasks, yet their focus has predominantly been on English.
In this paper, we prioritize quality over scaling number of languages, with a focus on multilingual machine translation task.
X-ALMA is a model designed with a commitment to ensuring top-tier performance across 50 diverse languages, regardless of their resource levels.
arXiv Detail & Related papers (2024-10-04T03:17:27Z)
- Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation [62.202893186343935]
We explore what it would take to adapt Large Language Models for low-resource languages.
We show that parallel data is critical during both pre-training and Supervised Fine-Tuning (SFT).
Our experiments with three LLMs across two low-resourced language groups reveal consistent trends, underscoring the generalizability of our findings.
arXiv Detail & Related papers (2024-08-23T00:59:38Z)
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
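The merging method is not spelled out in this summary; a minimal sketch of the simplest variant (parameter-wise linear interpolation of two checkpoints that share an architecture) is given below. The repository names and the mixing weight are hypothetical.

```python
# Minimal sketch of model merging by parameter-wise linear interpolation of two
# checkpoints that share the same architecture (e.g. two Llama-2-7B variants).
# The repository names and the mixing weight alpha are hypothetical.
from transformers import AutoModelForCausalLM

def merge_state_dicts(model_a, model_b, alpha: float = 0.5):
    """Return alpha * params(A) + (1 - alpha) * params(B), key by key."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    return {name: alpha * sd_a[name] + (1.0 - alpha) * sd_b[name]
            for name in sd_a}

# Hypothetical inputs: a task-tuned model and a language-adapted model.
model_task = AutoModelForCausalLM.from_pretrained("org/llama2-7b-task-sft")
model_lang = AutoModelForCausalLM.from_pretrained("org/llama2-7b-lowres-adapted")

merged = AutoModelForCausalLM.from_pretrained("org/llama2-7b-task-sft")
merged.load_state_dict(merge_state_dicts(model_task, model_lang, alpha=0.5))
merged.save_pretrained("merged-lowres-model")
```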
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages [2.53740603524637]
Machine translation (MT) models produce excellent multilingual representations, resulting in strong translation performance even for low-resource languages.
In this work, we get the best of both worlds by integrating MT encoders directly into language backbones via sample-efficient self-distillation.
The resulting MT-LLMs preserve the inherent multilingual representational alignment from the MT encoder, allowing lower-resource languages to tap into the rich knowledge embedded in English-centric LLMs.
arXiv Detail & Related papers (2024-06-18T16:00:20Z)
- Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures [0.0]
The Transformer architecture stands out as the gold standard, especially for high-resource language pairs.
The scarcity of parallel datasets for low-resource languages can hinder machine translation development.
This thesis introduces adaptNMT and adaptMLLM, two open-source applications streamlined for the development, fine-tuning, and deployment of neural machine translation models.
arXiv Detail & Related papers (2024-03-03T18:08:30Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to perform tasks effectively after observing only a few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
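As a rough illustration of this prompting idea, the snippet below assembles a few-shot prompt from translation exemplars in several high-resource languages and appends a low-resource sentence to be translated into English; the exemplars and template are invented for illustration and are not the paper's actual prompts.

```python
# Illustrative construction of a linguistically-diverse few-shot prompt:
# translation exemplars from several high-resource languages, followed by the
# low-resource sentence to translate into English. Exemplars and the template
# are invented for illustration only.
exemplars = [
    ("French",  "Bonjour, comment allez-vous ?", "Hello, how are you?"),
    ("Spanish", "El libro está sobre la mesa.", "The book is on the table."),
    ("German",  "Ich lerne jeden Tag etwas Neues.", "I learn something new every day."),
]

def build_prompt(source_sentence: str) -> str:
    lines = []
    for language, source, target in exemplars:
        lines.append(f"{language}: {source}\nEnglish: {target}\n")
    # The final source language is deliberately left unnamed: the model only
    # has to map "some language" into English, mirroring the setting above.
    lines.append(f"Sentence: {source_sentence}\nEnglish:")
    return "\n".join(lines)

print(build_prompt("Tá an aimsir go hálainn inniu."))  # Irish sentence as an example
```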
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
- Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
This paper systematically investigates the advantages and challenges of LLMs for MMT.
We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)