SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish
- URL: http://arxiv.org/abs/2510.18725v1
- Date: Tue, 21 Oct 2025 15:24:15 GMT
- Title: SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish
- Authors: Josh McGiff, Nikola S. Nikolov
- Abstract summary: Fine-tuning is widely used to tailor large language models for specific tasks such as neural machine translation (NMT). Parameter-efficient fine-tuning (PEFT) bridges this gap by training on a fraction of the original model parameters. We introduce SemiAdapt and SemiLoRA as semi-supervised, inference-efficient approaches that strengthen domain adaptation. We demonstrate that SemiAdapt can outperform full-domain fine-tuning, while most notably, SemiLoRA can propel PEFT methods to match or even outperform full-model fine-tuning.
- Score: 0.3437656066916039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning is widely used to tailor large language models for specific tasks such as neural machine translation (NMT). However, leveraging transfer learning is computationally expensive when fine-tuning large multilingual models with billions of parameters, thus creating a barrier to entry for researchers working on low-resource domains such as Irish translation. Parameter-efficient fine-tuning (PEFT) bridges this gap by training on a fraction of the original model parameters, with the Low-Rank Adaptation (LoRA) approach introducing small, trainable adapter layers. We introduce SemiAdapt and SemiLoRA as semi-supervised inference-efficient approaches that strengthen domain adaptation and lead to improved overall performance in NMT. We demonstrate that SemiAdapt can outperform full-domain fine-tuning, while most notably, SemiLoRA can propel PEFT methods to match or even outperform full-model fine-tuning. We further evaluate domain-by-dataset fine-tuning and demonstrate that our embedding-based inference methods perform especially well on larger and noisier corpora. All Irish translation models developed in this work are released as open resources. These methods aim to make high-quality domain adaptation and fine-tuning more accessible to researchers working with low-resource languages.
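The sketch below illustrates the two ingredients the abstract combines: embedding-based domain inference for unlabelled sentences, and LoRA adapter training on top of a pretrained multilingual NMT model. It is a minimal, hypothetical recipe; the base model, embedding model, domain labels, and hyperparameters are illustrative assumptions, not the released SemiAdapt/SemiLoRA configuration.

```python
# Minimal sketch (illustrative assumptions throughout): assign unlabelled
# sentences to domains by embedding similarity, then fine-tune small LoRA
# adapters instead of the full multilingual model.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Domain centroids from a small labelled seed set (toy examples here).
seed = {
    "legislation": ["The Act shall come into operation on such day as the Minister appoints."],
    "news": ["The team won the county final on Sunday afternoon."],
}
centroids = {d: embedder.encode(s).mean(axis=0) for d, s in seed.items()}

def infer_domain(sentence: str) -> str:
    """Semi-supervised domain labelling: nearest centroid by cosine similarity."""
    v = embedder.encode([sentence])[0]
    cos = lambda a, b: float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(centroids, key=lambda d: cos(v, centroids[d]))

# Parameter-efficient fine-tuning: only the LoRA adapter weights are trainable.
base = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
lora_cfg = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only a small fraction of parameters train
```

In a per-domain setup, one such adapter would be trained for each inferred domain and selected at inference time by the same embedding rule.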
Related papers
- Consensus-Aligned Neuron Efficient Fine-Tuning Large Language Models for Multi-Domain Machine Translation [26.913797159189787]
Multi-domain machine translation (MDMT) aims to build a unified model capable of translating content across diverse domains. Domain adaptation remains a challenge for large language models (LLMs). We propose a neuron-efficient fine-tuning framework for MDMT that identifies and updates consensus-aligned neurons within LLMs.
arXiv Detail & Related papers (2026-02-05T14:20:59Z) - Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation [1.9639956888747314]
This paper contributes to artificial intelligence by proposing two approaches for adapting large language models (msLLMs). As an application in engineering, these methods are implemented in NMT systems for Sinhala, Tamil, and English (six language pairs) in domain-specific, extremely low-resource settings. Our experiments reveal that these approaches enhance translation performance by an average of +1.47 bilingual evaluation understudy (BLEU) score compared to the standard single-stage fine-tuning baseline.
arXiv Detail & Related papers (2025-03-28T16:30:28Z) - Domain-Specific Translation with Open-Source Large Language Models: Resource-Oriented Analysis [0.0]
We compare the domain-specific translation performance of open-source autoregressive decoder-only large language models (LLMs) with task-oriented machine translation (MT) models. Our experiments focus on the medical domain and cover four language directions with varied resource availability.
arXiv Detail & Related papers (2024-12-08T08:54:13Z) - How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario [72.02391485962127]
Speech Self-Supervised Learning (SSL) models achieve impressive performance on Automatic Speech Recognition (ASR). In low-resource language ASR, they encounter the domain mismatch problem between pre-trained and low-resource languages. We extend a conventional efficient fine-tuning scheme based on the adapter to handle these issues.
arXiv Detail & Related papers (2024-11-27T10:51:00Z) - Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning [55.107329995417786]
Large language models (LLMs) have demonstrated impressive general understanding and generation abilities.
We establish a benchmark for multi-domain translation, featuring 25 German⇔English and 22 Chinese⇔English test sets.
We propose a domain Chain of Thought (CoT) fine-tuning technique that utilizes the intrinsic multi-domain intelligence of LLMs to improve translation performance.
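As a rough illustration of what domain CoT fine-tuning data could look like, each training example can ask the model to name the domain before translating. The prompt template below is an assumption for illustration, not the cited paper's format.

```python
# Hypothetical prompt/completion pair for domain chain-of-thought fine-tuning;
# the wording of the template is an assumption, not the cited paper's format.
def build_domain_cot_example(src: str, domain: str, tgt: str) -> dict:
    prompt = (
        "Translate the following German sentence into English.\n"
        f"Source: {src}\n"
        "First state the domain of the sentence, then give the translation."
    )
    completion = f"Domain: {domain}\nTranslation: {tgt}"
    return {"prompt": prompt, "completion": completion}

example = build_domain_cot_example(
    src="Der Patient klagt über starke Kopfschmerzen.",
    domain="medical",
    tgt="The patient complains of severe headaches.",
)
```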
arXiv Detail & Related papers (2024-10-03T16:15:04Z) - Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi [0.4194295877935868]
We present a study of PEFT methods for the Indic low-resource language Marathi.
These approaches are evaluated on prominent text classification datasets like MahaSent, MahaHate, and MahaNews.
We show that these methods are competitive with full fine-tuning and can be used without loss in accuracy.
arXiv Detail & Related papers (2024-08-06T13:16:16Z) - Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation [3.558269947618352]
We evaluate the performance of 8 PEFT methods with in total of 15 architectures using the SacreBLEU score.
We showed that 6 PEFT architectures outperform the baseline for both in-domain and out-domain tests.
The Houlsby+Inversion adapter has the best performance overall, proving the effectiveness of PEFT methods.
arXiv Detail & Related papers (2024-04-05T16:42:28Z) - Non-Parametric Unsupervised Domain Adaptation for Neural Machine
Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly incorporating the pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
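The retrieval idea can be sketched in a few lines: the datastore maps decoder hidden states (keys) to target tokens (values), and the retrieved neighbours are interpolated with the NMT model's own next-token distribution. This is a generic toy illustration with in-memory arrays, not the cited framework itself; real systems build the datastore from millions of decoder states and search it with an approximate index such as FAISS.

```python
# Toy sketch of token-level kNN-MT interpolation (illustrative only).
import numpy as np

def knn_distribution(query, keys, values, vocab_size, k=4, temperature=10.0):
    """Turn the k nearest datastore entries into a next-token distribution."""
    dists = np.linalg.norm(keys - query, axis=1)      # L2 distance to each stored state
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temperature)   # closer neighbours get more mass
    weights /= weights.sum()
    p = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p[values[idx]] += w                           # accumulate mass on retrieved tokens
    return p

def interpolate(p_model, p_knn, lam=0.5):
    """Final prediction mixes the base NMT model with the retrieval distribution."""
    return lam * p_knn + (1 - lam) * p_model
```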
arXiv Detail & Related papers (2021-09-14T11:50:01Z) - Improving Multilingual Translation by Representation and Gradient
Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z) - Pre-training Multilingual Neural Machine Translation by Leveraging
Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z) - A Simple Baseline to Semi-Supervised Domain Adaptation for Machine
Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
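A schematic of one such iteration is sketched below. The model interface (lm_loss, translate, translation_loss) and the way the three losses are combined are assumptions for illustration, not the cited paper's implementation.

```python
# Schematic training step combining the three objectives named above.
def train_step(model, mono_tgt_batch, parallel_batch, optimizer):
    # 1) Language modelling on target-side monolingual text.
    loss_lm = model.lm_loss(mono_tgt_batch)

    # 2) Back-translation: generate synthetic source sentences from the
    #    monolingual target text, then train on the synthetic pairs.
    synthetic_src = model.translate(mono_tgt_batch, direction="tgt->src")
    loss_bt = model.translation_loss(src=synthetic_src, tgt=mono_tgt_batch)

    # 3) Supervised translation on the available parallel data.
    loss_sup = model.translation_loss(src=parallel_batch["src"], tgt=parallel_batch["tgt"])

    loss = loss_lm + loss_bt + loss_sup
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```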
arXiv Detail & Related papers (2020-01-22T16:42:06Z)