Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation
- URL: http://arxiv.org/abs/2003.02877v3
- Date: Tue, 23 Jun 2020 17:21:56 GMT
- Title: Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation
- Authors: Mitchell A. Gordon, Kevin Duh
- Abstract summary: We explore best practices for training small, memory efficient machine translation models with sequence-level knowledge distillation.
Our large-scale empirical results in machine translation suggest distilling twice for best performance.
- Score: 12.949219829789874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore best practices for training small, memory efficient machine
translation models with sequence-level knowledge distillation in the domain
adaptation setting. While both domain adaptation and knowledge distillation are
widely-used, their interaction remains little understood. Our large-scale
empirical results in machine translation (on three language pairs with three
domains each) suggest distilling twice for best performance: once using
general-domain data and again using in-domain data with an adapted teacher.
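To make the recipe concrete, the sketch below outlines the distill-adapt-distill pipeline in Python. It is a minimal, hypothetical outline: the helpers `train`, `finetune`, and `beam_decode` stand in for whatever NMT toolkit is actually used, and only the ordering of the steps follows the abstract.

```python
# Minimal sketch of the "distill, adapt, distill" recipe for sequence-level
# knowledge distillation under domain adaptation. The helpers passed in
# (train, finetune, beam_decode) are hypothetical placeholders for an NMT
# toolkit; only the ordering of the steps follows the recipe above.

from typing import Callable, List, Tuple

Corpus = List[Tuple[str, str]]  # (source sentence, target sentence) pairs


def distill(decode: Callable[[List[str]], List[str]], sources: List[str]) -> Corpus:
    """Sequence-level KD: pair each source with the teacher's beam output."""
    return list(zip(sources, decode(sources)))


def distill_adapt_distill(train, finetune, beam_decode,
                          general: Corpus, in_domain: Corpus):
    # 1) Train a large general-domain teacher.
    teacher = train(general, size="large")

    # 2) First distillation: train a small student on general-domain data
    #    re-labeled by the teacher.
    general_kd = distill(lambda srcs: beam_decode(teacher, srcs),
                         [src for src, _ in general])
    student = train(general_kd, size="small")

    # 3) Adapt the teacher to the target domain by continued training.
    adapted_teacher = finetune(teacher, in_domain)

    # 4) Second distillation: fine-tune the student on in-domain data
    #    re-labeled by the adapted teacher.
    in_domain_kd = distill(lambda srcs: beam_decode(adapted_teacher, srcs),
                           [src for src, _ in in_domain])
    return finetune(student, in_domain_kd)
```

Here the second distillation continues training the student from the first pass; the paper compares several such configurations, so treat this as one plausible instantiation rather than the definitive recipe.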
Related papers
- Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that straightforward width scaling of the Transformer is a simpler and, in practice, surprisingly more efficient approach that reaches the same performance level as SMoE.
arXiv Detail & Related papers (2024-07-01T09:45:22Z)
- Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D$^3$G to learn domain-specific models.
Our results show that D$3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z)
- $m^4Adapter$: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter [128.69723410769586]
Multilingual neural machine translation (MNMT) models yield state-of-the-art performance when evaluated on the domains and language pairs seen during training.
When an MNMT model is used to translate under domain shift or for a new language pair, performance drops dramatically.
We propose $m^4Adapter$, which combines domain and language knowledge using meta-learning with adapters.
arXiv Detail & Related papers (2022-10-21T12:25:05Z)
- Finding the Right Recipe for Low Resource Domain Adaptation in Neural Machine Translation [7.2283509416724465]
General translation models often struggle to generate accurate translations in specialized domains.
We conduct an in-depth empirical exploration of monolingual and parallel data approaches to domain adaptation.
Our work includes three domains: consumer electronics, clinical, and biomedical.
arXiv Detail & Related papers (2022-06-02T16:38:33Z)
- Improving both domain robustness and domain adaptability in machine translation [69.15496930090403]
We address two problems of domain adaptation in neural machine translation.
First, we want to reach domain robustness, i.e., good quality on both domains seen in the training data.
Second, we want our systems to be adaptive, i.e., it should be possible to fine-tune them with just hundreds of in-domain parallel sentences.
arXiv Detail & Related papers (2021-12-15T17:34:59Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown promise in directly augmenting a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey [9.645196221785694]
We focus on robust approaches to domain adaptation for Neural Machine Translation (NMT) models.
In particular, we look at the case where a system may need to translate sentences from multiple domains.
We highlight the benefits of domain adaptation and multi-domain adaptation techniques to other lines of NMT research.
arXiv Detail & Related papers (2021-04-14T16:21:37Z)
- Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation [9.403585397617865]
Domain adaptation is widely used in practical applications of neural machine translation.
The existing methods for domain adaptation usually suffer from catastrophic forgetting, domain divergence, and model explosion.
We propose a "divide and conquer" method based on the importance of neurons or parameters in the translation model.
arXiv Detail & Related papers (2021-03-25T08:57:09Z)
- Unsupervised Neural Machine Translation for Low-Resource Domains via Meta-Learning [27.86606560170401]
We present a novel meta-learning algorithm for unsupervised neural machine translation (UNMT).
We train the model to adapt to another domain by utilizing only a small amount of training data.
Our model surpasses a transfer learning-based approach by 2-4 BLEU points.
arXiv Detail & Related papers (2020-10-18T17:54:13Z)
- Building a Multi-domain Neural Machine Translation Model using Knowledge Distillation [0.0]
Lack of specialized data makes building a multi-domain neural machine translation tool challenging.
We propose a new training pipeline where knowledge distillation and multiple specialized teachers allow us to efficiently finetune a model.
arXiv Detail & Related papers (2020-04-15T20:21:19Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)