Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion
Model with Large Language Models for Machine Translation
- URL: http://arxiv.org/abs/2402.10699v1
- Date: Fri, 16 Feb 2024 14:00:56 GMT
- Title: Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion
Model with Large Language Models for Machine Translation
- Authors: Hongbin Na, Zimu Wang, Mieradilijiang Maimaiti, Tong Chen, Wei Wang,
Tao Shen, Ling Chen
- Abstract summary: We propose Thinker with the Drift-Diffusion Model to emulate human translators' dynamic decision-making under constrained resources.
We conduct experiments under the high-resource, low-resource, and commonsense translation settings using the WMT22 and CommonMT datasets.
We also perform additional analysis and evaluation on commonsense translation to further demonstrate the effectiveness of the proposed method.
- Score: 15.333148705267012
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated promising potential in various
downstream tasks, including machine translation. However, prior work on
LLM-based machine translation has mainly focused on better utilizing training
data, demonstrations, or pre-defined and universal knowledge to improve
performance, with little consideration of the decision-making processes of human
translators. In this paper, we incorporate Thinker with the Drift-Diffusion
Model (Thinker-DDM) to address this issue. We then redefine the Drift-Diffusion
process to emulate human translators' dynamic decision-making under constrained
resources. We conduct extensive experiments under the high-resource,
low-resource, and commonsense translation settings using the WMT22 and CommonMT
datasets, in which Thinker-DDM outperforms baselines in the first two
scenarios. We also perform additional analysis and evaluation on commonsense
translation to further demonstrate the effectiveness of the proposed method.
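The drift-diffusion mechanism the abstract builds on can be pictured as evidence accumulating over time toward one of two decision boundaries. The paper's own Thinker-DDM formulation is not reproduced here; the snippet below is a minimal, hypothetical Python sketch of a standard two-boundary drift-diffusion simulation, assuming (for illustration only) that the drift rate comes from a quality-score difference between two candidate translations. The function name and parameters are invented for this sketch.

    import math
    import random

    def ddm_choose(score_a: float, score_b: float,
                   boundary: float = 1.0, noise: float = 0.3,
                   dt: float = 0.01, max_steps: int = 10_000):
        """Simulate a two-boundary drift-diffusion decision between two
        translation candidates. The drift rate is taken to be the score
        difference (an illustrative assumption, not the authors' design);
        evidence accumulates with Gaussian noise until it crosses
        +boundary (pick A) or -boundary (pick B)."""
        drift = score_a - score_b              # positive drift favours candidate A
        evidence = 0.0
        for step in range(1, max_steps + 1):
            evidence += drift * dt + noise * math.sqrt(dt) * random.gauss(0.0, 1.0)
            if evidence >= boundary:
                return "A", step * dt          # choice and elapsed deliberation time
            if evidence <= -boundary:
                return "B", step * dt
        # No boundary reached within the step budget: fall back to the currently
        # favoured candidate, a rough stand-in for deciding under constrained resources.
        return ("A" if evidence >= 0 else "B"), max_steps * dt

    # Example: candidate A scores slightly higher, so it usually wins,
    # but noise occasionally lets candidate B cross the boundary first.
    choice, rt = ddm_choose(score_a=0.72, score_b=0.65)
    print(choice, round(rt, 2))

Raising the boundary trades deliberation time for accuracy, which mirrors the speed-accuracy intuition behind decision-making under constrained resources.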
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
Evaluation results in four language directions on the WMT22 benchmark demonstrate the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Context-Aware Machine Translation with Source Coreference Explanation [26.336947440529713]
We propose a model that explains the decisions made for translation by predicting coreference features in the input.
We evaluate our method on the WMT English-German document-level translation dataset, the English-Russian dataset, and the multilingual TED talk dataset.
arXiv Detail & Related papers (2024-04-30T12:41:00Z)
- Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation [64.5862977630713]
This study investigates how Large Language Models (LLMs) leverage source and reference data in the machine translation evaluation task.
We find that reference information significantly enhances evaluation accuracy, while, surprisingly, source information is sometimes counterproductive.
arXiv Detail & Related papers (2024-01-12T13:23:21Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- On the Pareto Front of Multilingual Neural Machine Translation [123.94355117635293]
We study how the performance of a given direction changes with its sampling ratio in multilingual Neural Machine Translation (MNMT).
We propose the Double Power Law to predict the unique performance trade-off front in MNMT.
In our experiments, it achieves better performance than temperature searching and gradient manipulation methods with only 1/5 to 1/2 of the total training budget.
arXiv Detail & Related papers (2023-04-06T16:49:19Z)
- Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme.
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z)
- Unsupervised Neural Machine Translation for Low-Resource Domains via Meta-Learning [27.86606560170401]
We present a novel meta-learning algorithm for unsupervised neural machine translation (UNMT).
We train the model to adapt to another domain by utilizing only a small amount of training data.
Our model surpasses a transfer learning-based approach by up to 2-4 BLEU points.
arXiv Detail & Related papers (2020-10-18T17:54:13Z)
- Language Model Prior for Low-Resource Neural Machine Translation [85.55729693003829]
We propose a novel approach to incorporate an LM as a prior in a neural translation model (TM).
We add a regularization term that pushes the output distributions of the TM to be probable under the LM prior; a hedged sketch of such an objective appears after this list.
Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
arXiv Detail & Related papers (2020-04-30T16:29:56Z)
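For the last entry above, a common way to realize an LM-as-prior regularizer of the kind described is to penalize divergence between the translation model's per-token output distribution and the language model's. The display below is a hedged, generic sketch rather than the exact objective of the cited paper; the weight \lambda and the distributions p_TM and p_LM are notation introduced here for illustration.

    \mathcal{L}(\theta)
      = -\sum_{t} \log p_{\mathrm{TM}}(y_t \mid y_{<t}, x; \theta)
      \;+\; \lambda \sum_{t} \mathrm{KL}\!\left(
          p_{\mathrm{TM}}(\cdot \mid y_{<t}, x; \theta)
          \,\middle\|\,
          p_{\mathrm{LM}}(\cdot \mid y_{<t})
        \right)

The first term is the usual translation cross-entropy on parallel data; the KL term keeps the translation model's token distribution close to what the language model, trained on monolingual data, considers probable, which is what "probable under the LM prior" amounts to.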
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.