Related papers: Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models

Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models

URL: http://arxiv.org/abs/2402.10552v3
Date: Fri, 21 Jun 2024 07:18:13 GMT
Title: Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models
Authors: Minghan Wang, Thuy-Trang Vu, Yuxia Wang, Ehsan Shareghi, Gholamreza Haffari,
Abstract summary: Simultaneous machine translation (SimulMT) presents a challenging trade-off between translation quality and latency. We propose a conversational SimulMT framework to enhance the inference efficiency of LLM-based SimulMT.
Score: 40.5451418216014
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Simultaneous machine translation (SimulMT) presents a challenging trade-off between translation quality and latency. Recent studies have shown that LLMs can achieve good performance in SimulMT tasks. However, this often comes at the expense of high inference cost and latency. In this paper, we propose a conversational SimulMT framework to enhance the inference efficiency of LLM-based SimulMT through multi-turn-dialogue-based decoding. Our experiments with Llama2-7b-chat on two SimulMT benchmarks demonstrate the superiority of LLM in translation quality while achieving comparable computational latency to specialized SimulMT models.

Related papers

Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation [12.59407158733001]
Large language model (LLM) shows promising performances in a variety of downstream tasks, such as machine translation (MT)<n>However, using LLMs for translation suffers from high computational costs and significant latency.<n>We propose a novel and straightforward decider that leverages source sentence features.
arXiv Detail & Related papers (2025-05-19T06:50:52Z)
Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data [64.4458540273004]
We propose a self-play framework that leverages only monolingual data and the intrinsic multilingual knowledge of Large Language Models (LLMs) Experiments demonstrate that this approach not only matches the performance of models trained on large-scale parallel data but also excels in non-English translation directions.
arXiv Detail & Related papers (2025-04-20T16:20:30Z)
Efficient and Adaptive Simultaneous Speech Translation with Fully Unidirectional Architecture [14.056534007451763]
Simultaneous speech translation (SimulST) produces translations incrementally while processing partial speech input. Existing LLM-based SimulST approaches incur significant computational overhead due to repeated encoding of bidirectional speech encoder. We introduce Efficient and Adaptive Simultaneous Speech Translation (EASiST) with fully unidirectional architecture.
arXiv Detail & Related papers (2025-04-16T06:46:15Z)
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline [16.124385656402744]
Large Language Models (LLMs) perform excellently in offline machine translation even with a simple prompt "Translate the following sentence from [src lang] into [tgt lang]:" We propose a novel paradigm that includes constructing supervised fine-tuning data for simultaneous machine translation (SiMT) Our approach achieves state-of-the-art performance across various SiMT benchmarks, and preserves the original abilities of offline translation.
arXiv Detail & Related papers (2025-04-13T13:45:53Z)
TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks. We propose the TasTe framework, which stands for translating through self-reflection. The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
Speech Translation with Large Language Models: An Industrial Practice [64.5419534101104]
We introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained large language model (LLM) By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations. Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST.
arXiv Detail & Related papers (2023-12-21T05:32:49Z)
Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models [4.873927154453253]
Large language models (LLMs) with billions of parameters and pretrained on massive amounts of data are now capable of near or better than state-of-the-art performance in a variety of downstream natural language processing tasks. Simul-LLM is the first open-source fine-tuning and evaluation pipeline development framework for LLMs focused on SimulMT.
arXiv Detail & Related papers (2023-12-07T20:42:05Z)
On-the-Fly Fusion of Large Language Models and Machine Translation [3.718665608549311]
We propose the on-the-fly ensembling of a machine translation model with an LLM prompted on the same task and input. We find that a slightly weaker-at-translation LLM can improve translations of a NMT model, and ensembling with an LLM can produce better translations than ensembling two stronger MT models.
arXiv Detail & Related papers (2023-11-14T16:49:33Z)
Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) present the potential for achieving superior translation quality. We propose Cooperative Decoding (CoDec) which treats NMT systems as a pretranslation model and MT-oriented LLMs as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z)
Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks. We conducted experiments using the textttLlama2-7b-chat model on nine different languages from the MUST-C dataset. The results show that LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
Augmenting Large Language Model Translators via Translation Memories [32.28138249566329]
Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models. We take a step towards prompting large language models (LLMs) with TMs and making them better translators.
arXiv Detail & Related papers (2023-05-27T04:47:09Z)
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT) This paper systematically investigates the advantages and challenges of LLMs for MMT. We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.