TransLLaMa: LLM-based Simultaneous Translation System
- URL: http://arxiv.org/abs/2402.04636v1
- Date: Wed, 7 Feb 2024 07:39:27 GMT
- Title: TransLLaMa: LLM-based Simultaneous Translation System
- Authors: Roman Koshkin, Katsuhito Sudoh and Satoshi Nakamura
- Abstract summary: We show that a decoder-only large language model (LLM) can control input segmentation directly by generating a special "wait" token.
This obviates the need for a separate policy and enables the LLM to perform English-German and English-Russian SiMT tasks.
We also evaluated closed-source models such as GPT-4, which displayed encouraging results in performing the SiMT task without prior training.
- Score: 18.27477980076409
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decoder-only large language models (LLMs) have recently demonstrated
impressive capabilities in text generation and reasoning. Nonetheless, they
have limited applications in simultaneous machine translation (SiMT), currently
dominated by encoder-decoder transformers. This study demonstrates that, after
fine-tuning on a small dataset comprising causally aligned source and target
sentence pairs, a pre-trained open-source LLM can control input segmentation
directly by generating a special "wait" token. This obviates the need for a
separate policy and enables the LLM to perform English-German and
English-Russian SiMT tasks with BLEU scores that are comparable to those of
specific state-of-the-art baselines. We also evaluated closed-source models
such as GPT-4, which displayed encouraging results in performing the SiMT task
without prior training (zero-shot), indicating a promising avenue for enhancing
future SiMT systems.
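The abstract's key idea is that the model itself acts as the segmentation policy: at each step it either emits a "wait" token (read more input) or a translation token (write output). The following is a toy sketch of that read/write loop; `fake_llm`, the token name `<wait>`, and the uppercasing "translation" are illustrative stand-ins, not the paper's actual model or interface.

```python
# Toy sketch of the "wait"-token read/write loop described in the abstract.
# fake_llm is an illustrative stub, NOT the paper's fine-tuned model: it
# waits until it has read one more source word than it has written, then
# "translates" the next word by uppercasing it.

WAIT = "<wait>"

def fake_llm(read_words, target_words):
    """Stand-in for the fine-tuned LLM's next-token prediction."""
    if len(read_words) <= len(target_words):
        return WAIT                                   # ask for more input
    return read_words[len(target_words)].upper()      # emit a target word

def simultaneous_translate(source_words, model=fake_llm):
    """READ one source word whenever the model emits the wait token,
    otherwise WRITE the model's output token. The model itself thus
    plays the role of the segmentation policy."""
    read, target, pending = [], [], list(source_words)
    while len(target) < len(source_words):
        token = model(read, target)
        if token == WAIT and pending:
            read.append(pending.pop(0))               # READ action
        elif token != WAIT:
            target.append(token)                      # WRITE action
    return target

print(simultaneous_translate(["guten", "morgen"]))    # → ['GUTEN', 'MORGEN']
```

The point of the sketch is that no separate policy network decides when to read: the same next-token distribution that produces translations also produces the wait decision, which is what the fine-tuning on causally aligned sentence pairs teaches.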
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, adapts a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- LLMs Are Zero-Shot Context-Aware Simultaneous Translators [16.260150631363313]
Large language models (LLMs) have come to the spotlight thanks to their generality and strong performance in a wide range of language tasks.
Here we show that open-source LLMs perform on par with or better than some state-of-the-art baselines in simultaneous machine translation (SiMT) tasks.
arXiv Detail & Related papers (2024-06-19T11:57:42Z)
- Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages [2.53740603524637]
Machine translation (MT) models produce excellent multilingual representations, resulting in strong translation performance even for low-resource languages.
In this work, we get the best of both worlds by integrating MT encoders directly into language backbones via sample-efficient self-distillation.
The resulting MT-LLMs preserve the inherent multilingual representational alignment from the MT encoder, allowing lower-resource languages to tap into the rich knowledge embedded in English-centric LLMs.
arXiv Detail & Related papers (2024-06-18T16:00:20Z)
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Speech Translation with Large Language Models: An Industrial Practice [64.5419534101104]
We introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained large language model (LLM).
By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations.
Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST.
arXiv Detail & Related papers (2023-12-21T05:32:49Z)
- Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models [4.873927154453253]
Large language models (LLMs) with billions of parameters, pretrained on massive amounts of data, are now capable of performance near or better than the state of the art in a variety of downstream natural language processing tasks.
Simul-LLM is the first open-source fine-tuning and evaluation pipeline development framework for LLMs focused on SimulMT.
arXiv Detail & Related papers (2023-12-07T20:42:05Z)
- Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) present the potential for achieving superior translation quality.
We propose Cooperative Decoding (CoDec) which treats NMT systems as a pretranslation model and MT-oriented LLMs as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- Language Models are Good Translators [63.528370845657896]
We show that a single language model (LM4MT) can achieve comparable performance with strong encoder-decoder NMT models.
Experiments on pivot-based and zero-shot translation tasks show that LM4MT can outperform the encoder-decoder NMT model by a large margin.
arXiv Detail & Related papers (2021-06-25T13:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.