Prompting PaLM for Translation: Assessing Strategies and Performance
- URL: http://arxiv.org/abs/2211.09102v3
- Date: Sun, 25 Jun 2023 16:51:52 GMT
- Title: Prompting PaLM for Translation: Assessing Strategies and Performance
- Authors: David Vilar, Markus Freitag, Colin Cherry, Jiaming Luo, Viresh
Ratnakar, George Foster
- Abstract summary: The Pathways Language Model (PaLM) has demonstrated the strongest machine translation (MT) performance among similarly-trained LLMs to date.
We revisit previous assessments of PaLM's MT capabilities with more recent test sets, modern MT metrics, and human evaluation, and find that its performance, while impressive, still lags that of state-of-the-art supervised systems.
- Score: 16.73524055296411
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) that have been trained on multilingual but not
parallel text exhibit a remarkable ability to translate between languages. We
probe this ability in an in-depth study of the Pathways Language Model (PaLM),
which has demonstrated the strongest machine translation (MT) performance among
similarly-trained LLMs to date. We investigate various strategies for choosing
translation examples for few-shot prompting, concluding that example quality is
the most important factor. Using optimized prompts, we revisit previous
assessments of PaLM's MT capabilities with more recent test sets, modern MT
metrics, and human evaluation, and find that its performance, while impressive,
still lags that of state-of-the-art supervised systems. We conclude by
providing an analysis of PaLM's MT output which reveals some interesting
properties and prospects for future work.
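To make the few-shot prompting setup concrete, here is a minimal sketch in Python of how translation examples can be assembled into a prompt. The template format and the French-English pair are assumptions for illustration, not PaLM's exact format; the paper's central finding is that the quality of the chosen examples matters more than the selection strategy.

```python
# Minimal sketch of few-shot prompt construction for MT, in the style the
# paper studies. The "source-lang: ... / target-lang: ..." template and the
# French-English pair are illustrative assumptions, not PaLM's exact format.

def build_few_shot_prompt(examples, source_sentence,
                          src_lang="French", tgt_lang="English"):
    """Assemble a few-shot translation prompt from (source, target) pairs.

    Per the paper's finding, example *quality* is the most important factor,
    so `examples` should be high-quality human translations.
    """
    lines = []
    for src, tgt in examples:
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    # End with the input sentence and an open target slot for the LLM to fill.
    lines.append(f"{src_lang}: {source_sentence}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


if __name__ == "__main__":
    demo_examples = [
        ("Le chat dort sur le canapé.", "The cat is sleeping on the sofa."),
        ("Il fait beau aujourd'hui.", "The weather is nice today."),
    ]
    print(build_few_shot_prompt(demo_examples, "Bonjour tout le monde."))
```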
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation [61.65537912700187]
Large Language Models (LLMs) have demonstrated strong ability in the field of machine translation (MT).
We propose a framework called MT-Patcher, which transfers knowledge from LLMs to existing MT models in a selective, comprehensive and proactive manner.
arXiv Detail & Related papers (2024-03-14T16:07:39Z)
- Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z)
- Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) show potential for achieving superior translation quality.
We propose Cooperative Decoding (CoDec), which treats the NMT system as a pre-translation model and the MT-oriented LLM as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z)
- Exploring Human-Like Translation Strategy with Large Language Models [93.49333173279508]
Large language models (LLMs) have demonstrated impressive capabilities in general scenarios.
This work proposes the MAPS framework, which stands for Multi-Aspect Prompting and Selection.
We employ a selection mechanism based on quality estimation to filter out noisy and unhelpful knowledge.
arXiv Detail & Related papers (2023-05-06T19:03:12Z)
- Document-Level Machine Translation with Large Language Models [91.03359121149595]
Large language models (LLMs) can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
This paper provides an in-depth evaluation of LLMs' ability in discourse modeling.
arXiv Detail & Related papers (2023-04-05T03:49:06Z)
- Evaluating and Improving the Coreference Capabilities of Machine Translation Models [30.60934078720647]
Machine translation requires a wide range of linguistic capabilities.
Current end-to-end models are expected to learn these capabilities implicitly by observing aligned sentence pairs in bilingual corpora.
arXiv Detail & Related papers (2023-02-16T18:16:09Z)
- Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation [91.57514888410205]
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting.
LLMs can struggle to translate inputs containing rare words, which are common in low-resource or domain-transfer scenarios.
We show that LLM prompting can provide an effective solution for rare words as well, by using prior knowledge from bilingual dictionaries to provide control hints in the prompts (a sketch follows this list).
arXiv Detail & Related papers (2023-02-15T18:46:42Z)
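For a concrete sense of the dictionary-based prompting idea in the last entry above, here is a hypothetical sketch: rare source words are looked up in a bilingual dictionary and appended to the prompt as control hints. The hint wording and the toy dictionary are assumptions, not the paper's exact template.

```python
# Hypothetical sketch of dictionary-based phrase-level control hints for MT
# prompting. The hint phrasing and toy dictionary are illustrative assumptions.

BILINGUAL_DICT = {
    # rare source word -> target-language translation
    "ornithorynque": "platypus",
    "covoiturage": "carpooling",
}

def add_dictionary_hints(prompt, source_sentence):
    """Append a hint line for each rare source word found in the input."""
    hints = [
        f'Hint: in this context, "{src}" translates to "{tgt}".'
        for src, tgt in BILINGUAL_DICT.items()
        if src in source_sentence.lower()
    ]
    return prompt + ("\n" + "\n".join(hints) if hints else "")


if __name__ == "__main__":
    source = "L'ornithorynque nage dans la rivière."
    base_prompt = f"French: {source}\nEnglish:"
    print(add_dictionary_hints(base_prompt, source))
```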