ParroT: Translating during Chat using Large Language Models tuned with
Human Translation and Feedback
- URL: http://arxiv.org/abs/2304.02426v5
- Date: Thu, 2 Nov 2023 07:44:52 GMT
- Title: ParroT: Translating during Chat using Large Language Models tuned with
Human Translation and Feedback
- Authors: Wenxiang Jiao, Jen-tse Huang, Wenxuan Wang, Zhiwei He, Tian Liang,
Xing Wang, Shuming Shi, Zhaopeng Tu
- Abstract summary: ParroT is a framework to enhance and regulate the translation abilities during chat.
Specifically, ParroT reformulates translation data into the instruction-following style.
We propose three instruction types for finetuning ParroT models, including translation instruction, contrastive instruction, and error-guided instruction.
- Score: 90.20262941911027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) like ChatGPT have exhibited remarkable abilities
on a wide range of natural language processing (NLP) tasks, including various
machine translation abilities accomplished during chat. However, these models
are only accessible through restricted APIs, which creates barriers to new
research and advancements in the field. Therefore, we propose ParroT, a
framework to enhance and regulate the translation abilities during chat based
on open-source LLMs (e.g., LLaMA), human-written translation and feedback data.
Specifically, ParroT reformulates translation data into the
instruction-following style, and introduces a "Hint" field for
incorporating extra requirements to regulate the translation process.
Accordingly, we propose three instruction types for finetuning ParroT models,
including translation instruction, contrastive instruction, and error-guided
instruction. Experiments on Flores subsets and WMT22 test sets suggest that
translation instruction improves the translation performance of vanilla LLMs
significantly while error-guided instruction can lead to further improvement,
which demonstrates the importance of learning from low-quality translations
annotated by humans. We also demonstrate the potential of automatic evaluation
tools in providing quality information of translations, when constructing
error-guided instructions for directions that lack human annotation data.
Please refer to our Github project for more implementation details:
https://github.com/wxjiao/ParroT
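
The three instruction types described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the ParroT repository: the `Record` class, the helper functions, the `### ...` section markers, and the exact wording of every instruction and hint are hypothetical, chosen only to show how a "Hint" field could carry extra requirements alongside an ordinary translation instruction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One instruction-following training example (hypothetical schema)."""
    instruction: str
    source: str
    response: str
    hint: Optional[str] = None  # extra requirement; None for plain translation


def translation_record(src_lang: str, tgt_lang: str, source: str, reference: str) -> Record:
    # Translation instruction: a human-written reference is the response, no hint.
    instruction = f"Translate the following sentence from {src_lang} to {tgt_lang}."
    return Record(instruction, source, reference)


def contrastive_record(src_lang: str, tgt_lang: str, source: str,
                       better: str, worse: str) -> Record:
    # Contrastive instruction: the response shows a preferred and a less
    # preferred translation. The hint wording is an assumption.
    instruction = f"Translate the following sentence from {src_lang} to {tgt_lang}."
    hint = "Give the preferred translation first, then a less preferred one."
    return Record(instruction, source, f"{better}\n{worse}", hint)


def error_guided_record(src_lang: str, tgt_lang: str, source: str,
                        translation: str, error_label: Optional[str]) -> Record:
    # Error-guided instruction: the hint states the human-annotated quality of
    # the response, so low-quality translations can also be used for training.
    instruction = f"Translate the following sentence from {src_lang} to {tgt_lang}."
    if error_label is None:
        hint = "A translation with no errors could be"
    else:
        hint = f"A translation with {error_label} errors could be"
    return Record(instruction, source, translation, hint)


def render_prompt(rec: Record) -> str:
    # Assemble the prompt; the Hint section is emitted only when a hint is set.
    parts = [f"### Instruction:\n{rec.instruction}"]
    if rec.hint is not None:
        parts.append(f"### Hint:\n{rec.hint}")
    parts.append(f"### Input:\n{rec.source}")
    parts.append("### Response:\n")
    return "\n\n".join(parts)


if __name__ == "__main__":
    rec = error_guided_record("German", "English", "Guten Morgen.",
                              "Good evening.", "major accuracy")
    print(render_prompt(rec) + rec.response)
```

Under these assumptions, a model could be asked at inference time for an error-free output by supplying the "no errors" hint, which is one way the hint field might regulate the translation process.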
Related papers
- InstaTrans: An Instruction-Aware Translation Framework for Non-English Instruction Datasets [2.530471185132544]
It is challenging to generate high-quality instruction datasets for non-English languages due to tail phenomena.
We propose translating existing high-quality English instruction datasets as a solution.
We introduce a new translation framework tailored for instruction datasets, named InstaTrans.
arXiv Detail & Related papers (2024-10-02T13:02:23Z)
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
- TIM: Teaching Large Language Models to Translate with Comparison [78.66926087162672]
We propose a novel framework using examples in comparison to teach LLMs to learn translation.
Our approach involves presenting the model with examples of correct and incorrect translations and using a preference loss to guide the model's learning.
Our findings offer a new perspective on fine-tuning LLMs for translation tasks and provide a promising solution for generating high-quality translations.
arXiv Detail & Related papers (2023-07-10T08:15:40Z)
- Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale Pretrained Language Models (LLMs) have shown strong abilities in multilingual translations.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z)
- Adaptive Machine Translation with Large Language Models [7.803471587734353]
We investigate how we can utilize in-context learning to improve real-time adaptive machine translation.
We conduct experiments across five diverse language pairs, namely English-to-Arabic (EN-AR), English-to-Chinese (EN-ZH), English-to-French (EN-FR), English-to-Kinyarwanda (EN-RW), and English-to-Spanish (EN-ES).
arXiv Detail & Related papers (2023-01-30T21:17:15Z)
- ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and Cherokee, an endangered language.
It supports both statistical and neural translation models and provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.