Improving Translation Faithfulness of Large Language Models via
Augmenting Instructions
- URL: http://arxiv.org/abs/2308.12674v1
- Date: Thu, 24 Aug 2023 09:32:29 GMT
- Title: Improving Translation Faithfulness of Large Language Models via
Augmenting Instructions
- Authors: Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
- Abstract summary: We propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset OVERMISS.
SWIE improves the model's instruction understanding by adding a global instruction representation to the following input and response representations.
OVERMISS improves model faithfulness by comparing over-translation and miss-translation results with the correct translation.
- Score: 89.76691340615848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) present strong general capabilities, and a
current compelling challenge is stimulating their specialized capabilities,
such as machine translation, through low-cost instruction tuning. The standard
instruction-following data is sequentially organized as the concatenation of an
instruction, an input, and a response. Because the attention mechanism of LLMs is
biased toward local focus, LLMs tend to attend more to nearby words or sentences
at each position. This leads to a high risk of instruction forgetting
during decoding. To alleviate the above issues, we propose SWIE
(Segment-Weighted Instruction Embedding) and an instruction-following dataset
OVERMISS. SWIE improves the model's instruction understanding by adding a global
instruction representation to the following input and response representations.
OVERMISS improves model faithfulness by comparing over-translation and
miss-translation results with the correct translation. We apply our methods to
two mainstream open-source LLMs, BLOOM and LLaMA. The experimental results
demonstrate significant improvements in translation performance with SWIE based
on BLOOMZ-3b, particularly in zero-shot and long text translations due to
reduced instruction forgetting risk. Additionally, OVERMISS outperforms the
baseline in translation performance (e.g., BLEU score gains ranging from 0.69
to 3.12 and an average COMET score improvement of 0.48 percentage points for
LLaMA-7b) with further enhancements seen in models combining OVERMISS and SWIE
(e.g., BLEU score increases of up to 0.56 for English-to-German across three
different backbones), and both exhibit improvements in the faithfulness metric
based on word alignment.
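The abstract describes SWIE only at a high level. The following is a minimal PyTorch sketch of the idea as stated there: mean-pool the instruction tokens into one global vector and add it, with segment-dependent weights, onto the representations of the input and response segments. The function name, the pooling choice, the fixed weights, and the segment encoding are all assumptions made here for illustration, not the authors' implementation.

```python
import torch


def swie_augment(hidden_states: torch.Tensor,
                 segment_ids: torch.Tensor,
                 input_weight: float = 0.1,
                 response_weight: float = 0.1) -> torch.Tensor:
    """Add a weighted global instruction vector to input/response positions.

    hidden_states: (batch, seq_len, d_model) token representations.
    segment_ids:   (batch, seq_len) with 0 = instruction, 1 = input, 2 = response.
    """
    # Mean-pool the instruction tokens into one global instruction vector per example.
    instr_mask = (segment_ids == 0).unsqueeze(-1).to(hidden_states.dtype)          # (B, T, 1)
    instr_repr = (hidden_states * instr_mask).sum(dim=1) / instr_mask.sum(dim=1).clamp(min=1.0)  # (B, D)

    # Segment-dependent weights: zero on the instruction itself, fixed weights elsewhere.
    weights = torch.zeros_like(segment_ids, dtype=hidden_states.dtype)             # (B, T)
    weights = torch.where(segment_ids == 1, torch.full_like(weights, input_weight), weights)
    weights = torch.where(segment_ids == 2, torch.full_like(weights, response_weight), weights)

    # Broadcast the global instruction representation onto every weighted position,
    # so distant input/response tokens still "see" the instruction.
    return hidden_states + weights.unsqueeze(-1) * instr_repr.unsqueeze(1)


# Toy usage: batch of 1, sequence of 6 tokens, hidden size 8.
h = torch.randn(1, 6, 8)
seg = torch.tensor([[0, 0, 1, 1, 2, 2]])   # 2 instruction, 2 input, 2 response tokens
h_aug = swie_augment(h, seg)
```

OVERMISS is likewise described only at a high level (contrasting over-translation and miss-translation outputs with the correct translation). Below is a hypothetical illustration of what one such instance might look like; the field names and example sentences are invented here for clarity and are not the released dataset schema.

```python
# Hypothetical OVERMISS-style contrastive instance (illustrative only).
overmiss_example = {
    "instruction": "Translate the following sentence from English to German.",
    "input": "The committee approved the budget on Tuesday.",
    "correct": "Der Ausschuss genehmigte den Haushalt am Dienstag.",
    # Over-translation: content not present in the source is added ("after a long debate").
    "over_translation": "Der Ausschuss genehmigte den Haushalt am Dienstag nach langer Debatte.",
    # Miss-translation: source content ("on Tuesday") is dropped.
    "miss_translation": "Der Ausschuss genehmigte den Haushalt.",
}
```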
Related papers
- Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following [51.18383180774354]
We introduce Multi-IF, a new benchmark designed to assess Large Language Models' proficiency in following multi-turn and multilingual instructions.
Our evaluation of 14 state-of-the-art LLMs on Multi-IF reveals that it presents a significantly more challenging task than existing benchmarks.
Languages with non-Latin scripts (Hindi, Russian, and Chinese) generally exhibit higher error rates, suggesting potential limitations in the models' multilingual capabilities.
arXiv Detail & Related papers (2024-10-21T00:59:47Z) - LLM-based Translation Inference with Iterative Bilingual Understanding [45.00660558229326]
We propose a novel Iterative Bilingual Understanding Translation (IBUT) method based on the cross-lingual capabilities of large language models (LLMs).
The cross-lingual capability of LLMs enables the generation of contextual understanding for both the source and target languages separately.
The proposed IBUT outperforms several strong comparison methods.
arXiv Detail & Related papers (2024-10-16T13:21:46Z) - How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes [2.0109318570325847]
We investigate the impact of fine-tuning the Llama 3 model using translation memories (TMs) from a specific organisation in the software sector.
We fine-tune separate models for each training set and evaluate their performance with the automatic metrics BLEU, chrF++, TER, and COMET.
Our findings reveal improvement in translation performance with larger datasets across all metrics.
arXiv Detail & Related papers (2024-09-05T12:06:38Z) - TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z) - Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z) - TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement [26.26493253161022]
Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT).
We introduce a systematic LLM-based self-refinement translation framework, named TEaR.
arXiv Detail & Related papers (2024-02-26T07:58:12Z) - SCALE: Synergized Collaboration of Asymmetric Language Translation
Engines [105.8983433641208]
We introduce a collaborative framework that connects compact Specialized Translation Models (STMs) and general-purpose Large Language Models (LLMs) as one unified translation engine.
By introducing the STM's translation into the triplet in-context demonstrations, SCALE unlocks the refinement and pivoting abilities of the LLM.
Our experiments show that SCALE significantly outperforms both few-shot LLMs (GPT-4) and specialized models (NLLB) in challenging low-resource settings.
arXiv Detail & Related papers (2023-09-29T08:46:38Z) - BayLing: Bridging Cross-lingual Alignment and Instruction Following
through Interactive Translation for Large Language Models [39.03467441090675]
Large language models (LLMs) have demonstrated remarkable prowess in language understanding and generation.
We have developed BayLing, an instruction-following LLM that uses LLaMA as the foundation model.
Demo, homepage, code and models of BayLing are available.
arXiv Detail & Related papers (2023-06-19T14:30:52Z) - Improving Multilingual Translation by Representation and Gradient
Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)