Related papers: Improving LLM Abilities in Idiomatic Translation

URL: http://arxiv.org/abs/2407.03518v4
Date: Thu, 23 Jan 2025 04:04:22 GMT
Title: Improving LLM Abilities in Idiomatic Translation
Authors: Sundesh Donthi, Maximilian Spencer, Om Patel, Joon Doh, Eid Rodan, Kevin Zhu, Sean O'Brien
Abstract summary: For large language models (LLMs) like NLLB and GPT, translating idioms remains a challenge. Our goal is to enhance translation fidelity by improving LLM processing of idiomatic language. This has a significant social impact, as it preserves cultural nuances and ensures translated texts retain intent and emotional resonance.
Score: 2.8692611791027893
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: For large language models (LLMs) like NLLB and GPT, translating idioms remains a challenge. Our goal is to enhance translation fidelity by improving LLM processing of idiomatic language while preserving the original linguistic style. This has a significant social impact, as it preserves cultural nuances and ensures translated texts retain their intent and emotional resonance, fostering better cross-cultural communication. Previous work has utilized knowledge bases like IdiomKB by providing the LLM with the meaning of an idiom to use in translation. Although this method yielded better results than a direct translation, it is still limited in its ability to preserve idiomatic writing style across languages. In this research, we expand upon the knowledge base to find corresponding idioms in the target language. Our research performs translations using two methods. The first method uses the SentenceTransformers model to compute cosine similarity scores between the meanings of the original and target language idioms, selecting the best idiom (Cosine Similarity method). The second method uses an LLM to find a corresponding idiom in the target language for use in the translation (LLM-generated idiom method). As a baseline, we performed a direct translation without providing additional information. Human evaluations on English -> Chinese and Chinese -> English translation show that the Cosine Similarity Lookup method outperformed the others in all GPT-4o translations. To further build upon IdiomKB, we developed a low-resource Urdu dataset containing Urdu idioms and their translations. Despite dataset limitations, the Cosine Similarity Lookup method shows promise, potentially overcoming language barriers and enabling the exploration of diverse literary works in Chinese and Urdu. (LoResLM @ COLING Preprint)
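
The Cosine Similarity method described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper embeds idiom meanings with SentenceTransformers, whereas the code below substitutes a toy bag-of-words embedding (`toy_embed`, a hypothetical stand-in) so that it runs without external dependencies. The function names and the two-entry idiom table are also assumptions made for illustration only.

```python
import math
from collections import Counter
from typing import Callable, Dict, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedding: lowercase bag-of-words counts.

    ASSUMPTION: the paper uses a SentenceTransformers model here;
    this toy version only keeps the sketch dependency-free.
    """
    return {word: float(n) for word, n in Counter(text.lower().split()).items()}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_target_idiom(
    source_meaning: str,
    target_idioms: Dict[str, str],
    embed: Callable[[str], Vector] = toy_embed,
) -> Tuple[str, float]:
    """Return the target-language idiom whose meaning is most similar
    to the source idiom's meaning, together with its similarity score."""
    source_vec = embed(source_meaning)
    scored = [
        (idiom, cosine_similarity(source_vec, embed(meaning)))
        for idiom, meaning in target_idioms.items()
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical knowledge-base entries: target idiom -> English gloss of its meaning.
target_idioms = {
    "画蛇添足": "to ruin something by adding unnecessary extras",
    "雪中送炭": "to give help when it is most needed",
}

idiom, score = best_target_idiom("to offer help when it is needed most", target_idioms)
print(idiom, round(score, 3))
```

With a real sentence encoder in place of `toy_embed`, the same selection step would compare dense vectors rather than word counts; the chosen idiom would then be passed to the LLM as a hint for the translation, as the abstract describes.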

Related papers

Graph-Assisted Culturally Adaptable Idiomatic Translation for Indic Languages [3.2498796510544636]
Translating multi-word expressions (MWEs) and idioms requires a deep understanding of both the source and target languages. Traditional static knowledge graphs (KGs) and prompt-based approaches struggle to capture these complex relationships. We propose an adaptive graph neural network (GNN) based methodology that learns intricate mappings between idiomatic expressions.
arXiv Detail & Related papers (2025-05-28T03:42:16Z)
Large Language Models for Persian $\leftrightarrow$ English Idiom Translation [5.689194193929357]
Large language models (LLMs) have shown superior capabilities in translating figurative language compared to neural machine translation (NMT) systems. This paper introduces two parallel datasets of sentences containing idiomatic expressions for Persian $\rightarrow$ English and English $\rightarrow$ Persian translations. We evaluate various open- and closed-source LLMs, NMT models, and their combinations. Experiments reveal that Claude-3.5-Sonnet delivers outstanding results in both translation directions.
arXiv Detail & Related papers (2024-12-13T09:29:27Z)
Language Models and Cycle Consistency for Self-Reflective Machine Translation [1.79487674052027]
We generate multiple translation candidates from a source language A to a target language B, and subsequently translate these candidates back to the original language A. By evaluating the cycle consistency between the original and back-translated sentences using metrics such as token-level precision and accuracy, we implicitly estimate the translation quality in language B. For each source sentence, we identify the translation candidate with optimal cycle consistency with the original sentence as the final answer.
arXiv Detail & Related papers (2024-11-05T04:01:41Z)
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages [55.157295899188476]
Neural machine translation systems learn to map sentences of different languages into a common representation space. In this work, we test this hypothesis by zero-shot translating from unseen languages. We demonstrate that this setup enables zero-shot translation from entirely unseen languages.
arXiv Detail & Related papers (2024-08-05T07:58:58Z)
Breaking the Script Barrier in Multilingual Pre-Trained Language Models with Transliteration-Based Post-Training Alignment [50.27950279695363]
The transfer performance is often hindered when a low-resource target language is written in a different script than the high-resource source language. Inspired by recent work that uses transliteration to address this problem, our paper proposes a transliteration-based post-pretraining alignment (PPA) method.
arXiv Detail & Related papers (2024-06-28T08:59:24Z)
Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model [28.288949710191158]
Large language models (LLMs) have showcased impressive multilingual machine translation ability. Unlike encoder-decoder style models, decoder-only LLMs lack an explicit alignment between source and target contexts. We propose to encourage LLMs to pay more attention to the source context from both source and target perspectives.
arXiv Detail & Related papers (2024-06-11T07:49:04Z)
Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages. Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs. In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues. We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations. To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding [53.84948040596055]
We introduce two related methods to mitigate failure cases with a modified decoding objective. Experiments on the massively multilingual models M2M-100 (418M) and SMaLL-100 show that these methods suppress hallucinations and off-target translations.
arXiv Detail & Related papers (2023-09-13T17:15:27Z)
Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks. We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT). We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z)
Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs. We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models. We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
Does Transliteration Help Multilingual Language Modeling? [0.0]
We empirically measure the effect of transliteration on Multilingual Language Models. We focus on the Indic languages, which have the highest script diversity in the world. We find that transliteration benefits the low-resource languages without negatively affecting the comparatively high-resource languages.
arXiv Detail & Related papers (2022-01-29T05:48:42Z)
How do lexical semantics affect translation? An empirical study [1.0152838128195467]
A distinguishing factor of natural language is that words are typically ordered according to the rules of the grammar of a given language. We investigate how the word ordering of and lexical similarity between the source and target language affect translation performance.
arXiv Detail & Related papers (2021-12-31T23:28:28Z)
FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning. During inference, the model makes predictions based on the text input in the target language and its translation in the source language. To tackle this issue, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.