Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
- URL: http://arxiv.org/abs/2403.19285v2
- Date: Wed, 29 May 2024 00:51:24 GMT
- Title: Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
- Authors: Chenming Tang, Zhixiang Wang, Yunfang Wu
- Abstract summary: In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs).
Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features.
We propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees.
- Score: 13.87098305304058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs), where a few examples are demonstrated to evoke LLMs' power for a given task. How to select informative examples remains an open issue. Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features while ignoring deep syntax-level knowledge. In this paper, we propose a syntax-based in-context example selection method for MT, computing the syntactic similarity between dependency trees using Polynomial Distance. In addition, we propose an ensemble strategy that combines examples selected by both word-level and syntax-level criteria. Experimental results on translation between English and 6 common languages indicate that syntax can effectively enhance ICL for MT, obtaining the highest COMET scores on 11 out of 12 translation directions.
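The abstract names the ingredients (dependency parsing, a tree distance, a word/syntax ensemble) without showing code. Below is a minimal sketch of the selection loop, assuming spaCy for dependency parsing and substituting a simple Jaccard overlap of dependency triples for the paper's Polynomial Distance; every identifier here is illustrative, not taken from the paper.

```python
# Minimal sketch of syntax-aware in-context example selection.
# Assumptions (not from the paper): spaCy's "en_core_web_sm" parser, and
# Jaccard overlap of (head POS, relation, child POS) triples as a simple
# stand-in for the paper's Polynomial Distance between dependency trees.
import spacy

nlp = spacy.load("en_core_web_sm")

def dep_fragments(sentence: str) -> set:
    """Reduce a dependency tree to its set of (head POS, relation, child POS) triples."""
    doc = nlp(sentence)
    return {(tok.head.pos_, tok.dep_, tok.pos_) for tok in doc}

def syntactic_similarity(a: str, b: str) -> float:
    """Jaccard similarity between two sentences' dependency fragments (proxy metric)."""
    fa, fb = dep_fragments(a), dep_fragments(b)
    return len(fa & fb) / max(len(fa | fb), 1)

def syntax_rank(query: str, pool: list[str]) -> list[str]:
    """Pool sentences ordered by how much their trees resemble the query's."""
    return sorted(pool, key=lambda s: syntactic_similarity(query, s), reverse=True)

def ensemble_select(query: str, pool: list[str], word_rank: list[str], k: int = 4) -> list[str]:
    """Sketch of the ensemble idea: alternate picks between a caller-supplied
    word-level ranking (e.g., BM25) and the syntax-level ranking above."""
    picked = []
    for w, s in zip(word_rank, syntax_rank(query, pool)):
        for cand in (w, s):
            if cand not in picked:
                picked.append(cand)
            if len(picked) == k:
                return picked
    return picked
```

The actual Polynomial Distance maps each dependency tree to a polynomial and compares the two polynomials; the Jaccard proxy above only keeps the sketch self-contained and runnable.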
Related papers
- Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu [53.437954702561065]
In-context machine translation (MT) with large language models (LLMs) is a promising approach for low-resource MT.
This study systematically investigates how each resource and its quality affect translation performance, using the Manchu language as a case study.
Our results indicate that high-quality dictionaries and good parallel examples are very helpful, while grammars hardly help.
arXiv Detail & Related papers (2025-02-17T14:53:49Z)
- PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks [57.86928556668849]
Large Language Models (LLMs) have recently demonstrated impressive few-shot learning capabilities through in-context learning (ICL).
ICL performance is highly dependent on the choice of few-shot demonstrations, making the selection of optimal examples a persistent research challenge.
In this work, we propose PromptRefine, a novel Alternating Minimization approach for example selection that improves ICL performance on low-resource Indic languages.
arXiv Detail & Related papers (2024-12-07T17:51:31Z)
- SCOI: Syntax-augmented Coverage-based In-context Example Selection for Machine Translation [13.87098305304058]
In this work, we introduce syntactic knowledge to select better in-context examples for machine translation (MT).
We propose a new strategy, namely Syntax-augmented COverage-based In-context example selection (SCOI).
Our proposed SCOI obtains the highest average COMET score among all learning-free methods.
arXiv Detail & Related papers (2024-08-09T05:25:17Z)
- In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation [20.704153242284114]
We focus on machine translation (MT), a task that has been shown to benefit from in-context translation examples.
No systematic studies have been published on how best to select examples, and mixed results have been reported on the usefulness of similarity-based selection.
We find that sentence embedding similarity can improve MT, especially for low-resource language directions.
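To make similarity-based selection concrete, here is a minimal sketch, assuming the sentence-transformers library with an off-the-shelf multilingual encoder; the model name and function names are illustrative choices, not taken from the paper.

```python
# Minimal sketch of picking in-context MT examples by sentence-embedding
# similarity. Assumption: sentence-transformers with a multilingual encoder
# (this specific model is an illustrative choice, not from the paper).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def select_by_embedding(query: str, pool_sources: list[str], k: int = 4) -> list[int]:
    """Indices of the k pool sentences closest to the query in embedding space."""
    emb = model.encode([query] + pool_sources, normalize_embeddings=True)
    sims = emb[1:] @ emb[0]  # cosine similarity, since embeddings are unit-norm
    return np.argsort(-sims)[:k].tolist()
```

The selected source sentences, paired with their reference translations, would then be prepended to the prompt as demonstrations.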
arXiv Detail & Related papers (2024-08-01T09:07:32Z)
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies heavily on the choice of demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction [8.655807096424732]
In this paper, we propose a novel ungrammatical-syntax-based in-context example selection strategy for grammatical error correction.
Specifically, we measure the similarity of sentences based on their syntactic structures using diverse algorithms, and identify optimal ICL examples that share the most similar ill-formed syntax with the test input.
arXiv Detail & Related papers (2024-03-28T10:05:57Z)
- $Se^2$: Sequential Example Selection for In-Context Learning [83.17038582333716]
For in-context learning (ICL), large language models (LLMs) need to be activated by demonstration examples.
Prior work has extensively explored the selection of examples for ICL, predominantly following the "select then organize" paradigm.
In this paper, we formulate example selection as a $Se$quential $Se$lection problem and introduce $Se^2$, a sequential-aware method.
arXiv Detail & Related papers (2024-02-21T15:35:04Z)
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
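The summary names the idea without showing it; a rough sketch of a two-step, cue-based prompting flow might look like the following, where the prompt wording and the ask_llm helper are hypothetical stand-ins, not taken from the paper.

```python
# Rough sketch of two-step, cue-based chain-of-thought prompting.
# `ask_llm` is a hypothetical stand-in for any chat-completion client.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def cue_cot_respond(dialogue: str) -> str:
    # Step 1: surface the linguistic cues (e.g., the user's status, emotion,
    # or personality) hidden in the dialogue context.
    cues = ask_llm(
        f"Dialogue:\n{dialogue}\n\n"
        "What does this dialogue reveal about the user's current status, "
        "emotion, and personality? Answer briefly."
    )
    # Step 2: respond, conditioned on the inferred cues.
    return ask_llm(
        f"Dialogue:\n{dialogue}\n\nInferred user cues: {cues}\n\n"
        "Given these cues, write a personalized and engaging response."
    )
```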
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
- Finding Support Examples for In-Context Learning [73.90376920653507]
We propose LENS, a fiLter-thEN-Search method, to tackle the challenge of finding support examples in two stages.
First, we filter the dataset to obtain individually informative in-context examples.
Then, we propose a diversity-guided example search that iteratively refines and evaluates the selected example permutations.
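To make the two-stage structure concrete, here is a generic filter-then-search skeleton with caller-supplied scoring functions; it mirrors only the stated shape of the method, not the actual LENS scoring or search algorithms.

```python
# Generic filter-then-search skeleton (structure only; the real LENS uses
# model-based informativeness scores and a diversity-guided search).
import itertools

def filter_then_search(pool, info_score, set_score, m=8, k=3):
    """info_score(example) rates one example; set_score(sequence) rates an ordered set."""
    # Stage 1: keep the m individually most informative examples.
    shortlist = sorted(pool, key=info_score, reverse=True)[:m]
    # Stage 2: score ordered k-permutations of the shortlist, keep the best.
    best = max(itertools.permutations(shortlist, k), key=set_score)
    return list(best)
```

Exhaustive permutation search is only viable for tiny m and k (8 choose 3 ordered is already 336 candidates); the paper's diversity-guided search exists precisely to sidestep this blow-up.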
arXiv Detail & Related papers (2023-02-27T06:32:45Z)
- Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.