Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
- URL: http://arxiv.org/abs/2403.19285v2
- Date: Wed, 29 May 2024 00:51:24 GMT
- Title: Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
- Authors: Chenming Tang, Zhixiang Wang, Yunfang Wu,
- Abstract summary: In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs)
Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features.
We propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees.
- Score: 13.87098305304058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs), where a few examples are demonstrated to evoke LLMs' power for a given task. How to select informative examples remains an open issue. Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features while ignoring deep syntax-level knowledge. In this paper, we propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees using Polynomial Distance. In addition, we propose an ensemble strategy combining examples selected by both word-level and syntax-level criteria. Experimental results between English and 6 common languages indicate that syntax can effectively enhancing ICL for MT, obtaining the highest COMET scores on 11 out of 12 translation directions.
Related papers
- SCOI: Syntax-augmented Coverage-based In-context Example Selection for Machine Translation [13.87098305304058]
In this work, we introduce syntactic knowledge to select better in-context examples for machine translation (MT)
We propose a new strategy, namely Syntax-augmented COverage-based In-context example selection (SCOI)
Our proposed SCOI obtains the highest average COMET score among all learning-free methods.
arXiv Detail & Related papers (2024-08-09T05:25:17Z) - In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation [20.704153242284114]
We focus on machine translation (MT), a task that has been shown to benefit from in-context translation examples.
No systematic studies have been published on how best to select examples, and mixed results have been reported on the usefulness of similarity-based selection.
We find that sentence embedding similarity can improve MT, especially for low-resource language directions.
arXiv Detail & Related papers (2024-08-01T09:07:32Z) - Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning [38.89119606657543]
In contrast to sentence-level translation, document-level translation (DOCMT) by large language models (LLMs) based on in-context learning faces two major challenges.
We propose a Context-Aware Prompting method (CAP) to generate more accurate, cohesive, and coherent translations via in-context learning.
We conduct extensive experiments across various DOCMT tasks, and the results demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-11T09:11:17Z) - ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL)
arXiv Detail & Related papers (2024-03-31T05:56:15Z) - Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction [8.655807096424732]
In this paper, we propose a novel ungrammatical-syntax-based in-context example selection strategy for grammatical error correction.
Specifically, we measure similarity of sentences based on their syntactic structures with diverse algorithms, and identify optimal ICL examples sharing the most similar ill-formed syntax to the test input.
arXiv Detail & Related papers (2024-03-28T10:05:57Z) - $Se^2$: Sequential Example Selection for In-Context Learning [83.17038582333716]
Large language models (LLMs) for in-context learning (ICL) need to be activated by demonstration examples.
Prior work has extensively explored the selection of examples for ICL, predominantly following the "select then organize" paradigm.
In this paper, we formulate the problem as a $Se$quential $Se$lection problem and introduce $Se2$, a sequential-aware method.
arXiv Detail & Related papers (2024-02-21T15:35:04Z) - kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest
Neighbor In-Context Learning [50.40636157214161]
Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language.
LLMs have achieved impressive performance in computer programs based on a natural language prompt.
This paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks.
arXiv Detail & Related papers (2023-12-17T17:26:50Z) - Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue
Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (textitCue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed textitCue-CoT method outperforms standard prompting methods in terms of both textithelpfulness and textitacceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z) - Finding Support Examples for In-Context Learning [73.90376920653507]
We propose LENS, a fiLter-thEN-Search method to tackle this challenge in two stages.
First we filter the dataset to obtain informative in-context examples individually.
Then we propose diversity-guided example search which iteratively refines and evaluates the selected example permutations.
arXiv Detail & Related papers (2023-02-27T06:32:45Z) - Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z) - Dynamic Context Selection for Document-level Neural Machine Translation
via Reinforcement Learning [55.18886832219127]
We propose an effective approach to select dynamic context for document-level translation.
A novel reward is proposed to encourage the selection and utilization of dynamic context sentences.
Experiments demonstrate that our approach can select adaptive context sentences for different source sentences.
arXiv Detail & Related papers (2020-10-09T01:05:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.