CTQScorer: Combining Multiple Features for In-context Example Selection
for Machine Translation
- URL: http://arxiv.org/abs/2305.14105v2
- Date: Sat, 21 Oct 2023 14:22:02 GMT
- Title: CTQScorer: Combining Multiple Features for In-context Example Selection
for Machine Translation
- Authors: Aswanth Kumar and Ratish Puduppully and Raj Dabre and Anoop
Kunchukuttan
- Abstract summary: We learn a regression model, CTQ Scorer, that selects examples based on multiple features in order to maximize the translation quality.
On multiple language pairs and language models, we show that CTQ Scorer helps significantly outperform random selection.
We also see an improvement of over 2.5 COMET points on average with respect to a strong BM25 retrieval-based baseline.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large language models have demonstrated the capability to perform machine
translation when the input is prompted with a few examples (in-context
learning). Translation quality depends on various features of the selected
examples, such as their quality and relevance, but previous work has
predominantly focused on individual features in isolation. In this paper, we
propose a general framework for combining different features influencing
example selection. We learn a regression model, CTQ Scorer (Contextual
Translation Quality), that selects examples based on multiple features in order
to maximize the translation quality. On multiple language pairs and language
models, we show that CTQ Scorer helps significantly outperform random selection
as well as strong single-factor baselines reported in the literature. We also
see an improvement of over 2.5 COMET points on average with respect to a strong
BM25 retrieval-based baseline.
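The core idea of the paper — score each candidate in-context example with a regression model over several features and pick the highest-scoring ones — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names (retrieval score, example quality, length ratio) and all numeric values are hypothetical, and the paper uses a richer feature set with quality labels derived from actual LLM translations.

```python
import numpy as np

# Hypothetical features for candidate in-context examples:
# each row = [retrieval_score, example_quality, length_ratio]
candidates = np.array([
    [0.90, 0.80, 1.0],
    [0.70, 0.95, 0.9],
    [0.50, 0.60, 1.1],
    [0.85, 0.40, 0.7],
])

# Offline training data: features of past examples paired with the
# downstream translation quality (e.g. a COMET score) they produced.
train_X = np.array([
    [0.80, 0.90, 1.0],
    [0.60, 0.70, 0.9],
    [0.90, 0.50, 1.2],
    [0.40, 0.80, 0.8],
    [0.70, 0.60, 1.0],
])
train_y = np.array([0.82, 0.74, 0.70, 0.68, 0.73])

# Fit a linear regression (least squares with a bias term) mapping
# example features to predicted translation quality.
X = np.hstack([train_X, np.ones((len(train_X), 1))])
weights, *_ = np.linalg.lstsq(X, train_y, rcond=None)

# Score every candidate and keep the top-k as in-context examples.
Xc = np.hstack([candidates, np.ones((len(candidates), 1))])
scores = Xc @ weights
top_k = np.argsort(scores)[::-1][:2]
```

The same selection loop works with any regressor; a linear model is used here only to keep the sketch dependency-light and self-contained.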
Related papers
- A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors [50.046717886067555]
Given a general language model and its aligned version, there exists a trade-off between the average reward and average log-likelihood of the strings under the general language model.
We provide a formal treatment of this issue and demonstrate how a choice of sampling adaptor allows for a selection of how much likelihood we exchange for the reward.
arXiv Detail & Related papers (2024-06-14T17:38:21Z)
- Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare [99.57567498494448]
We introduce Compare2Score, an all-around LMM-based no-reference IQA model.
During training, we generate scaled-up comparative instructions by comparing images from the same IQA dataset.
Experiments on nine IQA datasets validate that the Compare2Score effectively bridges text-defined comparative levels during training.
arXiv Detail & Related papers (2024-05-29T17:26:09Z)
- To token or not to token: A Comparative Study of Text Representations for Cross-Lingual Transfer [23.777874316083984]
We propose a scoring Language Quotient metric capable of providing a weighted representation of both zero-shot and few-shot evaluation combined.
Our analysis reveals that image-based models excel in cross-lingual transfer when languages are closely related and share visually similar scripts.
In dependency parsing tasks, where word relationships play a crucial role, models with a character-level focus outperform others.
arXiv Detail & Related papers (2023-10-12T06:59:10Z)
- Multilingual Few-Shot Learning via Language Model Retrieval [18.465566186549072]
Transformer-based language models have achieved remarkable success in few-shot in-context learning.
We conduct a study of retrieving semantically similar few-shot samples and using them as the context.
We evaluate the proposed method on five natural language understanding datasets related to intent detection, question classification, sentiment analysis, and topic classification.
arXiv Detail & Related papers (2023-06-19T14:27:21Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- In-context Examples Selection for Machine Translation [101.50473468507697]
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning.
For Machine Translation (MT), these examples are typically randomly sampled from the development dataset with a similar distribution as the evaluation set.
We show that the translation quality and the domain of the in-context examples matter, and that even a single noisy, unrelated example can have a catastrophic impact on output quality.
arXiv Detail & Related papers (2022-12-05T17:25:15Z)
- QAmeleon: Multilingual QA with Only 5 Examples [71.80611036543633]
We show how to leverage pre-trained language models under a few-shot learning setting.
Our approach, QAmeleon, uses a PLM to automatically generate multilingual data upon which QA models are trained.
Prompt tuning the PLM for data synthesis with only five examples per language delivers accuracy superior to translation-based baselines.
arXiv Detail & Related papers (2022-11-15T16:14:39Z)
- Generative Language Models for Paragraph-Level Question Generation [79.31199020420827]
Powerful generative models have led to recent progress in question generation (QG).
It is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches.
We introduce QG-Bench, a benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting.
arXiv Detail & Related papers (2022-10-08T10:24:39Z)
- Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation [45.77509642452541]
We introduce multilingual crossover encoder-decoder (mXEncDec) to fuse language pairs at an instance level.
Our approach interpolates instances from different language pairs into joint "crossover examples" in order to encourage sharing of input and output spaces across languages.
arXiv Detail & Related papers (2022-03-15T03:56:22Z)
- Ensemble-based Transfer Learning for Low-resource Machine Translation Quality Estimation [1.7188280334580195]
We focus on the Sentence-Level QE Shared Task of the Fifth Conference on Machine Translation (WMT20).
We propose an ensemble-based predictor-estimator QE model with transfer learning to overcome such QE data scarcity challenge.
The best performance is achieved by an ensemble combining models pretrained on individual languages as well as on different amounts of parallel training data, reaching a Pearson's correlation of 0.298.
arXiv Detail & Related papers (2021-05-17T06:02:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.