Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation
- URL: http://arxiv.org/abs/2210.13281v1
- Date: Mon, 24 Oct 2022 14:22:20 GMT
- Title: Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation
- Authors: Tsz Kin Lam, Eva Hasler, Felix Hieber
- Abstract summary: Influence functions (IF) have been shown to be effective in finding relevant training examples for classification tasks.
We propose two effective extensions to a state-of-the-art influence function and demonstrate them on the sub-problem of copied training examples.
- Score: 2.990760778216954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Customer feedback can be an important signal for improving commercial machine
translation systems. One solution for fixing specific translation errors is to
remove the related erroneous training instances followed by re-training of the
machine translation system, which we refer to as instance-specific data
filtering. Influence functions (IF) have been shown to be effective in finding
such relevant training examples for classification tasks such as image
classification, toxic speech detection, and textual entailment. Given a probing
instance, IF find influential training examples by measuring the similarity of
the probing instance with a set of training examples in gradient space. In this
work, we examine the use of influence functions for Neural Machine Translation
(NMT). We propose two effective extensions to a state-of-the-art influence
function and demonstrate on the sub-problem of copied training examples that IF
can be applied more generally than handcrafted regular expressions.
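
As a concrete illustration of that gradient-space similarity, here is a minimal PyTorch sketch. It is a Hessian-free simplification (plain gradient dot products, in the spirit of TracIn-style scores), not the paper's implementation; `model`, `loss_fn`, `probe`, and `train_examples` are placeholder names.

```python
import torch

def flat_grad(model, loss):
    """Flatten the gradient of `loss` w.r.t. all trainable parameters."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence_scores(model, loss_fn, probe, train_examples):
    """Rank training examples by gradient similarity to the probing instance.

    A high positive score means the training example moves the parameters in
    the same direction as the probe, i.e. it is likely influential for it.
    """
    g_probe = flat_grad(model, loss_fn(model, probe))
    return [torch.dot(g_probe, flat_grad(model, loss_fn(model, ex))).item()
            for ex in train_examples]
```

Full influence functions additionally weight this product by an inverse-Hessian term; plain gradient dot products or cosine similarities are common practical approximations.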
Related papers
- How Far Can In-Context Alignment Go? Exploring the State of In-Context Alignment [48.0254056812898]
In-Context Learning (ICL) can align Large Language Models with human preferences, a setting known as In-Context Alignment (ICA).
We divide context text into three categories: format, system prompt, and example.
Our findings indicate that the example part is crucial for enhancing the model's alignment capabilities.
arXiv Detail & Related papers (2024-06-17T12:38:48Z)
- In-context Examples Selection for Machine Translation [101.50473468507697]
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning.
For Machine Translation (MT), these examples are typically randomly sampled from the development dataset with a similar distribution as the evaluation set.
We show that the translation quality and the domain of the in-context examples matter, and that a single noisy, unrelated example can have a catastrophic impact on output quality.
arXiv Detail & Related papers (2022-12-05T17:25:15Z)
- Phrase-level Adversarial Example Generation for Neural Machine Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks, including LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
arXiv Detail & Related papers (2022-01-06T11:00:49Z)
- Efficient Estimation of Influence of a Training Instance [56.29080605123304]
We propose an efficient method for estimating the influence of a training instance on a neural network model.
Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance.
We demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.
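
Read literally, that suggests an estimator like the following sketch: compare the loss on a probe under the full network against the loss under the fixed sub-network that was prevented from learning the instance in question. This is an illustrative reading, not the paper's exact procedure; the per-instance binary masks are assumed to have been recorded during training.

```python
import torch

def loss_with_mask(model, loss_fn, example, mask=None):
    """Loss on `example`, optionally zero-masking part of the weights
    (the sub-network that never learned a given training instance)."""
    saved = None
    if mask is not None:
        saved = [p.detach().clone() for p in model.parameters()]
        with torch.no_grad():
            for p, m in zip(model.parameters(), mask):
                p.mul_(m)  # switch to the masked sub-network
    with torch.no_grad():
        loss = loss_fn(model, example).item()
    if saved is not None:
        with torch.no_grad():
            for p, s in zip(model.parameters(), saved):
                p.copy_(s)  # restore the full network
    return loss

def estimated_influence(model, loss_fn, instance_mask, probe):
    """Influence of a training instance on `probe`: how much the probe's loss
    rises when we drop to the sub-network that never saw that instance."""
    return (loss_with_mask(model, loss_fn, probe, instance_mask)
            - loss_with_mask(model, loss_fn, probe))
```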
arXiv Detail & Related papers (2020-12-08T04:31:38Z)
- Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation [3.3194866396158]
We propose a simple generative noise model to generate adversarial examples of ten different types.
We show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data.
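
As a rough illustration of what such a generative noise model can look like, here is a toy character-level perturber. The operations below are generic orthographic/punctuation perturbations chosen for illustration, not the paper's ten noise types.

```python
import random

def noisy(sentence: str, p: float = 0.05) -> str:
    """Inject simple orthographic and punctuation noise: per character, with
    probability p, drop it, flip its case, swap it with its neighbour, or
    delete punctuation. Illustrative only."""
    chars = list(sentence)
    out, i = [], 0
    while i < len(chars):
        c = chars[i]
        if random.random() < p:
            op = random.choice(["drop", "case", "swap", "punct"])
            if op == "drop" and c.isalpha():
                i += 1          # delete a letter
                continue
            if op == "case" and c.isalpha():
                c = c.swapcase()            # flip upper/lower case
            elif op == "swap" and i + 1 < len(chars):
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                c = chars[i]                # swap adjacent characters
            elif op == "punct" and c in ".,;:!?":
                i += 1          # delete punctuation
                continue
        out.append(c)
        i += 1
    return "".join(out)

print(noisy("Systems should tolerate typos, right?"))
```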
arXiv Detail & Related papers (2020-09-11T14:12:54Z)
- The Impact of Indirect Machine Translation on Sentiment Classification [6.719549885077474]
We propose employing a machine translation (MT) system to translate customer feedback into another language.
As performing a direct translation is not always possible, we explore the performance of automatic classifiers on sentences that have been translated.
We conduct several experiments to analyse the performance of our proposed sentiment classification system and discuss the advantages and drawbacks of classifying translated sentences.
arXiv Detail & Related papers (2020-08-25T20:30:21Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
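
One plausible way to render such a gradient-supervision objective in code is the sketch below, which nudges the model's input gradient at x toward its counterfactual x_cf. This is a simplified reading under stated assumptions (a differentiable scalar score), not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, x, x_cf):
    """Auxiliary loss: encourage the gradient of the model's score at x to
    point toward the counterfactual x_cf (the minimal change that flips the
    label). A simplified reading; the paper's objective may differ."""
    x = x.clone().requires_grad_(True)
    score = model(x).sum()  # reduce to a scalar so we can differentiate
    (grad_x,) = torch.autograd.grad(score, x, create_graph=True)
    direction = (x_cf - x).detach()  # vector from x to its counterfactual
    return 1.0 - F.cosine_similarity(grad_x.flatten(), direction.flatten(), dim=0)
```

Such a term would typically be added to the main task loss with a small weight.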
arXiv Detail & Related papers (2020-04-20T02:47:49Z) - RelatIF: Identifying Explanatory Training Examples via Relative
Influence [13.87851325824883]
We use influence functions to identify relevant training examples that one might hope "explain" the predictions of a machine learning model.
We introduce RelatIF, a new class of criteria for choosing relevant training examples by way of an optimization objective that places a constraint on global influence.
In empirical evaluations, we find that the examples returned by RelatIF are more intuitive when compared to those found using influence functions.
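
In gradient-similarity terms (reusing `flat_grad` from the sketch after the abstract), one simplified version of that normalization might look like the following; RelatIF's actual criteria involve inverse-Hessian terms, so treat this as a Hessian-free caricature.

```python
import torch

def relative_influence(g_probe: torch.Tensor, g_train: torch.Tensor) -> float:
    """Gradient similarity normalized by the training example's own gradient
    norm, so examples with large global influence do not dominate every
    explanation. A Hessian-free simplification of RelatIF's idea."""
    raw = torch.dot(g_probe, g_train)  # unnormalized influence score
    return (raw / (g_train.norm() + 1e-12)).item()
```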
arXiv Detail & Related papers (2020-03-25T20:59:54Z) - Robust Unsupervised Neural Machine Translation with Adversarial
Denoising Training [66.39561682517741]
Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT is that the large amounts of training text it requires are easy to collect.
In this paper, we explicitly take noisy data into consideration for the first time to improve the robustness of UNMT-based systems.
arXiv Detail & Related papers (2020-02-28T05:17:55Z)