Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation
- URL: http://arxiv.org/abs/2210.13281v1
- Date: Mon, 24 Oct 2022 14:22:20 GMT
- Title: Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation
- Authors: Tsz Kin Lam, Eva Hasler, Felix Hieber
- Abstract summary: Influence functions (IF) have been shown to be effective in finding relevant training examples for classification tasks.
We propose two effective extensions to a state-of-the-art influence function and demonstrate them on the sub-problem of copied training examples.
- Score: 2.990760778216954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Customer feedback can be an important signal for improving commercial machine
translation systems. One solution for fixing specific translation errors is to
remove the related erroneous training instances followed by re-training of the
machine translation system, which we refer to as instance-specific data
filtering. Influence functions (IF) have been shown to be effective in finding
such relevant training examples for classification tasks such as image
classification, toxic speech detection, and textual entailment. Given a probing
instance, IF find influential training examples by measuring the similarity of
the probing instance with a set of training examples in gradient space. In this
work, we examine the use of influence functions for Neural Machine Translation
(NMT). We propose two effective extensions to a state-of-the-art influence
function and demonstrate on the sub-problem of copied training examples that IF
can be applied more generally than handcrafted regular expressions.
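
As a concrete illustration of that gradient-space similarity, here is a minimal PyTorch sketch. It is a Hessian-free simplification (plain gradient dot products, in the spirit of TracIn-style scores), not the paper's implementation; `model`, `loss_fn`, `probe`, and `train_examples` are placeholder names.

```python
import torch

def flat_grad(model, loss):
    """Flatten the gradient of `loss` w.r.t. all trainable parameters."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence_scores(model, loss_fn, probe, train_examples):
    """Rank training examples by gradient similarity to the probing instance.

    A high positive score means the training example moves the parameters in
    the same direction as the probe, i.e. it is likely influential for it.
    """
    g_probe = flat_grad(model, loss_fn(model, probe))
    return [torch.dot(g_probe, flat_grad(model, loss_fn(model, ex))).item()
            for ex in train_examples]
```

Full influence functions additionally weight this product by an inverse-Hessian term; plain gradient dot products or cosine similarities are common practical approximations.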
Related papers
- How Far Can In-Context Alignment Go? Exploring the State of In-Context Alignment [48.0254056812898]
In-Context Learning (ICL) can align Large Language Models with human preferences, a setting known as In-Context Alignment (ICA).
We divide context text into three categories: format, system prompt, and example.
Our findings indicate that the example part is crucial for enhancing the model's alignment capabilities.
arXiv Detail & Related papers (2024-06-17T12:38:48Z)
- In-context Examples Selection for Machine Translation [101.50473468507697]
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning.
For Machine Translation (MT), these examples are typically randomly sampled from the development dataset with a similar distribution as the evaluation set.
We show that the translation quality and the domain of the in-context examples matter, and that a single noisy, unrelated example can have a catastrophic impact on output quality.
arXiv Detail & Related papers (2022-12-05T17:25:15Z)
- Phrase-level Adversarial Example Generation for Neural Machine Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks, including LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
arXiv Detail & Related papers (2022-01-06T11:00:49Z)
- Efficient Estimation of Influence of a Training Instance [56.29080605123304]
We propose an efficient method for estimating the influence of a training instance on a neural network model.
Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance.
We demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.
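
Read literally, that suggests an estimator like the following sketch: compare the loss on a probe under the full network against the loss under the fixed sub-network that was prevented from learning the instance in question. This is an illustrative reading, not the paper's exact procedure; the per-instance binary masks are assumed to have been recorded during training.

```python
import torch

def loss_with_mask(model, loss_fn, example, mask=None):
    """Loss on `example`, optionally zero-masking part of the weights
    (the sub-network that never learned a given training instance)."""
    saved = None
    if mask is not None:
        saved = [p.detach().clone() for p in model.parameters()]
        with torch.no_grad():
            for p, m in zip(model.parameters(), mask):
                p.mul_(m)  # switch to the masked sub-network
    with torch.no_grad():
        loss = loss_fn(model, example).item()
    if saved is not None:
        with torch.no_grad():
            for p, s in zip(model.parameters(), saved):
                p.copy_(s)  # restore the full network
    return loss

def estimated_influence(model, loss_fn, instance_mask, probe):
    """Influence of a training instance on `probe`: how much the probe's loss
    rises when we drop to the sub-network that never saw that instance."""
    return (loss_with_mask(model, loss_fn, probe, instance_mask)
            - loss_with_mask(model, loss_fn, probe))
```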
arXiv Detail & Related papers (2020-12-08T04:31:38Z)
- Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation [3.3194866396158]
We propose a simple generative noise model to generate adversarial examples of ten different types.
We show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data.
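
As a rough illustration of what such a generative noise model can look like, here is a toy character-level perturber. The operations below are generic orthographic/punctuation perturbations chosen for illustration, not the paper's ten noise types.

```python
import random

def noisy(sentence: str, p: float = 0.05) -> str:
    """Inject simple orthographic and punctuation noise: per character, with
    probability p, drop it, flip its case, swap it with its neighbour, or
    delete punctuation. Illustrative only."""
    chars = list(sentence)
    out, i = [], 0
    while i < len(chars):
        c = chars[i]
        if random.random() < p:
            op = random.choice(["drop", "case", "swap", "punct"])
            if op == "drop" and c.isalpha():
                i += 1          # delete a letter
                continue
            if op == "case" and c.isalpha():
                c = c.swapcase()            # flip upper/lower case
            elif op == "swap" and i + 1 < len(chars):
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                c = chars[i]                # swap adjacent characters
            elif op == "punct" and c in ".,;:!?":
                i += 1          # delete punctuation
                continue
        out.append(c)
        i += 1
    return "".join(out)

print(noisy("Systems should tolerate typos, right?"))
```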
arXiv Detail & Related papers (2020-09-11T14:12:54Z)
- The Impact of Indirect Machine Translation on Sentiment Classification [6.719549885077474]
We propose employing a machine translation (MT) system to translate customer feedback into another language.
As performing a direct translation is not always possible, we explore the performance of automatic classifiers on sentences that have been translated.
We conduct several experiments to analyse the performance of our proposed sentiment classification system and discuss the advantages and drawbacks of classifying translated sentences.
arXiv Detail & Related papers (2020-08-25T20:30:21Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
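
One plausible way to render such a gradient-supervision objective in code is the sketch below, which nudges the model's input gradient at x toward its counterfactual x_cf. This is a simplified reading under stated assumptions (a differentiable scalar score), not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, x, x_cf):
    """Auxiliary loss: encourage the gradient of the model's score at x to
    point toward the counterfactual x_cf (the minimal change that flips the
    label). A simplified reading; the paper's objective may differ."""
    x = x.clone().requires_grad_(True)
    score = model(x).sum()  # reduce to a scalar so we can differentiate
    (grad_x,) = torch.autograd.grad(score, x, create_graph=True)
    direction = (x_cf - x).detach()  # vector from x to its counterfactual
    return 1.0 - F.cosine_similarity(grad_x.flatten(), direction.flatten(), dim=0)
```

Such a term would typically be added to the main task loss with a small weight.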
arXiv Detail & Related papers (2020-04-20T02:47:49Z) - RelatIF: Identifying Explanatory Training Examples via Relative
Influence [13.87851325824883]
We use influence functions to identify relevant training examples that one might hope "explain" the predictions of a machine learning model.
We introduce RelatIF, a new class of criteria for choosing relevant training examples by way of an optimization objective that places a constraint on global influence.
In empirical evaluations, we find that the examples returned by RelatIF are more intuitive when compared to those found using influence functions.
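
In gradient-similarity terms (reusing `flat_grad` from the sketch after the abstract), one simplified version of that normalization might look like the following; RelatIF's actual criteria involve inverse-Hessian terms, so treat this as a Hessian-free caricature.

```python
import torch

def relative_influence(g_probe: torch.Tensor, g_train: torch.Tensor) -> float:
    """Gradient similarity normalized by the training example's own gradient
    norm, so examples with large global influence do not dominate every
    explanation. A Hessian-free simplification of RelatIF's idea."""
    raw = torch.dot(g_probe, g_train)  # unnormalized influence score
    return (raw / (g_train.norm() + 1e-12)).item()
```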
arXiv Detail & Related papers (2020-03-25T20:59:54Z) - Robust Unsupervised Neural Machine Translation with Adversarial
Denoising Training [66.39561682517741]
Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT is that the large amounts of training text it requires are easy to collect.
In this paper, we explicitly take noisy data into consideration for the first time to improve the robustness of UNMT-based systems.
arXiv Detail & Related papers (2020-02-28T05:17:55Z)