Related papers: Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions

URL: http://arxiv.org/abs/2406.04606v1
Date: Fri, 7 Jun 2024 03:29:57 GMT
Title: Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions
Authors: Jingtan Wang, Xiaoqiang Lin, Rui Qiao, Chuan-Sheng Foo, Bryan Kian Hsiang Low
Abstract summary: We propose a notion of robustness on the sign of the instance score. We introduce an efficient fine-tuning-free approximation of the Shapley value for instance attribution.
Score: 38.87540833773233
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The increasing complexity of foundational models underscores the necessity for explainability, particularly for fine-tuning, the most widely used training method for adapting models to downstream tasks. Instance attribution, one type of explanation, attributes the model prediction to each training example by an instance score. However, the robustness of instance scores, specifically towards dataset resampling, has been overlooked. To bridge this gap, we propose a notion of robustness on the sign of the instance score. We theoretically and empirically demonstrate that the popular leave-one-out-based methods lack robustness, while the Shapley value behaves significantly better, but at a higher computational cost. Accordingly, we introduce an efficient fine-tuning-free approximation of the Shapley value (FreeShap) for instance attribution based on the neural tangent kernel. We empirically demonstrate that FreeShap outperforms other methods for instance attribution and other data-centric applications such as data removal, data selection, and wrong label detection, and further generalize our scale to large language models (LLMs). Our code is available at https://github.com/JTWang2000/FreeShap.
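
Illustration (not from the paper): the abstract contrasts leave-one-out scores with the Shapley value. The sketch below is not FreeShap and uses no neural tangent kernel; it only computes both kinds of instance score exactly for a tiny, made-up utility function. The names shapley_values, leave_one_out, and toy_utility are hypothetical, and the utility values are invented for illustration. In the toy setup, points 0 and 1 are redundant helpful examples, point 2 is independently helpful, and point 3 is harmful.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, utility):
    """Exact Shapley value of each of n points; utility maps a frozenset of indices to a float."""
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of a coalition of size k that excludes point i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of point i when added to coalition s.
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


def leave_one_out(n, utility):
    """Leave-one-out score: utility drop when a single point is removed from the full set."""
    full = frozenset(range(n))
    return [utility(full) - utility(full - {i}) for i in range(n)]


def toy_utility(subset):
    """Hypothetical utility (stand-in for validation performance of a model trained on subset)."""
    score = 0.0
    if 0 in subset or 1 in subset:  # points 0 and 1 are redundant with each other
        score += 1.0
    if 2 in subset:                 # point 2 helps on its own
        score += 0.5
    if 3 in subset:                 # point 3 hurts
        score -= 0.3
    return score


if __name__ == "__main__":
    n = 4
    print("Shapley:      ", [round(v, 3) for v in shapley_values(n, toy_utility)])
    print("Leave-one-out:", [round(v, 3) for v in leave_one_out(n, toy_utility)])
```

In this toy case leave-one-out assigns a score of zero to each of the two redundant helpful points, because removing either one alone changes nothing, while the Shapley value splits their joint contribution between them. Exact enumeration costs on the order of 2^n utility evaluations, which is why approximations are needed at any realistic dataset size.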

Related papers

Nonparametric Data Attribution for Diffusion Models [57.820618036556084]
Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. We propose a nonparametric attribution method that operates entirely on data, measuring influence via patch-level similarity between generated and training images.
arXiv Detail & Related papers (2025-10-16T03:37:16Z)
shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python [0.6562256987706128]
shapr is a versatile tool for generating Shapley value based prediction explanations. The shaprpy Python library brings the core capabilities of shapr to the Python ecosystem.
arXiv Detail & Related papers (2025-04-02T15:47:30Z)
AMUN: Adversarial Machine UNlearning [13.776549741449557]
Adversarial Machine UNlearning (AMUN) outperforms prior state-of-the-art (SOTA) methods for image classification. AMUN lowers the confidence of the model on the forget samples by fine-tuning the model on their corresponding adversarial examples.
arXiv Detail & Related papers (2025-03-02T14:36:31Z)
On Model Extrapolation in Marginal Shapley Values [0.0]
One of the most popular methods for model explainability is based on Shapley values. The marginal approach to calculating Shapley values leads to model extrapolation where it might not be well defined. We propose an approach which, while using marginal averaging, avoids model extrapolation and, with the addition of causal information, replicates causal Shapley values.
arXiv Detail & Related papers (2024-12-17T18:33:14Z)
Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts. Existing approaches require re-training models on different data subsets, which is computationally intensive. This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
Accelerated Shapley Value Approximation for Data Evaluation [3.707457963532597]
We show that the Shapley value of data points can be approximated more efficiently by leveraging structural properties of machine learning problems. Our analysis suggests that models trained on small subsets are in fact more important in the context of data valuation.
arXiv Detail & Related papers (2023-11-09T13:15:36Z)
An Efficient Shapley Value Computation for the Naive Bayes Classifier [0.0]
This article proposes an exact analytic expression of Shapley values in the case of the naive Bayes classifier. Results show that our Shapley proposal for the naive Bayes provides informative results with low algorithmic complexity.
arXiv Detail & Related papers (2023-07-31T14:39:10Z)
Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations. Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment. We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning [13.66570363867102]
We propose Beta Shapley, which is a substantial generalization of Data Shapley. Beta Shapley unifies several popular data valuation methods and includes data Shapley as a special case. We demonstrate that Beta Shapley outperforms state-of-the-art data valuation methods on several downstream ML tasks.
arXiv Detail & Related papers (2021-10-26T22:03:55Z)
An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different potential instance attribution methods agree with respect to the importance of training samples. We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
arXiv Detail & Related papers (2021-04-09T01:03:17Z)
Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models [6.423239719448169]
Shapley values are designed to attribute the difference between a model's prediction and an average baseline to the different features used as input to the model. We show how these 'causal' Shapley values can be derived for general causal graphs without sacrificing any of their desirable properties.
arXiv Detail & Related papers (2020-11-03T11:11:36Z)
Robust and On-the-fly Dataset Denoising for Image Classification [72.10311040730815]
On-the-fly Data Denoising (ODD) is robust to mislabeled examples, while introducing almost zero computational overhead compared to standard training. ODD is able to achieve state-of-the-art results on a wide range of datasets including real-world ones such as WebVision and Clothing1M.
arXiv Detail & Related papers (2020-03-24T03:59:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.