Class based Influence Functions for Error Detection
- URL: http://arxiv.org/abs/2305.01384v1
- Date: Tue, 2 May 2023 13:01:39 GMT
- Title: Class based Influence Functions for Error Detection
- Authors: Thang Nguyen-Duc, Hoang Thanh-Tung, Quan Hung Tran, Dang Huu-Tien,
Hieu Ngoc Nguyen, Anh T. V. Dau, Nghi D. Q. Bui
- Abstract summary: Influence functions (IFs) are unstable when applied to deep networks.
We show that IFs are unreliable when the two data points belong to two different classes.
Our solution leverages class information to improve the stability of IFs.
- Score: 12.925739281660938
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Influence functions (IFs) are a powerful tool for detecting anomalous
examples in large scale datasets. However, they are unstable when applied to
deep networks. In this paper, we provide an explanation for the instability of
IFs and develop a solution to this problem. We show that IFs are unreliable
when the two data points belong to two different classes. Our solution
leverages class information to improve the stability of IFs. Extensive
experiments show that our modification significantly improves the performance
and stability of IFs while incurring no additional computational cost.
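The classical influence score that the abstract builds on can be written as I(z_train, z_test) = -∇L(z_test)ᵀ H⁻¹ ∇L(z_train), where H is the Hessian of the training loss at the learned parameters. The sketch below illustrates this computation on a toy logistic-regression model, and its last lines sketch the class-based restriction the paper proposes (scoring only train/test pairs from the same class). All names, the damping term, and the training setup are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_loss(w, x, y):
    # Gradient of the logistic loss for a single example (x, y), y in {0, 1}.
    return (sigmoid(x @ w) - y) * x

def hessian(w, X, lam=1e-2):
    # Damped Hessian of the mean logistic loss; the damping term lam
    # (an assumption here) keeps the matrix invertible.
    p = sigmoid(X @ w)
    H = (X.T * (p * (1 - p))) @ X / len(X)
    return H + lam * np.eye(len(w))

def influence(w, H_inv, x_tr, y_tr, x_te, y_te):
    # Classical influence of a training point on a test point:
    # -grad L(z_test)^T  H^{-1}  grad L(z_train).
    return -grad_loss(w, x_te, y_te) @ H_inv @ grad_loss(w, x_tr, y_tr)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = (X @ w_true > 0).astype(float)

# Crude gradient-descent training, just enough to get usable parameters.
w = np.zeros(5)
for _ in range(500):
    w -= 0.5 * np.mean([grad_loss(w, xi, yi) for xi, yi in zip(X, y)], axis=0)

H_inv = np.linalg.inv(hessian(w, X))

# Class-based restriction (the paper's key idea, sketched): score only
# training points whose label matches the test point's class.
x_test, y_test = X[0], y[0]
same_class = [i for i in range(1, len(X)) if y[i] == y_test]
scores = {i: influence(w, H_inv, X[i], y[i], x_test, y_test) for i in same_class}
```

Restricting the candidate set to the test point's class costs nothing extra, since the cross-class pairs whose scores the paper shows to be unreliable are simply never computed.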
Related papers
- Deeper Understanding of Black-box Predictions via Generalized Influence Functions [6.649753747542211]
Influence functions (IFs) elucidate how data changes a model.
The increasing size and non-interpretability of large-scale models make IFs inaccurate.
We introduce generalized IFs, precisely estimating target parameters' influence while nullifying nuisance gradient changes on fixed parameters.
arXiv Detail & Related papers (2023-12-09T14:17:12Z)
- Spuriosity Didn't Kill the Classifier: Using Invariant Predictions to Harness Spurious Features [19.312258609611686]
Stable Feature Boosting (SFB) is an algorithm for learning a predictor that separates stable and conditionally-independent unstable features.
We show that SFB can learn an asymptotically optimal predictor without test-domain labels.
Empirically, we demonstrate the effectiveness of SFB on real and synthetic data.
arXiv Detail & Related papers (2023-07-19T12:15:06Z)
- Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation [135.80439360370556]
We propose a diversity-enhancing generative network (DEG-Net) for the FHA problem.
It can generate diverse unlabeled data with the help of a kernel independence measure: the Hilbert-Schmidt independence criterion (HSIC).
arXiv Detail & Related papers (2023-07-12T06:29:02Z)
- If Influence Functions are the Answer, Then What is the Question? [7.873458431535409]
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters.
While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks.
arXiv Detail & Related papers (2022-09-12T16:17:43Z)
- SLA$^2$P: Self-supervised Anomaly Detection with Adversarial Perturbation [77.71161225100927]
Anomaly detection is a fundamental yet challenging problem in machine learning.
We propose a novel and powerful framework, dubbed SLA$^2$P, for unsupervised anomaly detection.
arXiv Detail & Related papers (2021-11-25T03:53:43Z)
- Stability of SGD: Tightness Analysis and Improved Bounds [8.831597193643628]
Stochastic Gradient Descent (SGD) based methods have been widely used for training large-scale machine learning models that generalize well in practice.
This paper addresses the question: is the analysis of [18] tight for smooth functions, and if not, for what kinds of loss functions and data can the analysis be improved?
arXiv Detail & Related papers (2021-02-10T05:43:27Z) - FastIF: Scalable Influence Functions for Efficient Model Interpretation
and Debugging [112.19994766375231]
Influence functions approximate the 'influences' of training data-points on test predictions.
We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time.
Our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.
arXiv Detail & Related papers (2020-12-31T18:02:34Z) - Active Class Incremental Learning for Imbalanced Datasets [10.680349952226935]
Incremental Learning (IL) allows AI systems to adapt to streamed data.
Most existing algorithms make two strong hypotheses which reduce the realism of the incremental scenario.
We introduce sample acquisition functions which tackle imbalance and are compatible with IL constraints.
arXiv Detail & Related papers (2020-08-25T12:47:09Z)
- Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)
- Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important to get high-quality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
- The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions [93.62888099134028]
We find that the performance of state-of-the-art models on Natural Language Inference (NLI) and Reading Comprehension (RC) analysis/stress sets can be highly unstable.
This raises three questions: (1) How will the instability affect the reliability of the conclusions drawn based on these analysis sets?
We give both theoretical explanations and empirical evidence regarding the source of the instability.
arXiv Detail & Related papers (2020-04-28T15:41:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.