Repairing Neural Networks by Leaving the Right Past Behind
- URL: http://arxiv.org/abs/2207.04806v1
- Date: Mon, 11 Jul 2022 12:07:39 GMT
- Title: Repairing Neural Networks by Leaving the Right Past Behind
- Authors: Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li
- Abstract summary: Prediction failures of machine learning models often arise from deficiencies in training data.
This work develops a generic framework for both identifying training examples that have given rise to the target failure, and fixing the model through erasing information about them.
- Score: 23.78437548836594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prediction failures of machine learning models often arise from deficiencies
in training data, such as incorrect labels, outliers, and selection biases.
However, the data points responsible for a given failure mode are
generally not known a priori, let alone a mechanism for repairing the failure.
This work draws on the Bayesian view of continual learning, and develops a
generic framework for both identifying training examples that have given rise
to the target failure, and fixing the model through erasing information about
them. This framework naturally allows leveraging recent advances in continual
learning for this new problem of model repairment, while subsuming the existing
works on influence functions and data deletion as specific instances.
Experimentally, the proposed approach outperforms the baselines for both
identification of detrimental training data and fixing model failures in a
generalisable manner.
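As a concrete illustration of the influence-function special case that the framework subsumes, the sketch below scores each training point by how much up-weighting it would change the loss on a chosen failure example. This is a minimal sketch under illustrative assumptions (a toy L2-regularised logistic-regression model, synthetic data, and an arbitrary failure point), not the authors' implementation.

```python
# Influence-function identification of detrimental training points (sketch).
# The data, model, and regularisation strength are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 200 training points, 2 features, with a few deliberately flipped labels.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
flipped = rng.choice(200, size=10, replace=False)
y[flipped] = 1.0 - y[flipped]

# Fit L2-regularised logistic regression by gradient descent.
lam, theta = 1e-2, np.zeros(2)
for _ in range(500):
    p = sigmoid(X @ theta)
    theta -= 0.5 * (X.T @ (p - y) / len(y) + lam * theta)

# A "failure" example whose loss we want to attribute to training points (assumed given).
x_fail, y_fail = np.array([1.0, 1.0]), 1.0

# Influence of up-weighting training point z_i on the failure loss:
#   I(z_i) = -grad_L(z_fail)^T H^{-1} grad_L(z_i)
p = sigmoid(X @ theta)
H = (X * (p * (1 - p))[:, None]).T @ X / len(y) + lam * np.eye(2)  # loss Hessian
g_fail = (sigmoid(x_fail @ theta) - y_fail) * x_fail               # failure-loss gradient
H_inv_g = np.linalg.solve(H, g_fail)
scores = -(X * (p - y)[:, None]) @ H_inv_g                          # one score per training point

# The largest positive scores mark points whose up-weighting most increases the
# failure loss, i.e. candidate detrimental examples whose information a repair
# step would then erase.
suspects = np.argsort(scores)[-10:]
print("flagged:", sorted(suspects), "deliberately corrupted:", sorted(flipped))
```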
Related papers
- RESTOR: Knowledge Recovery through Machine Unlearning [71.75834077528305]
Large language models trained on web-scale corpora can memorize undesirable datapoints.
Many machine unlearning methods have been proposed that aim to 'erase' these datapoints from trained models.
We propose the RESTOR framework for machine unlearning, defined along several dimensions.
arXiv Detail & Related papers (2024-10-31T20:54:35Z)
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel smooth regularization for fine-tuning that rectifies these structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Partially Blinded Unlearning: Class Unlearning for Deep Networks a Bayesian Perspective [4.31734012105466]
Machine Unlearning is the process of selectively discarding information designated to specific sets or classes of data from a pre-trained model.
We propose a methodology tailored for the purposeful elimination of information linked to a specific class of data from a pre-trained classification network.
Our novel approach, termed Partially-Blinded Unlearning (PBU), surpasses existing state-of-the-art class unlearning methods in effectiveness.
arXiv Detail & Related papers (2024-03-24T17:33:22Z)
- Root Causing Prediction Anomalies Using Explainable AI [3.970146574042422]
We present a novel application of explainable AI (XAI) for root-causing performance degradation in machine learning models.
A single feature corruption can cause cascading feature, label and concept drifts.
We have successfully applied this technique to improve the reliability of models used in personalized advertising.
arXiv Detail & Related papers (2024-03-04T19:38:50Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model in both settings: task-specific biased models with prior knowledge and self-ensemble biased models without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters; a minimal sketch of this kind of update appears after this list.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Graceful Degradation and Related Fields [0.0]
Graceful degradation refers to the optimisation of model performance as it encounters out-of-distribution data.
This work presents a definition and discussion of graceful degradation and where it can be applied in deployed visual systems.
arXiv Detail & Related papers (2021-06-21T13:56:41Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We present a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- Understanding the Failure Modes of Out-of-Distribution Generalization [35.00563456450452]
Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training time.
In this work, we identify the fundamental factors that give rise to this behavior, by explaining why models fail this way even in easy-to-learn tasks.
arXiv Detail & Related papers (2020-10-29T17:19:03Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
We propose PRoFILE, a novel feature importance estimation method that remains accurate and robust under distribution shifts.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
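The closed-form parameter update described under "Machine Unlearning of Features and Labels" above can be sketched as a single Newton-style shift that approximates retraining without the forgotten points. The snippet below is a minimal, self-contained sketch using the same kind of toy L2-regularised logistic-regression setup as the earlier example; the dataset, forget set, and hyperparameters are illustrative assumptions rather than that paper's actual procedure.

```python
# Closed-form influence-function unlearning update (sketch, illustrative assumptions).
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, lam=1e-2, steps=2000, lr=0.5):
    """Gradient descent for L2-regularised logistic regression."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ theta)
        theta -= lr * (X.T @ (p - y) / len(y) + lam * theta)
    return theta

# Toy dataset and a small set of points we are asked to forget.
X = rng.normal(size=(300, 3))
y = (X @ np.array([1.0, -1.0, 0.5]) > 0).astype(float)
forget, keep = np.arange(15), np.arange(15, len(y))

lam = 1e-2
theta = fit(X, y, lam)                       # model trained on all data

# One Newton-style step: theta' = theta + H^{-1} * (forgotten points' share of the
# mean gradient), where H is the Hessian of the regularised training loss.
p = sigmoid(X @ theta)
H = (X * (p * (1 - p))[:, None]).T @ X / len(y) + lam * np.eye(X.shape[1])
g_forget = X[forget].T @ (p[forget] - y[forget]) / len(y)
theta_unlearned = theta + np.linalg.solve(H, g_forget)

# Reference point: actually retraining on the retained data only.
theta_retrained = fit(X[keep], y[keep], lam)

print("trained   :", np.round(theta, 3))
print("unlearned :", np.round(theta_unlearned, 3))
print("retrained :", np.round(theta_retrained, 3))
```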
This list is automatically generated from the titles and abstracts of the papers in this site.