Distilling Model Failures as Directions in Latent Space
- URL: http://arxiv.org/abs/2206.14754v1
- Date: Wed, 29 Jun 2022 16:35:24 GMT
- Title: Distilling Model Failures as Directions in Latent Space
- Authors: Saachi Jain, Hannah Lawrence, Ankur Moitra, Aleksander Madry
- Abstract summary: We present a scalable method for automatically distilling a model's failure modes.
We harness linear classifiers to identify consistent error patterns, and induce a natural representation of these failure modes as directions within the feature space.
We demonstrate that this framework allows us to discover and automatically caption challenging subpopulations within the training dataset, and intervene to improve the model's performance on these subpopulations.
- Score: 87.30726685335098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing methods for isolating hard subpopulations and spurious correlations
in datasets often require human intervention. This can make these methods
labor-intensive and dataset-specific. To address these shortcomings, we present
a scalable method for automatically distilling a model's failure modes.
Specifically, we harness linear classifiers to identify consistent error
patterns, and, in turn, induce a natural representation of these failure modes
as directions within the feature space. We demonstrate that this framework
allows us to discover and automatically caption challenging subpopulations
within the training dataset, and intervene to improve the model's performance
on these subpopulations. Code available at
https://github.com/MadryLab/failure-directions
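The core recipe lends itself to a compact illustration. The sketch below assumes pre-extracted latent features (e.g., from the model's penultimate layer or a CLIP-style encoder) and uses scikit-learn's linear SVM; the normalization and hyperparameters are illustrative choices, not the authors' exact pipeline. It fits a linear classifier that separates a class's correctly and incorrectly predicted validation examples and reads a failure-mode direction off the weight vector:

```python
# Minimal sketch: recover a per-class "failure direction" by fitting a linear
# classifier that separates correctly vs. incorrectly classified examples in a
# shared latent space. Feature choice, normalization, and hyperparameters are
# illustrative assumptions, not the authors' exact pipeline.
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

def failure_direction(latents, model_correct):
    """latents: (n, d) latent features for one class's validation examples.
    model_correct: (n,) boolean array, True where the model was right."""
    X = normalize(latents)                    # unit-norm feature rows
    svm = LinearSVC(C=1.0, max_iter=10000)
    svm.fit(X, model_correct.astype(int))     # 1 = correct, 0 = error
    w = svm.coef_.ravel()
    direction = w / np.linalg.norm(w)         # candidate failure-mode direction
    scores = X @ direction                    # lower = more error-like
    return direction, scores
```

Examples with the most negative scores along this direction are the ones that most strongly exhibit the distilled failure mode; per the abstract, the framework then captions such subpopulations automatically and intervenes (e.g., by reweighting them during training) to improve performance on them.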
Related papers
- Automatic Discovery and Assessment of Interpretable Systematic Errors in Semantic Segmentation [0.5242869847419834]
This paper presents a novel method for discovering systematic errors in segmentation models.
We leverage multimodal foundation models to retrieve error cases and use conceptual linkage, together with the nature of the errors, to study how systematic these errors are.
Our work opens an avenue for model analysis and intervention that has so far been underexplored in semantic segmentation.
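As a rough, hedged illustration of that retrieval-and-grouping idea (not the paper's actual pipeline): embed the model's error cases with any CLIP-style image-text encoder, cluster the embeddings, and name each cluster by its closest concept phrase. The embeddings (`error_embeddings`, `concept_embeddings`) and the concept list are assumed inputs from such an encoder.

```python
# Hedged sketch: group error cases into candidate systematic-error clusters and
# label each cluster with the nearest concept phrase. Embeddings are assumed to
# come from a CLIP-style encoder; this is not the paper's specific method.
import numpy as np
from sklearn.cluster import KMeans

def group_errors(error_embeddings, concept_texts, concept_embeddings, k=5):
    """error_embeddings: (n, d); concept_embeddings: (m, d); concept_texts: m phrases."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(error_embeddings)
    groups = []
    for c in range(k):
        center = km.cluster_centers_[c]
        sims = concept_embeddings @ center / (
            np.linalg.norm(concept_embeddings, axis=1) * np.linalg.norm(center) + 1e-8)
        name = concept_texts[int(np.argmax(sims))]   # closest concept phrase
        members = np.where(km.labels_ == c)[0]       # error cases in this group
        groups.append((name, members))
    return groups
```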
arXiv Detail & Related papers (2024-11-16T17:31:37Z)
- Distributionally robust self-supervised learning for tabular data [2.942619386779508]
Learning robust representations in the presence of error slices is challenging due to high-cardinality features and the complexity of constructing error sets.
Traditional robust representation learning methods largely focus on improving worst-group performance in supervised settings in computer vision.
Our approach utilizes an encoder-decoder model trained with Masked Language Modeling (MLM) loss to learn robust latent representations.
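A minimal sketch of such masked-reconstruction pretraining for tabular rows is below, assuming continuous features and using a squared-error stand-in for the MLM objective; the architecture, masking rate, and loss are illustrative assumptions, not the paper's model.

```python
# Minimal sketch: randomly mask feature values, encode the corrupted row, and
# train the decoder to reconstruct the masked cells (an MLM-style objective).
# Sizes, masking rate, and the regression loss are illustrative assumptions.
import torch
import torch.nn as nn

class TabularMaskedAE(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden))
        self.decoder = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_features))

    def forward(self, x, mask_rate=0.3):
        mask = torch.rand_like(x) < mask_rate    # cells to hide
        z = self.encoder(x.masked_fill(mask, 0.0))  # latent code from corrupted row
        recon = self.decoder(z)
        loss = ((recon - x)[mask] ** 2).mean()   # reconstruct only the masked cells
        return z, loss
```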
arXiv Detail & Related papers (2024-10-11T04:23:56Z)
- What's the score? Automated Denoising Score Matching for Nonlinear Diffusions [25.062104976775448]
Reversing a diffusion process by learning its score forms the heart of diffusion-based generative modeling.
We introduce a family of tractable denoising score matching objectives, called local-DSM.
We show how local-DSM melded with Taylor expansions enables automated training and score estimation with nonlinear diffusion processes.
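For orientation, the classical denoising score matching objective that this line of work builds on can be written as below; this is a textbook form, tractable when the transition kernel p_t(x_t | x_0) is Gaussian, and is not the paper's local-DSM objective, which targets nonlinear diffusions where that conditional score is not available in closed form.

```latex
\mathcal{L}_{\mathrm{DSM}}(\theta)
  = \mathbb{E}_{t,\;x_0,\;x_t \mid x_0}\!\left[
      \lambda(t)\,\bigl\lVert s_\theta(x_t, t)
        - \nabla_{x_t} \log p_t(x_t \mid x_0) \bigr\rVert^2
    \right]
```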
arXiv Detail & Related papers (2024-07-10T19:02:19Z)
- Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? [60.50127555651554]
Large Language Models (LLMs) show impressive results in numerous practical applications, but they lack essential safety features.
This makes them vulnerable to manipulations such as indirect prompt injections and generally unsuitable for safety-critical tasks.
We introduce a formal measure for instruction-data separation and an empirical variant that is calculable from a model's outputs.
arXiv Detail & Related papers (2024-03-11T15:48:56Z)
- Root Causing Prediction Anomalies Using Explainable AI [3.970146574042422]
We present a novel application of explainable AI (XAI) for root-causing performance degradation in machine learning models.
A single feature corruption can cause cascading feature, label and concept drifts.
We have successfully applied this technique to improve the reliability of models used in personalized advertising.
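One hedged way to make that concrete (a generic XAI recipe, not the paper's exact procedure): compare per-feature attribution magnitudes, e.g., SHAP values, between a healthy baseline window and the anomalous window, and flag the features whose influence shifted the most.

```python
# Hedged sketch: root-cause a prediction anomaly by ranking features whose
# attribution magnitude changed most between a baseline and an anomalous window.
# The attribution source and the top-k cutoff are assumptions for illustration.
import numpy as np

def suspect_features(baseline_attr, anomaly_attr, feature_names, top_k=3):
    """baseline_attr, anomaly_attr: (n_samples, n_features) attribution matrices."""
    base_mean = np.abs(baseline_attr).mean(axis=0)
    anom_mean = np.abs(anomaly_attr).mean(axis=0)
    shift = np.abs(anom_mean - base_mean)        # change in each feature's influence
    ranked = np.argsort(shift)[::-1][:top_k]     # largest shifts first
    return [(feature_names[i], float(shift[i])) for i in ranked]
```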
arXiv Detail & Related papers (2024-03-04T19:38:50Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
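The manifold intuition can be stated in a couple of lines: an autoencoder fit only to normal data learns their low-dimensional support, so reconstruction error acts as a distance-to-manifold anomaly score. The snippet below is that generic recipe, not the paper's specific self-supervised training regime.

```python
# Generic manifold-based anomaly score: inputs far from the learned manifold of
# normal samples reconstruct poorly. `autoencoder` is any callable mapping
# (n, d) inputs to (n, d) reconstructions; an assumption for illustration.
import numpy as np

def anomaly_scores(autoencoder, x):
    recon = autoencoder(x)
    return np.mean((x - recon) ** 2, axis=1)   # higher = farther off-manifold
```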
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- Understanding the Failure Modes of Out-of-Distribution Generalization [35.00563456450452]
Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training time.
In this work, we identify the fundamental factors that give rise to this behavior, by explaining why models fail this way even in easy-to-learn tasks.
arXiv Detail & Related papers (2020-10-29T17:19:03Z)
- BREEDS: Benchmarks for Subpopulation Shift [98.90314444545204]
We develop a methodology for assessing the robustness of models to subpopulation shift.
We leverage the class structure underlying existing datasets to control the data subpopulations that comprise the training and test distributions.
Applying this methodology to the ImageNet dataset, we create a suite of subpopulation shift benchmarks of varying granularity.
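A minimal sketch of that recipe follows, with a made-up two-superclass hierarchy standing in for the actual ImageNet hierarchy and the BREEDS toolkit: disjoint subclasses of each superclass are assigned to the source (training) and target (test) distributions.

```python
# Sketch of a subpopulation-shift split: a model is trained to predict the
# superclass on `source` subclasses and evaluated on held-out `target`
# subclasses of the same superclasses. The hierarchy is illustrative only.
import random

hierarchy = {
    "dog":    ["beagle", "husky", "poodle", "terrier"],
    "feline": ["tabby", "siamese", "lion", "tiger"],
}

def subpopulation_split(hierarchy, seed=0):
    rng = random.Random(seed)
    source, target = {}, {}
    for superclass, subclasses in hierarchy.items():
        subs = subclasses[:]
        rng.shuffle(subs)
        half = len(subs) // 2
        source[superclass] = subs[:half]   # subpopulations seen during training
        target[superclass] = subs[half:]   # unseen subpopulations at test time
    return source, target

source, target = subpopulation_split(hierarchy)
```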
arXiv Detail & Related papers (2020-08-11T17:04:47Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.