PRIME: Prioritizing Interpretability in Failure Mode Extraction
- URL: http://arxiv.org/abs/2310.00164v2
- Date: Thu, 14 Mar 2024 04:29:54 GMT
- Title: PRIME: Prioritizing Interpretability in Failure Mode Extraction
- Authors: Keivan Rezaei, Mehrdad Saberi, Mazda Moayeri, Soheil Feizi
- Abstract summary: We study the challenge of providing human-understandable descriptions for failure modes in trained image classification models.
We propose a novel approach that prioritizes interpretability in this problem.
Our method successfully identifies failure modes and generates high-quality text descriptions associated with them.
- Score: 49.93565079216376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we study the challenge of providing human-understandable descriptions for failure modes in trained image classification models. Existing works address this problem by first identifying clusters (or directions) of incorrectly classified samples in a latent space and then aiming to provide human-understandable text descriptions for them. We observe that in some cases, describing text does not match well with identified failure modes, partially owing to the fact that shared interpretable attributes of failure modes may not be captured using clustering in the feature space. To improve on these shortcomings, we propose a novel approach that prioritizes interpretability in this problem: we start by obtaining human-understandable concepts (tags) of images in the dataset and then analyze the model's behavior based on the presence or absence of combinations of these tags. Our method also ensures that the tags describing a failure mode form a minimal set, avoiding redundant and noisy descriptions. Through several experiments on different datasets, we show that our method successfully identifies failure modes and generates high-quality text descriptions associated with them. These results highlight the importance of prioritizing interpretability in understanding model failures.
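The tag-combination analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes per-image tags have already been produced (e.g., by an off-the-shelf tagging model), and the support threshold, error-rate gap, and subset-based minimality check below are illustrative choices.

```python
# Minimal sketch of tag-combination failure-mode analysis (illustrative only,
# not the PRIME implementation). Assumes each image already carries a set of
# human-understandable tags and a flag saying whether the classifier was correct.
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

def failure_modes(
    tags_per_image: List[FrozenSet[str]],  # tags present in each image
    correct: List[bool],                   # was the prediction correct?
    max_tags: int = 3,                     # largest tag combination to test
    min_support: int = 30,                 # minimum images matching a combination
    error_gap: float = 0.25,               # required error-rate increase over baseline
) -> Dict[FrozenSet[str], float]:
    baseline = 1.0 - sum(correct) / len(correct)
    counts: Dict[FrozenSet[str], Tuple[int, int]] = {}  # combo -> (errors, matches)
    for tags, ok in zip(tags_per_image, correct):
        for k in range(1, max_tags + 1):
            for combo in combinations(sorted(tags), k):
                errs, n = counts.get(frozenset(combo), (0, 0))
                counts[frozenset(combo)] = (errs + (not ok), n + 1)
    # Keep combinations with enough support and an error rate well above the baseline.
    flagged = {
        combo: errs / n
        for combo, (errs, n) in counts.items()
        if n >= min_support and errs / n >= baseline + error_gap
    }
    # Prefer minimal sets: drop a combination if one of its subsets is already flagged.
    return {
        combo: rate
        for combo, rate in flagged.items()
        if not any(sub != combo and sub.issubset(combo) for sub in flagged)
    }
```

Run per class on a held-out set, a surviving combination with a high error rate (say, {"snow", "fog"} for an outdoor class) would be read as one candidate failure mode described by those tags.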
Related papers
- What could go wrong? Discovering and describing failure modes in computer vision [27.6114923305978]
We formalize the problem of Language-Based Error Explainability (LBEE).
We propose solutions that operate in a joint vision-and-language embedding space.
We show that the proposed methodology isolates nontrivial sentences associated with specific error causes.
arXiv Detail & Related papers (2024-08-08T14:01:12Z) - Progressive Alignment with VLM-LLM Feature to Augment Defect Classification for the ASE Dataset [7.1083241462091165]
Traditional defect classification approaches face two barriers: insufficient training data and unstable data quality.
We propose a special dataset for defect classification that includes rich data descriptions recorded on the images, although the defect features are difficult to learn directly.
arXiv Detail & Related papers (2024-04-08T04:17:27Z) - Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice (a simplified token-statistics sketch appears after this list).
arXiv Detail & Related papers (2023-11-18T00:24:26Z) - Conjunct Resolution in the Face of Verbal Omissions [51.220650412095665]
We propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
arXiv Detail & Related papers (2023-05-26T08:44:02Z) - Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z) - Label-Descriptive Patterns and their Application to Characterizing Classification Errors [31.272875287136426]
State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless.
Characterizing these errors in easily interpretable terms gives insight into whether a model is prone to making systematic errors, and also suggests how to act on and improve the model.
In this paper, we propose a method that does so for arbitrary classifiers by mining a small set of patterns that together succinctly describe the input data, partitioned according to prediction correctness.
arXiv Detail & Related papers (2021-10-18T19:42:21Z) - Semi-supervised Semantic Segmentation with Directional Context-aware Consistency [66.49995436833667]
We focus on the semi-supervised segmentation problem where only a small set of labeled data is provided with a much larger collection of totally unlabeled images.
A preferred high-level representation should capture the contextual information while not losing self-awareness.
We present the Directional Contrastive Loss (DC Loss) to enforce consistency in a pixel-to-pixel manner.
arXiv Detail & Related papers (2021-06-27T03:42:40Z) - Understanding Failures of Deep Networks via Robust Feature Extraction [44.204907883776045]
We introduce and study a method aimed at characterizing and explaining failures by identifying visual attributes whose presence or absence results in poor performance.
We leverage the representation of a separate robust model to extract interpretable features and then harness these features to identify failure modes.
arXiv Detail & Related papers (2020-12-03T08:33:29Z) - Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use it to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)
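The token-pattern idea behind Premise (the two pattern-mining papers listed above) can be approximated by a much simpler frequency contrast, sketched below. This is only an illustration under assumed inputs (pre-tokenized texts and per-example correctness flags); the actual method mines succinct sets of label-descriptive patterns rather than ranking individual tokens.

```python
# Simplified illustration of characterizing classification errors with token
# statistics (inspired by the Premise papers above, not their pattern-mining
# algorithm). Ranks single tokens by how strongly they are associated with
# misclassified inputs.
from collections import Counter
from typing import List, Tuple

def error_associated_tokens(
    tokenized_inputs: List[List[str]],  # tokens of each classified input
    correct: List[bool],                # was the prediction correct?
    min_count: int = 20,                # ignore rare tokens
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    in_errors, in_correct = Counter(), Counter()
    for tokens, ok in zip(tokenized_inputs, correct):
        (in_correct if ok else in_errors).update(set(tokens))
    n_err = max(1, sum(not c for c in correct))
    n_ok = max(1, sum(correct))
    # Score = how much more often the token occurs in errors than in correct cases.
    scores = {
        tok: in_errors[tok] / n_err - in_correct[tok] / n_ok
        for tok in in_errors
        if in_errors[tok] + in_correct[tok] >= min_count
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```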