Finding and Fixing Spurious Patterns with Explanations
- URL: http://arxiv.org/abs/2106.02112v1
- Date: Thu, 3 Jun 2021 20:07:46 GMT
- Title: Finding and Fixing Spurious Patterns with Explanations
- Authors: Gregory Plumb, Marco Tulio Ribeiro, Ameet Talwalkar
- Abstract summary: We present an end-to-end pipeline for identifying and mitigating spurious patterns for image classifiers.
We start by finding patterns such as "the model's prediction for tennis racket changes 63% of the time if we hide the people."
Then, if a pattern is spurious, we mitigate it via a novel form of data augmentation.
- Score: 14.591545536354621
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models often use spurious patterns such as "relying on the
presence of a person to detect a tennis racket," which do not generalize. In
this work, we present an end-to-end pipeline for identifying and mitigating
spurious patterns for image classifiers. We start by finding patterns such as
"the model's prediction for tennis racket changes 63% of the time if we hide
the people." Then, if a pattern is spurious, we mitigate it via a novel form of
data augmentation. We demonstrate that this approach identifies a diverse set
of spurious patterns and that it mitigates them by producing a model that is
both more accurate on a distribution where the spurious pattern is not helpful
and more robust to distribution shift.
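The detection step described in the abstract, masking a suspected object and measuring how often the model's prediction changes, can be sketched as follows. This is a minimal hypothetical illustration, not the paper's actual pipeline: the toy model, flattened images, and mask format are all stand-ins.

```python
# Hypothetical sketch: measure how often a classifier's prediction flips
# when a suspected spurious object (e.g. the people) is masked out.

def flip_rate(model, images, masks, mask_value=0):
    """Fraction of images whose predicted label changes after masking."""
    flips = 0
    for image, mask in zip(images, masks):
        # Replace every pixel covered by the mask with a neutral value.
        masked = [px if not m else mask_value for px, m in zip(image, mask)]
        if model(image) != model(masked):
            flips += 1
    return flips / len(images)

# Toy stand-in model: predicts "tennis racket" only if a "person" pixel
# (encoded here as the value 9) is present anywhere in the image.
toy_model = lambda img: "tennis racket" if 9 in img else "background"

images = [[1, 9, 3], [2, 9, 4], [5, 6, 7]]  # flattened toy images
masks  = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]  # 1 marks "person" pixels

print(flip_rate(toy_model, images, masks))  # 2 of 3 predictions flip
```

A high flip rate like the 63% quoted above is only evidence of a pattern; deciding whether that pattern is spurious still requires human judgment before the mitigation step.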
Related papers
- Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data.
We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z)
- On the Limitations of Model Stealing with Uncertainty Quantification Models [5.389383754665319]
Model stealing aims at inferring a victim model's functionality at a fraction of the original training cost.
In practice, the model's architecture, weight dimensions, and original training data cannot be determined exactly.
We generate multiple possible networks and combine their predictions to improve the quality of the stolen model.
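The approach summarized above, generating multiple candidate networks and combining their predictions, can be illustrated minimally by score averaging. The candidate models here are trivial stand-in functions, not actual stolen networks, and averaging is only one plausible combination rule.

```python
# Hypothetical sketch: combine per-class score vectors from several
# candidate "stolen" models by averaging them position-wise.

def combine_predictions(models, x):
    """Average the class-score vectors produced by each candidate network."""
    scores = [m(x) for m in models]
    n = len(scores)
    return [sum(col) / n for col in zip(*scores)]

# Three stand-in candidates, each returning scores for two classes.
candidates = [
    lambda x: [0.9, 0.1],
    lambda x: [0.6, 0.4],
    lambda x: [0.7, 0.3],
]

print(combine_predictions(candidates, x=None))  # averaged class scores
```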
arXiv Detail & Related papers (2023-05-09T09:31:04Z)
- DiffPattern: Layout Pattern Generation via Discrete Diffusion [16.148506119712735]
We propose DiffPattern to generate reliable layout patterns.
Our experiments on several benchmark settings show that DiffPattern significantly outperforms existing baselines.
arXiv Detail & Related papers (2023-03-23T06:16:14Z)
- Right for the Right Latent Factors: Debiasing Generative Models via Disentanglement [20.41752850243945]
A key assumption of most statistical machine learning methods is that they have access to independent samples from the distribution of data they will encounter at test time.
In particular, machine learning models have been shown to exhibit Clever-Hans-like behaviour, meaning that spurious correlations in the training set are inadvertently learnt.
We propose to debias generative models by disentangling their internal representations, which is achieved via human feedback.
arXiv Detail & Related papers (2022-02-01T13:16:18Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- Generative Models as Distributions of Functions [72.2682083758999]
Generative models are typically trained on grid-like data such as images.
In this paper, we abandon discretized grids and instead parameterize individual data points by continuous functions.
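The core idea, trading a fixed grid for a continuous function of coordinates, can be illustrated with a deliberately tiny example: fitting a 1-D signal with a least-squares line that can then be queried at off-grid coordinates. This sketches only the continuous parameterization, not the paper's generative model, and the linear basis is an assumption chosen for brevity.

```python
# Hypothetical sketch: represent a "data point" (a 1-D signal) not as a
# grid of values but as a continuous function of its coordinate.

def fit_line(ts, ys):
    """Least-squares fit of y = a + b*t to the sampled signal."""
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    b = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) \
        / sum((t - mt) ** 2 for t in ts)
    a = my - b * mt
    return lambda t: a + b * t  # continuous: can be queried off-grid

ts = [0.0, 0.25, 0.5, 0.75, 1.0]  # sample locations on the original "grid"
ys = [1.0, 1.5, 2.0, 2.5, 3.0]    # sampled values (exactly linear here)
f = fit_line(ts, ys)
print(f(0.6))  # query at a coordinate that was never on the grid
```

Because the data point is now a function, it can be evaluated at any resolution, which is what frees such models from a fixed discretization.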
arXiv Detail & Related papers (2021-02-09T11:47:55Z)
- Unsupervised Noisy Tracklet Person Re-identification [100.85530419892333]
We present a novel selective tracklet learning (STL) approach that can train discriminative person re-id models from unlabelled tracklet data.
This avoids the tedious and costly process of exhaustively labelling person image/tracklet true matching pairs across camera views.
Our method is notably more robust to arbitrary noise in raw tracklets and is therefore scalable to learning discriminative models from unconstrained tracking data.
arXiv Detail & Related papers (2021-01-16T07:31:00Z)
- When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models [59.46552488974247]
This paper addresses whether an is-a relationship holds between a word pair (x, y), with the help of large textual corpora.
Recent studies suggest that pattern-based methods are superior, provided that large-scale Hearst pairs are extracted and fed to the model so that the sparsity of unseen (x, y) pairs is relieved.
For the first time, this paper quantifies the non-negligible existence of those sparse cases. We also demonstrate that distributional methods are ideal for making up for pattern-based ones in such cases.
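For readers unfamiliar with Hearst patterns, a single such pattern can be illustrated with a toy regular expression. This is a minimal hypothetical example, not the paper's extraction system; it crudely singularizes the hypernym by stripping a trailing "s" and handles only one phrasing.

```python
import re

# One classic Hearst pattern, "Ys such as X", which suggests (X is-a Y).
# The trailing "s" before "such as" is stripped to singularize the hypernym.
PATTERN = re.compile(r"(\w+)s such as (\w+)")

def hearst_pairs(text):
    """Extract candidate (hyponym, hypernym) pairs from raw text."""
    return [(x, y) for y, x in PATTERN.findall(text)]

print(hearst_pairs("sports such as tennis are popular"))
```

Pattern-based methods of this kind only fire when x and y co-occur in such a phrasing, which is exactly the sparsity problem for unseen (x, y) pairs that the paper discusses.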
arXiv Detail & Related papers (2020-10-10T08:34:19Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.