Overinterpretation reveals image classification model pathologies
- URL: http://arxiv.org/abs/2003.08907v3
- Date: Tue, 7 Dec 2021 16:38:50 GMT
- Title: Overinterpretation reveals image classification model pathologies
- Authors: Brandon Carter, Siddhartha Jain, Jonas Mueller, David Gifford
- Abstract summary: High-scoring convolutional neural networks (CNNs) on popular benchmarks exhibit troubling pathologies that allow them to display high accuracy even in the absence of semantically salient features.
We demonstrate that neural networks trained on CIFAR-10 and ImageNet suffer from overinterpretation.
Although these patterns portend potential model fragility in real-world deployment, they are in fact valid statistical patterns of the benchmark that alone suffice to attain high test accuracy.
- Score: 15.950659318117694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image classifiers are typically scored on their test set accuracy, but high
accuracy can mask a subtle type of model failure. We find that high-scoring
convolutional neural networks (CNNs) on popular benchmarks exhibit troubling
pathologies that allow them to display high accuracy even in the absence of
semantically salient features. When a model provides a high-confidence decision
without salient supporting input features, we say the classifier has
overinterpreted its input, finding too much class-evidence in patterns that
appear nonsensical to humans. Here, we demonstrate that neural networks trained
on CIFAR-10 and ImageNet suffer from overinterpretation, and we find models on
CIFAR-10 make confident predictions even when 95% of input images are masked
and humans cannot discern salient features in the remaining pixel-subsets. We
introduce Batched Gradient SIS, a new method for discovering sufficient input
subsets for complex datasets, and use this method to show the sufficiency of
border pixels in ImageNet for training and testing. Although these patterns
portend potential model fragility in real-world deployment, they are in fact
valid statistical patterns of the benchmark that alone suffice to attain high
test accuracy. Unlike adversarial examples, overinterpretation relies upon
unmodified image pixels. We find ensembling and input dropout can each help
mitigate overinterpretation.
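To make the masked-input experiment concrete, here is a minimal sketch of a gradient-guided backward-selection probe in the spirit of Batched Gradient SIS: starting from the full image, repeatedly mask the least-salient pixels in small batches while the predicted class remains confident, so that only a small pixel subset survives. The sketch assumes a PyTorch classifier; the function and parameter names (`approx_sufficient_subset`, `confidence_threshold`, `batch_fraction`, `mask_value`) are illustrative assumptions and do not reproduce the authors' exact algorithm.

```python
import torch
import torch.nn.functional as F

def approx_sufficient_subset(model, image, target_class,
                             confidence_threshold=0.9,
                             batch_fraction=0.05,
                             mask_value=0.0):
    """Greedily mask the least-salient pixels (by gradient magnitude), in
    batches, while the model's confidence in `target_class` stays above
    `confidence_threshold`. Returns a boolean (H, W) mask of surviving pixels.

    image: tensor of shape (C, H, W), already preprocessed for `model`.
    mask_value: placeholder value for hidden pixels (an assumption).
    """
    model.eval()
    c, h, w = image.shape
    keep = torch.ones(h, w, dtype=torch.bool, device=image.device)
    pixels_per_step = max(1, int(batch_fraction * h * w))

    while keep.sum() > pixels_per_step:
        # Score the currently visible pixels with one forward/backward pass.
        masked = torch.where(keep, image, torch.full_like(image, mask_value))
        masked = masked.unsqueeze(0).clone().requires_grad_(True)
        probs = F.softmax(model(masked), dim=1)
        conf = probs[0, target_class]
        if conf.item() < confidence_threshold:
            break  # stop before confidence collapses
        conf.backward()  # parameter grads also accumulate; irrelevant for this probe

        # Per-pixel saliency: gradient magnitude summed over channels.
        saliency = masked.grad[0].abs().sum(dim=0)
        saliency[~keep] = float("inf")  # never re-select already-hidden pixels

        # Hide the next batch of least-salient visible pixels.
        drop_idx = torch.topk(saliency.flatten(), pixels_per_step, largest=False).indices
        keep.view(-1)[drop_idx] = False

    return keep  # surviving pixels approximate a sufficient input subset
```

If a CIFAR-10 model still reports high confidence when `keep.sum()` is only about 5% of the pixels, and the surviving pixels look meaningless to a human, that is the overinterpretation behaviour the abstract describes.

The abstract also reports that ensembling and input dropout each help mitigate overinterpretation. Below is a minimal sketch of one way to apply dropout directly to input pixels during training; the training loop and the rate `p=0.2` are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def train_step_with_input_dropout(model, optimizer, images, labels, p=0.2):
    """images: (N, C, H, W). Drop each spatial location with probability p
    (shared across channels) and rescale survivors by 1/(1-p), so the model
    cannot lean on any small, fixed pixel subset."""
    model.train()
    optimizer.zero_grad()
    n, _, h, w = images.shape
    keep = (torch.rand(n, 1, h, w, device=images.device) > p).float()
    dropped = images * keep / (1.0 - p)
    loss = F.cross_entropy(model(dropped), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```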
Related papers
- Match me if you can: Semi-Supervised Semantic Correspondence Learning with Unpaired Images [76.47980643420375]
This paper builds on the hypothesis that learning semantic correspondences is inherently data-hungry.
We demonstrate that a simple machine annotator can reliably enrich paired keypoints via machine supervision.
Our models surpass current state-of-the-art models on semantic correspondence learning benchmarks like SPair-71k, PF-PASCAL, and PF-WILLOW.
arXiv Detail & Related papers (2023-11-30T13:22:15Z)
- Hierarchical Uncertainty Estimation for Medical Image Segmentation Networks [1.9564356751775307]
Uncertainty exists in both images (noise) and manual annotations (human errors and bias) used for model training.
We propose a simple yet effective method for estimating uncertainties at multiple levels.
We demonstrate that a deep learning segmentation network such as U-Net can achieve high segmentation performance.
arXiv Detail & Related papers (2023-08-16T16:09:23Z)
- Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluating deep learning models is to build a labeled test set with attributes of interest and assess how well the model performs on it.
This paper argues that Zero-shot Model Diagnosis (ZOOM) is possible without the need for a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z)
- Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes the gradients semantically aware in order to synthesize plausible images.
We show that our method is also applicable to text-to-image generation by regarding image-text foundation models as generalized classifiers.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
- Robustifying Deep Vision Models Through Shape Sensitization [19.118696557797957]
We propose a simple, lightweight adversarial augmentation technique that explicitly incentivizes the network to learn holistic shapes.
Our augmentations superpose edge maps from one image onto another image with shuffled patches, using a randomly determined mixing proportion (see the sketch after this entry).
We show that our augmentations significantly improve classification accuracy and robustness measures on a range of datasets and neural architectures.
arXiv Detail & Related papers (2022-11-14T11:17:46Z)
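The sketch below illustrates the kind of augmentation described in the entry above: superpose the edge map of one image onto a patch-shuffled version of another, with a random mixing proportion. The edge detector (Sobel), patch size, and mixing range are assumptions for illustration, not that paper's exact recipe; H and W are assumed divisible by the patch size.

```python
import torch
import torch.nn.functional as F

def sobel_edges(img):
    """img: (C, H, W) in [0, 1]. Returns a (1, H, W) edge-magnitude map."""
    gray = img.mean(dim=0, keepdim=True).unsqueeze(0)            # (1, 1, H, W)
    kx = torch.tensor([[[[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]]])
    ky = kx.transpose(2, 3)                                      # vertical Sobel
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2).squeeze(0)              # (1, H, W)

def shuffle_patches(img, patch=8):
    """Randomly permute the non-overlapping patch x patch tiles of img (C, H, W)."""
    c, h, w = img.shape
    tiles = img.unfold(1, patch, patch).unfold(2, patch, patch)  # (C, H/p, W/p, p, p)
    tiles = tiles.reshape(c, -1, patch, patch)
    tiles = tiles[:, torch.randperm(tiles.shape[1])]             # shuffle tile order
    tiles = tiles.reshape(c, h // patch, w // patch, patch, patch)
    return tiles.permute(0, 1, 3, 2, 4).reshape(c, h, w)         # stitch back together

def shape_sensitizing_mix(img_a, img_b, patch=8):
    """Blend img_a's edge map into a patch-shuffled img_b with a random weight."""
    alpha = torch.rand(()).item()                                # random mixing proportion
    edges = sobel_edges(img_a).clamp(0.0, 1.0).expand_as(img_b)
    return alpha * edges + (1.0 - alpha) * shuffle_patches(img_b, patch)
```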
- Revisiting Sparse Convolutional Model for Visual Recognition [40.726494290922204]
This paper revisits sparse convolutional modeling for image classification.
We show that such models have equally strong empirical performance on CIFAR-10, CIFAR-100, and ImageNet datasets.
arXiv Detail & Related papers (2022-10-24T04:29:21Z)
- Robust Sensible Adversarial Learning of Deep Neural Networks for Image Classification [6.594522185216161]
We introduce sensible adversarial learning and demonstrate the synergistic effect between pursuing standard natural accuracy and robustness.
Specifically, we define a sensible adversary which is useful for learning a robust model while keeping high natural accuracy.
We propose a novel and efficient algorithm that trains a robust model using implicit loss truncation.
arXiv Detail & Related papers (2022-05-20T22:57:44Z)
- Understanding out-of-distribution accuracies through quantifying difficulty of test samples [10.266928164137635]
Existing works show that although modern neural networks achieve remarkable generalization performance on the in-distribution (ID) dataset, the accuracy drops significantly on the out-of-distribution (OOD) datasets.
We propose a new metric to quantify the difficulty of the test images (either ID or OOD) that depends on the interaction of the training dataset and the model.
arXiv Detail & Related papers (2022-03-28T21:13:41Z)
- Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve an auxiliary problem.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- RAIN: A Simple Approach for Robust and Accurate Image Classification Networks [156.09526491791772]
It has been shown that the majority of existing adversarial defense methods achieve robustness at the cost of sacrificing prediction accuracy.
This paper proposes a novel preprocessing framework, which we term Robust and Accurate Image classificatioN (RAIN).
RAIN applies randomization over inputs to break the ties between the model forward prediction path and the backward gradient path, thus improving the model robustness.
We conduct extensive experiments on the STL10 and ImageNet datasets to verify the effectiveness of RAIN against various types of adversarial attacks.
arXiv Detail & Related papers (2020-04-24T02:03:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.