Assessing Visually-Continuous Corruption Robustness of Neural Networks
Relative to Human Performance
- URL: http://arxiv.org/abs/2402.19401v1
- Date: Thu, 29 Feb 2024 18:00:27 GMT
- Title: Assessing Visually-Continuous Corruption Robustness of Neural Networks
Relative to Human Performance
- Authors: Huakun Shen and Boyue Caroline Hu and Krzysztof Czarnecki and Lina
Marsso and Marsha Chechik
- Abstract summary: Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, yet they often lack robustness against image corruption, i.e., corruption robustness. We propose visually-continuous corruption robustness (VCR) to assess robustness over the wide, continuous range of image changes that corresponds to human perceptive quality.
- Score: 6.254768374567899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Neural Networks (NNs) have surpassed human accuracy in image
classification on ImageNet, they often lack robustness against image
corruption, i.e., corruption robustness. Yet such robustness is seemingly
effortless for human perception. In this paper, we propose visually-continuous
corruption robustness (VCR) -- an extension of corruption robustness to allow
assessing it over the wide and continuous range of changes that correspond to
the human perceptive quality (i.e., from the original image to the full
distortion of all perceived visual information), along with two novel
human-aware metrics for NN evaluation. To compare VCR of NNs with human
perception, we conducted extensive experiments on 14 commonly used image
corruptions with 7,718 human participants and state-of-the-art robust NN models
with different training objectives (e.g., standard, adversarial, corruption
robustness), different architectures (e.g., convolution NNs, vision
transformers), and different amounts of training data augmentation. Our study
showed that: 1) assessing robustness against continuous corruption can reveal
insufficient robustness undetected by existing benchmarks; as a result, 2) the
gap between NN and human robustness is larger than previously known; and
finally, 3) some image corruptions have a similar impact on human perception,
offering opportunities for more cost-effective robustness assessments. Our
validation set with 14 image corruptions, human robustness data, and the
evaluation code are provided as a toolbox and a benchmark.
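
For concreteness, here is a minimal sketch of evaluating accuracy over a continuous corruption-severity range, in the spirit of VCR. The Gaussian-blur corruption, the severity grid, and the `model`/`images`/`labels` placeholders are illustrative assumptions, not the paper's toolbox.

```python
# Sketch: accuracy over a continuous corruption-severity range (assumed
# Gaussian blur); `model`, `images`, `labels` are hypothetical placeholders.
import numpy as np
import torch
import torchvision.transforms.functional as TF

def accuracy_under_blur(model, images, labels, sigmas):
    """Accuracy at each blur severity; `images` is an (N, C, H, W) tensor."""
    model.eval()
    accs = []
    with torch.no_grad():
        for sigma in sigmas:
            if sigma == 0:
                corrupted = images
            else:
                k = 2 * int(round(3 * sigma)) + 1   # odd kernel, ~3 sigma wide
                corrupted = TF.gaussian_blur(images, kernel_size=k, sigma=sigma)
            preds = model(corrupted).argmax(dim=1)
            accs.append((preds == labels).float().mean().item())
    return np.array(accs)

# Sweep a fine grid instead of a few discrete severity levels, then summarize
# the whole curve, e.g. by its normalized area:
sigmas = np.linspace(0.0, 10.0, 41)
# accs = accuracy_under_blur(model, images, labels, sigmas)
# auc = np.trapz(accs, sigmas) / (sigmas[-1] - sigmas[0])
```

Evaluating the full curve is what lets insufficient robustness between the discrete severities of existing benchmarks show up.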
Related papers
- Frequency-Based Vulnerability Analysis of Deep Learning Models against Image Corruptions [48.34142457385199] (arXiv, 2023-06-12)
We present MUFIA, an algorithm designed to identify the specific types of corruptions that can cause models to fail.
We find that even state-of-the-art models trained to be robust against known common corruptions struggle against the low visibility-based corruptions crafted by MUFIA.
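
As a hedged illustration of frequency-domain corruptions (a simple band filter, not the MUFIA search algorithm itself), one can attenuate spatial-frequency bands of an image and check where the model's predictions break:

```python
# Sketch: keep only a band of spatial frequencies; sweeping (low, high) and
# measuring accuracy gives a crude frequency-vulnerability profile.
import numpy as np

def bandpass_corrupt(image, low, high):
    """Zero frequencies with radius outside [low, high); radius is in
    cycles/pixel, so 0.5 is the maximum. `image`: (H, W) or (H, W, C) float."""
    H, W = image.shape[:2]
    fy = np.fft.fftfreq(H)[:, None]
    fx = np.fft.fftfreq(W)[None, :]
    radius = np.hypot(fy, fx)
    mask = (radius >= low) & (radius < high)
    channels = image.reshape(H, W, -1)
    out = np.empty_like(channels)
    for c in range(channels.shape[2]):
        spec = np.fft.fft2(channels[..., c])
        out[..., c] = np.real(np.fft.ifft2(spec * mask))
    return out.reshape(image.shape).clip(0.0, 1.0)

# Example: a low-pass version, i.e. a "low visibility" style corruption.
# corrupted = bandpass_corrupt(img, low=0.0, high=0.05)
```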
- Investigating the Corruption Robustness of Image Classifiers with Random Lp-norm Corruptions [3.1337872355726084] (arXiv, 2023-05-09)
This study investigates the use of random Lp-norm corruptions to augment the training and test data of image classifiers.
We find that training data augmentation with a combination of Lp-norm corruptions significantly improves corruption robustness, even on top of state-of-the-art data augmentation schemes.
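
A minimal sketch of one such corruption, assuming the noise is drawn from a Gaussian and rescaled to a fixed Lp norm; the paper's exact sampling scheme may differ.

```python
# Sketch: add random noise rescaled to a fixed Lp norm `eps`; `p` and `eps`
# are illustrative choices, not the paper's settings.
import numpy as np

def random_lp_corruption(image, p=2.0, eps=5.0, rng=None):
    """`image`: float array in [0, 1]; returns a corrupted copy."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(image.shape)
    if np.isinf(p):
        noise = np.sign(noise) * eps              # L-inf: +/- eps everywhere
    else:
        noise *= eps / np.sum(np.abs(noise) ** p) ** (1.0 / p)
    return np.clip(image + noise, 0.0, 1.0)
```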
- A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking [54.89987482509155] (arXiv, 2023-02-28)
The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts.
We establish a comprehensive robustness benchmark called ARES-Bench on the image classification task.
By designing the training settings accordingly, we achieve new state-of-the-art adversarial robustness.
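
For context, below is a standard PGD adversarial-training step, the kind of training such benchmarks evaluate; ARES-Bench's actual settings (attack strength, schedule, architecture) are not reproduced here.

```python
# Sketch: one adversarial-training step with a PGD inner loop.
import torch
import torch.nn.functional as F

def pgd_adv_step(model, x, y, optimizer, eps=8/255, alpha=2/255, steps=10):
    """Inner loop maximizes the loss over a perturbation delta with
    ||delta||_inf <= eps; outer step trains on the adversarial batch."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):                        # inner maximization
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
    optimizer.zero_grad()                         # outer minimization
    loss = F.cross_entropy(model(x + delta.detach()), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```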
- Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology [11.398235052118608] (arXiv, 2022-06-30)
This benchmark is established to evaluate how deep neural networks perform on corrupted pathology images.
Two classification metrics and one ranking metric are designed to evaluate prediction and confidence performance under corruption.
- On the Robustness of Quality Measures for GANs [136.18799984346248] (arXiv, 2022-01-31)
This work evaluates the robustness of quality measures of generative models, such as the Inception Score (IS) and the Fréchet Inception Distance (FID).
We show that such metrics can also be manipulated by additive pixel perturbations.
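
To make the manipulated quantity concrete, here is a minimal FID computation on feature matrices. In practice the features come from an InceptionV3 network; the (N, D) arrays here are placeholders.

```python
# Sketch: FID between two sets of feature vectors (rows are samples).
import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(c1 @ c2).real          # matrix square root
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2 * covmean))
```

Since the score depends only on these feature statistics, small additive pixel perturbations that shift the extracted features can push it in either direction, which is the fragility the paper demonstrates.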
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193] (arXiv, 2021-08-16)
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilated architecture is expected to preserve the standard performance of the backbone CNN.
- Using the Overlapping Score to Improve Corruption Benchmarks [6.445605125467574] (arXiv, 2021-05-26)
We propose a metric called corruption overlapping score, which can be used to reveal flaws in corruption benchmarks.
We argue that taking into account overlaps between corruptions can help improve existing benchmarks or build better ones.
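
The paper defines its own overlapping score; as a hedged proxy for the idea, one can measure how similarly two corruptions make a model fail, e.g. via the Jaccard similarity of the misclassified sets:

```python
# Sketch: overlap between two corruptions as Jaccard similarity of the
# sets of samples each one causes the model to misclassify.
import numpy as np

def error_overlap(preds_a, preds_b, labels):
    """`preds_a`/`preds_b`: model predictions under corruption A and B."""
    err_a = preds_a != labels
    err_b = preds_b != labels
    union = np.logical_or(err_a, err_b).sum()
    if union == 0:
        return 1.0                    # no errors under either corruption
    return np.logical_and(err_a, err_b).sum() / union
```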
- Improving robustness against common corruptions with frequency biased models [112.65717928060195] (arXiv, 2021-03-30)
Unseen image corruptions can cause a surprisingly large drop in performance.
Image corruption types have different characteristics in the frequency spectrum and would benefit from a targeted type of data augmentation.
We propose a new regularization scheme that minimizes the total variation (TV) of convolution feature-maps to increase high-frequency robustness.
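
A minimal sketch of a total-variation penalty on convolution feature maps; the weighting and the layers it is applied to are the paper's design choices and are not reproduced here.

```python
# Sketch: total variation of an (N, C, H, W) feature map, to be added to
# the task loss as a regularizer that suppresses high-frequency structure.
import torch

def feature_tv(fmap):
    dh = (fmap[:, :, 1:, :] - fmap[:, :, :-1, :]).abs().mean()
    dw = (fmap[:, :, :, 1:] - fmap[:, :, :, :-1]).abs().mean()
    return dh + dw

# In training:  loss = task_loss + lambda_tv * feature_tv(features)
```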
- On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness [78.6626755563546] (arXiv, 2021-02-22)
Several new data augmentations have been proposed that significantly improve performance on ImageNet-C.
We develop a new measure of the distance between augmentations and corruptions, called the Minimal Sample Distance, and demonstrate a strong correlation between similarity and performance.
We observe a significant degradation in corruption robustness when the test-time corruptions are sampled to be perceptually dissimilar from ImageNet-C.
Our results suggest that test error can be improved by training on perceptually similar augmentations, and data augmentations may not generalize well beyond the existing benchmark.
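
As a heavily hedged sketch (the paper's exact definition is not reproduced here), a minimal-sample-distance style quantity can be read as the distance from a corruption's mean feature to the nearest augmented sample in some feature space:

```python
# Sketch: nearest-augmented-sample distance to a corruption's feature
# centroid; the feature extractor is assumed, not specified by this sketch.
import numpy as np

def minimal_sample_distance(aug_feats, corr_feats):
    """`aug_feats`: (N, D) features of augmented images;
    `corr_feats`: (M, D) features of images under one corruption."""
    target = corr_feats.mean(axis=0)              # corruption centroid
    dists = np.linalg.norm(aug_feats - target, axis=1)
    return float(dists.min())                     # nearest augmented sample
```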
- A simple way to make neural networks robust against diverse image corruptions [29.225922892332342] (arXiv, 2020-01-16)
We show that a simple but properly tuned training with additive Gaussian and speckle noise generalizes surprisingly well to unseen corruptions.
Adversarial training of the recognition model against uncorrelated worst-case noise leads to an additional increase in performance.
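
A minimal sketch of the augmentation described above, choosing between additive Gaussian and multiplicative speckle noise; the noise scale is illustrative, whereas the paper stresses that tuning it properly matters.

```python
# Sketch: with equal probability, add Gaussian noise or apply speckle
# (signal-dependent) noise to a float image in [0, 1].
import numpy as np

def noise_augment(image, sigma=0.1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(image.shape) * sigma
    if rng.random() < 0.5:
        out = image + noise           # additive Gaussian noise
    else:
        out = image + image * noise   # speckle noise
    return np.clip(out, 0.0, 1.0)
```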