Assessing Visually-Continuous Corruption Robustness of Neural Networks
Relative to Human Performance
- URL: http://arxiv.org/abs/2402.19401v1
- Date: Thu, 29 Feb 2024 18:00:27 GMT
- Title: Assessing Visually-Continuous Corruption Robustness of Neural Networks
Relative to Human Performance
- Authors: Huakun Shen and Boyue Caroline Hu and Krzysztof Czarnecki and Lina
Marsso and Marsha Chechik
- Abstract summary: Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, yet they often lack robustness against image corruption, i.e., corruption robustness. We propose visually-continuous corruption robustness (VCR) to assess robustness over the wide, continuous range of image changes that corresponds to human perceptive quality.
- Score: 6.254768374567899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Neural Networks (NNs) have surpassed human accuracy in image
classification on ImageNet, they often lack robustness against image
corruption, i.e., corruption robustness. Yet such robustness is seemingly
effortless for human perception. In this paper, we propose visually-continuous
corruption robustness (VCR) -- an extension of corruption robustness to allow
assessing it over the wide and continuous range of changes that correspond to
the human perceptive quality (i.e., from the original image to the full
distortion of all perceived visual information), along with two novel
human-aware metrics for NN evaluation. To compare VCR of NNs with human
perception, we conducted extensive experiments on 14 commonly used image
corruptions with 7,718 human participants and state-of-the-art robust NN models
with different training objectives (e.g., standard, adversarial, corruption
robustness), different architectures (e.g., convolution NNs, vision
transformers), and different amounts of training data augmentation. Our study
showed that: 1) assessing robustness against continuous corruption can reveal
insufficient robustness undetected by existing benchmarks; as a result, 2) the
gap between NN and human robustness is larger than previously known; and
finally, 3) some image corruptions have a similar impact on human perception,
offering opportunities for more cost-effective robustness assessments. Our
validation set with 14 image corruptions, human robustness data, and the
evaluation code are provided as a toolbox and a benchmark.
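
For concreteness, here is a minimal sketch of evaluating accuracy over a continuous corruption-severity range, in the spirit of VCR. The Gaussian-blur corruption, the severity grid, and the `model`/`images`/`labels` placeholders are illustrative assumptions, not the paper's toolbox.

```python
# Sketch: accuracy over a continuous corruption-severity range (assumed
# Gaussian blur); `model`, `images`, `labels` are hypothetical placeholders.
import numpy as np
import torch
import torchvision.transforms.functional as TF

def accuracy_under_blur(model, images, labels, sigmas):
    """Accuracy at each blur severity; `images` is an (N, C, H, W) tensor."""
    model.eval()
    accs = []
    with torch.no_grad():
        for sigma in sigmas:
            if sigma == 0:
                corrupted = images
            else:
                k = 2 * int(round(3 * sigma)) + 1   # odd kernel, ~3 sigma wide
                corrupted = TF.gaussian_blur(images, kernel_size=k, sigma=sigma)
            preds = model(corrupted).argmax(dim=1)
            accs.append((preds == labels).float().mean().item())
    return np.array(accs)

# Sweep a fine grid instead of a few discrete severity levels, then summarize
# the whole curve, e.g. by its normalized area:
sigmas = np.linspace(0.0, 10.0, 41)
# accs = accuracy_under_blur(model, images, labels, sigmas)
# auc = np.trapz(accs, sigmas) / (sigmas[-1] - sigmas[0])
```

Evaluating the full curve is what lets insufficient robustness between the discrete severities of existing benchmarks show up.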
Related papers
- Frequency-Based Vulnerability Analysis of Deep Learning Models against Image Corruptions [48.34142457385199] (arXiv, 2023-06-12)
We present MUFIA, an algorithm designed to identify the specific types of corruptions that can cause models to fail.
We find that even state-of-the-art models trained to be robust against known common corruptions struggle against the low visibility-based corruptions crafted by MUFIA.
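
As a hedged illustration of frequency-domain corruptions (a simple band filter, not the MUFIA search algorithm itself), one can attenuate spatial-frequency bands of an image and check where the model's predictions break:

```python
# Sketch: keep only a band of spatial frequencies; sweeping (low, high) and
# measuring accuracy gives a crude frequency-vulnerability profile.
import numpy as np

def bandpass_corrupt(image, low, high):
    """Zero frequencies with radius outside [low, high); radius is in
    cycles/pixel, so 0.5 is the maximum. `image`: (H, W) or (H, W, C) float."""
    H, W = image.shape[:2]
    fy = np.fft.fftfreq(H)[:, None]
    fx = np.fft.fftfreq(W)[None, :]
    radius = np.hypot(fy, fx)
    mask = (radius >= low) & (radius < high)
    channels = image.reshape(H, W, -1)
    out = np.empty_like(channels)
    for c in range(channels.shape[2]):
        spec = np.fft.fft2(channels[..., c])
        out[..., c] = np.real(np.fft.ifft2(spec * mask))
    return out.reshape(image.shape).clip(0.0, 1.0)

# Example: a low-pass version, i.e. a "low visibility" style corruption.
# corrupted = bandpass_corrupt(img, low=0.0, high=0.05)
```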
- Investigating the Corruption Robustness of Image Classifiers with Random Lp-norm Corruptions [3.1337872355726084] (arXiv, 2023-05-09)
This study investigates the use of random Lp-norm corruptions to augment the training and test data of image classifiers.
We find that training data augmentation with a combination of Lp-norm corruptions significantly improves corruption robustness, even on top of state-of-the-art data augmentation schemes.
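
A minimal sketch of one such corruption, assuming the noise is drawn from a Gaussian and rescaled to a fixed Lp norm; the paper's exact sampling scheme may differ.

```python
# Sketch: add random noise rescaled to a fixed Lp norm `eps`; `p` and `eps`
# are illustrative choices, not the paper's settings.
import numpy as np

def random_lp_corruption(image, p=2.0, eps=5.0, rng=None):
    """`image`: float array in [0, 1]; returns a corrupted copy."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(image.shape)
    if np.isinf(p):
        noise = np.sign(noise) * eps              # L-inf: +/- eps everywhere
    else:
        noise *= eps / np.sum(np.abs(noise) ** p) ** (1.0 / p)
    return np.clip(image + noise, 0.0, 1.0)
```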
- A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking [54.89987482509155] (arXiv, 2023-02-28)
The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts.
We establish a comprehensive robustness benchmark called ARES-Bench on the image classification task.
By designing the training settings accordingly, we achieve new state-of-the-art adversarial robustness.
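
For context, below is a standard PGD adversarial-training step, the kind of training such benchmarks evaluate; ARES-Bench's actual settings (attack strength, schedule, architecture) are not reproduced here.

```python
# Sketch: one adversarial-training step with a PGD inner loop.
import torch
import torch.nn.functional as F

def pgd_adv_step(model, x, y, optimizer, eps=8/255, alpha=2/255, steps=10):
    """Inner loop maximizes the loss over a perturbation delta with
    ||delta||_inf <= eps; outer step trains on the adversarial batch."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):                        # inner maximization
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
    optimizer.zero_grad()                         # outer minimization
    loss = F.cross_entropy(model(x + delta.detach()), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```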
- Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology [11.398235052118608] (arXiv, 2022-06-30)
This benchmark is established to evaluate how deep neural networks perform on corrupted pathology images.
Two classification metrics and one ranking metric are designed to evaluate prediction and confidence performance under corruption.
- On the Robustness of Quality Measures for GANs [136.18799984346248] (arXiv, 2022-01-31)
This work evaluates the robustness of quality measures of generative models, such as the Inception Score (IS) and the Fréchet Inception Distance (FID).
We show that such metrics can also be manipulated by additive pixel perturbations.
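
To make the manipulated quantity concrete, here is a minimal FID computation on feature matrices. In practice the features come from an InceptionV3 network; the (N, D) arrays here are placeholders.

```python
# Sketch: FID between two sets of feature vectors (rows are samples).
import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(c1 @ c2).real          # matrix square root
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2 * covmean))
```

Since the score depends only on these feature statistics, small additive pixel perturbations that shift the extracted features can push it in either direction, which is the fragility the paper demonstrates.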
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193] (arXiv, 2021-08-16)
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilated architecture is expected to preserve the standard performance of the backbone CNN.
- Using the Overlapping Score to Improve Corruption Benchmarks [6.445605125467574] (arXiv, 2021-05-26)
We propose a metric called corruption overlapping score, which can be used to reveal flaws in corruption benchmarks.
We argue that taking into account overlaps between corruptions can help improve existing benchmarks or build better ones.
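
The paper defines its own overlapping score; as a hedged proxy for the idea, one can measure how similarly two corruptions make a model fail, e.g. via the Jaccard similarity of the misclassified sets:

```python
# Sketch: overlap between two corruptions as Jaccard similarity of the
# sets of samples each one causes the model to misclassify.
import numpy as np

def error_overlap(preds_a, preds_b, labels):
    """`preds_a`/`preds_b`: model predictions under corruption A and B."""
    err_a = preds_a != labels
    err_b = preds_b != labels
    union = np.logical_or(err_a, err_b).sum()
    if union == 0:
        return 1.0                    # no errors under either corruption
    return np.logical_and(err_a, err_b).sum() / union
```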
- Improving robustness against common corruptions with frequency biased models [112.65717928060195] (arXiv, 2021-03-30)
Unseen image corruptions can cause a surprisingly large drop in performance.
Image corruption types have different characteristics in the frequency spectrum and would benefit from a targeted type of data augmentation.
We propose a new regularization scheme that minimizes the total variation (TV) of convolution feature-maps to increase high-frequency robustness.
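
A minimal sketch of a total-variation penalty on convolution feature maps; the weighting and the layers it is applied to are the paper's design choices and are not reproduced here.

```python
# Sketch: total variation of an (N, C, H, W) feature map, to be added to
# the task loss as a regularizer that suppresses high-frequency structure.
import torch

def feature_tv(fmap):
    dh = (fmap[:, :, 1:, :] - fmap[:, :, :-1, :]).abs().mean()
    dw = (fmap[:, :, :, 1:] - fmap[:, :, :, :-1]).abs().mean()
    return dh + dw

# In training:  loss = task_loss + lambda_tv * feature_tv(features)
```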
- On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness [78.6626755563546] (arXiv, 2021-02-22)
Several new data augmentations have been proposed that significantly improve performance on ImageNet-C.
We develop a new measure of the distance between augmentations and corruptions, called the Minimal Sample Distance, and demonstrate a strong correlation between similarity and performance.
We observe a significant degradation in corruption robustness when the test-time corruptions are sampled to be perceptually dissimilar from ImageNet-C.
Our results suggest that test error can be improved by training on perceptually similar augmentations, and data augmentations may not generalize well beyond the existing benchmark.
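
As a heavily hedged sketch (the paper's exact definition is not reproduced here), a minimal-sample-distance style quantity can be read as the distance from a corruption's mean feature to the nearest augmented sample in some feature space:

```python
# Sketch: nearest-augmented-sample distance to a corruption's feature
# centroid; the feature extractor is assumed, not specified by this sketch.
import numpy as np

def minimal_sample_distance(aug_feats, corr_feats):
    """`aug_feats`: (N, D) features of augmented images;
    `corr_feats`: (M, D) features of images under one corruption."""
    target = corr_feats.mean(axis=0)              # corruption centroid
    dists = np.linalg.norm(aug_feats - target, axis=1)
    return float(dists.min())                     # nearest augmented sample
```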
- A simple way to make neural networks robust against diverse image corruptions [29.225922892332342] (arXiv, 2020-01-16)
We show that a simple but properly tuned training with additive Gaussian and speckle noise generalizes surprisingly well to unseen corruptions.
Adversarial training of the recognition model against uncorrelated worst-case noise leads to an additional increase in performance.
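
A minimal sketch of the augmentation described above, choosing between additive Gaussian and multiplicative speckle noise; the noise scale is illustrative, whereas the paper stresses that tuning it properly matters.

```python
# Sketch: with equal probability, add Gaussian noise or apply speckle
# (signal-dependent) noise to a float image in [0, 1].
import numpy as np

def noise_augment(image, sigma=0.1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(image.shape) * sigma
    if rng.random() < 0.5:
        out = image + noise           # additive Gaussian noise
    else:
        out = image + image * noise   # speckle noise
    return np.clip(out, 0.0, 1.0)
```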