Towards Evaluating the Robustness of Deep Diagnostic Models by
Adversarial Attack
- URL: http://arxiv.org/abs/2103.03438v1
- Date: Fri, 5 Mar 2021 02:24:47 GMT
- Title: Towards Evaluating the Robustness of Deep Diagnostic Models by
Adversarial Attack
- Authors: Mengting Xu, Tao Zhang, Zhongnian Li, Mingxia Liu, Daoqiang Zhang
- Abstract summary: Recent studies have shown that deep diagnostic models may not be robust in the inference process.
An adversarial example is a carefully designed perturbation that is not easily perceived by humans but causes a deep diagnostic model to produce a false output with high confidence.
We have designed two new defense methods to handle adversarial examples in deep diagnostic models.
- Score: 38.480886577088384
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models (with neural networks) have been widely used in
challenging tasks such as computer-aided disease diagnosis based on medical
images. Recent studies have shown deep diagnostic models may not be robust in
the inference process and may pose severe security concerns in clinical
practice. Among all the factors that make the model not robust, the most
serious one is adversarial examples. The so-called "adversarial example" is a
well-designed perturbation that is not easily perceived by humans but results
in a false output of deep diagnostic models with high confidence. In this
paper, we evaluate the robustness of deep diagnostic models by adversarial
attack. Specifically, we have performed two types of adversarial attacks on
three deep diagnostic models in both single-label and multi-label
classification tasks, and found that these models are not reliable when
attacked by adversarial examples. We have further explored how adversarial
examples attack the models, by analyzing their quantitative classification
results, intermediate features, discriminability of features and correlation of
estimated labels for both original/clean images and those adversarial ones. We
have also designed two new defense methods to handle adversarial examples in
deep diagnostic models, i.e., Multi-Perturbations Adversarial Training (MPAdvT)
and Misclassification-Aware Adversarial Training (MAAdvT). The experimental
results have shown that the use of defense methods can significantly improve
the robustness of deep diagnostic models against adversarial attacks.
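As a concrete reference for how such an evaluation is typically run, the sketch below generates L-infinity PGD adversarial examples and then trains on adversarial batches with a randomly varied perturbation budget. It is a generic PyTorch illustration under assumed hyperparameters (eps, step size, number of steps); it is not the paper's exact attack configuration, nor its MPAdvT or MAAdvT procedures.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: iteratively ascend the loss and project back into the eps-ball."""
    images, labels = images.clone().detach(), labels.clone().detach()
    adv = images + torch.empty_like(images).uniform_(-eps, eps)      # random start
    adv = torch.clamp(adv, 0.0, 1.0)

    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()                      # gradient-sign step
        adv = torch.min(torch.max(adv, images - eps), images + eps)   # project to the eps-ball
        adv = torch.clamp(adv, 0.0, 1.0)                              # keep valid pixel range
    return adv.detach()

def adversarial_training_epoch(model, loader, optimizer, device="cuda"):
    """One epoch of adversarial training on PGD examples, resampling the perturbation
    budget per batch (a generic stand-in for multi-perturbation training, not MPAdvT itself)."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        eps = float(torch.empty(1).uniform_(2/255, 8/255))            # varied attack strength
        adv = pgd_attack(model, images, labels, eps=eps, alpha=eps / 4, steps=7)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(adv), labels)                    # fit the adversarial batch
        loss.backward()
        optimizer.step()
```

For the multi-label setting described in the abstract, the cross-entropy loss would be replaced with a binary cross-entropy over the label vector; robustness is then assessed by re-evaluating accuracy (or AUC) on the adversarial inputs.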
Related papers
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech [60.08015780474457]
Alzheimer's Disease (AD) detection has emerged as a promising research area that employs machine learning classification models.
We identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments.
We propose two novel methods: Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), targeting two problems respectively.
arXiv Detail & Related papers (2024-09-22T02:06:05Z)
- Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off [0.6554326244334868]
We argue that a good approximation of the sensitivity to adversarial perturbations requires significantly more effort than what is currently considered satisfactory.
We propose new attacks and combine them with the strongest attacks available in the literature.
Our results also demonstrate that a diverse set of strong attacks is necessary, because different models are often vulnerable to different attacks.
arXiv Detail & Related papers (2024-07-12T10:32:53Z)
- On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models [59.45628259925441]
Volumetric medical segmentation models have achieved significant success on organ and tumor-based segmentation tasks.
Their vulnerability to adversarial attacks remains largely unexplored.
This underscores the importance of investigating the robustness of existing models.
arXiv Detail & Related papers (2024-06-12T17:59:42Z)
- Adversarial Attacks and Dimensionality in Text Classifiers [3.4179091429029382]
Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases.
We study adversarial examples in the field of natural language processing, specifically text classification tasks.
arXiv Detail & Related papers (2024-04-03T11:49:43Z)
- Model X-ray: Detecting Backdoored Models via Decision Boundary [62.675297418960355]
Backdoor attacks pose a significant security vulnerability for deep neural networks (DNNs).
We propose Model X-ray, a novel backdoor detection approach based on the analysis of illustrated two-dimensional (2D) decision boundaries.
Our approach includes two strategies focused on the decision areas dominated by clean samples and the concentration of label distribution.
arXiv Detail & Related papers (2024-02-27T12:42:07Z)
- Measuring the Contribution of Multiple Model Representations in Detecting Adversarial Instances [0.0]
Our paper describes two approaches that incorporate representations from multiple models for detecting adversarial examples.
For many of the scenarios we consider, the results show that performance increases with the number of underlying models used for extracting representations.
arXiv Detail & Related papers (2021-11-13T04:24:57Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Detection Defense Against Adversarial Attacks with Saliency Map [7.736844355705379]
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible to human vision.
Existing defenses tend to harden the robustness of models against adversarial attacks.
We propose a novel method that adds extra noise and uses an inconsistency strategy to detect adversarial examples; a minimal sketch of this idea follows after this list.
arXiv Detail & Related papers (2020-09-06T13:57:17Z)
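The noise-inconsistency idea referenced in the last related paper above can be illustrated with a minimal sketch: perturb the input with small random noise several times and flag it when the model's predicted class is unstable. The hyperparameters (sigma, n_samples, agree_threshold) are illustrative assumptions, and this is not the cited saliency-map method itself.

```python
import torch

@torch.no_grad()
def flag_adversarial(model, images, sigma=0.05, n_samples=8, agree_threshold=0.75):
    """Flag inputs whose predicted class is unstable under small Gaussian noise."""
    model.eval()
    base_pred = model(images).argmax(dim=1)                  # prediction on the input as given
    agree = torch.zeros(images.size(0), device=images.device)
    for _ in range(n_samples):
        noisy = torch.clamp(images + sigma * torch.randn_like(images), 0.0, 1.0)
        agree += (model(noisy).argmax(dim=1) == base_pred).float()
    return (agree / n_samples) < agree_threshold             # True -> suspected adversarial
```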