Towards Evaluating the Robustness of Deep Diagnostic Models by
Adversarial Attack
- URL: http://arxiv.org/abs/2103.03438v1
- Date: Fri, 5 Mar 2021 02:24:47 GMT
- Title: Towards Evaluating the Robustness of Deep Diagnostic Models by
Adversarial Attack
- Authors: Mengting Xu, Tao Zhang, Zhongnian Li, Mingxia Liu, Daoqiang Zhang
- Abstract summary: Recent studies have shown that deep diagnostic models may not be robust in the inference process.
An adversarial example is a well-designed perturbation that is not easily perceived by humans.
We have designed two new defense methods to handle adversarial examples in deep diagnostic models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models (with neural networks) have been widely used in
challenging tasks such as computer-aided disease diagnosis based on medical
images. Recent studies have shown that deep diagnostic models may not be robust in
the inference process and may pose severe security concerns in clinical
practice. Among all the factors that undermine model robustness, the most
serious is the adversarial example: a well-designed perturbation that is not
easily perceived by humans but causes a deep diagnostic model to produce a
false output with high confidence. In this
paper, we evaluate the robustness of deep diagnostic models by adversarial
attack. Specifically, we have performed two types of adversarial attacks on
three deep diagnostic models in both single-label and multi-label
classification tasks, and found that these models are not reliable when
attacked by adversarial examples. We have further explored how adversarial
examples attack the models, by analyzing their quantitative classification
results, intermediate features, discriminability of features and correlation of
estimated labels for both original/clean images and those adversarial ones. We
have also designed two new defense methods to handle adversarial examples in
deep diagnostic models, i.e., Multi-Perturbations Adversarial Training (MPAdvT)
and Misclassification-Aware Adversarial Training (MAAdvT). The experimental
results have shown that the use of defense methods can significantly improve
the robustness of deep diagnostic models against adversarial attacks.
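The kind of perturbation the abstract describes can be illustrated with a one-step fast-gradient-sign (FGSM-style) attack on a toy logistic "diagnostic" model. The weights, input, and epsilon below are illustrative assumptions, not the paper's actual setup (the paper attacks deep networks on medical images); the sketch only shows how a small, structured perturbation flips a confident prediction.

```python
import numpy as np

# Toy logistic "diagnostic" model: p(disease) = sigmoid(w.x + b).
# w, b, x, and eps are illustrative values, not from the paper.
w = np.array([2.0, -1.5, 0.5])
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x + b)

def fgsm(x, y_true, eps):
    """One-step FGSM: move x by eps along the sign of the loss gradient.
    For the logistic loss, dL/dx = (p - y_true) * w."""
    p = predict(x)
    grad_x = (p - y_true) * w
    return x + eps * np.sign(grad_x)

x = np.array([0.4, -0.2, 0.1])   # "clean" input, confidently positive
y = 1.0                          # true label
x_adv = fgsm(x, y, eps=0.4)      # perturbation bounded by eps per feature

print(predict(x), predict(x_adv))  # ~0.78 -> ~0.41: prediction flips across 0.5
```

Each feature moves by at most eps, so the perturbation can stay small enough to be hard to perceive while still crossing the decision boundary; the paper's MPAdvT defense builds on the same idea by training on such perturbations with multiple perturbation levels.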
Related papers
- Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off
We argue that a good approximation of the sensitivity to adversarial perturbations requires significantly more effort than what is currently considered satisfactory.
We propose new attacks and combine them with the strongest attacks available in the literature.
Our results also demonstrate that a diverse set of strong attacks is necessary, because different models are often vulnerable to different attacks.
arXiv Detail & Related papers (2024-07-12T10:32:53Z)
- On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models
Volumetric medical segmentation models have achieved significant success on organ and tumor-based segmentation tasks.
Their vulnerability to adversarial attacks remains largely unexplored.
This underscores the importance of investigating the robustness of existing models.
arXiv Detail & Related papers (2024-06-12T17:59:42Z)
- Adversarial Attacks and Dimensionality in Text Classifiers
Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases.
We study adversarial examples in the field of natural language processing, specifically text classification tasks.
arXiv Detail & Related papers (2024-04-03T11:49:43Z)
- On Evaluating the Adversarial Robustness of Semantic Segmentation Models
A number of adversarial training approaches have been proposed as a defense against adversarial perturbation.
We show for the first time that a number of models in previous work that are claimed to be robust are in fact not robust at all.
We then evaluate simple adversarial training algorithms that produce reasonably robust models even under our set of strong attacks.
arXiv Detail & Related papers (2023-06-25T11:45:08Z)
- Adversarial Attack and Defense for Medical Image Analysis: Methods and Applications
We present a comprehensive survey on advances in adversarial attack and defense for medical image analysis.
We provide a unified theoretical framework for different types of adversarial attack and defense methods for medical image analysis.
For a fair comparison, we establish a new benchmark for adversarially robust medical diagnosis models.
arXiv Detail & Related papers (2023-03-24T16:38:58Z)
- Semantic Image Attack for Visual Model Diagnosis
In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), an adversarial-attack-based method that generates semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z)
- Measuring the Contribution of Multiple Model Representations in Detecting Adversarial Instances
Our paper describes two approaches that incorporate representations from multiple models for detecting adversarial examples.
For many of the scenarios we consider, the results show that performance increases with the number of underlying models used for extracting representations.
arXiv Detail & Related papers (2021-11-13T04:24:57Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifiers
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Detection Defense Against Adversarial Attacks with Saliency Map
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible to human vision.
Existing defenses tend to harden the robustness of models against adversarial attacks.
We propose a novel method that adds extra noise and uses an inconsistency strategy to detect adversarial examples.
arXiv Detail & Related papers (2020-09-06T13:57:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.