Heat and Blur: An Effective and Fast Defense Against Adversarial
Examples
- URL: http://arxiv.org/abs/2003.07573v1
- Date: Tue, 17 Mar 2020 08:11:18 GMT
- Title: Heat and Blur: An Effective and Fast Defense Against Adversarial
Examples
- Authors: Haya Brama and Tal Grinshpoun
- Abstract summary: We propose a simple defense that combines feature visualization with input modification.
We extract relevance heatmaps from the network's activity and use them as the basis for our defense, in which the adversarial effects are corrupted by massive blurring.
We also provide a new evaluation metric that can capture the effects of both attacks and defenses more thoroughly and descriptively.
- Score: 2.2843885788439797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing incorporation of artificial neural networks (NNs) into many
fields, and especially into life-critical systems, is restrained by their
vulnerability to adversarial examples (AEs). Some existing defense methods can
increase NNs' robustness, but they often require special architecture or
training procedures and are irrelevant to already trained models. In this
paper, we propose a simple defense that combines feature visualization with
input modification, and can, therefore, be applicable to various pre-trained
networks. By reviewing several interpretability methods, we gain new insights
regarding the influence of AEs on NNs' computation. Based on that, we
hypothesize that information about the "true" object is preserved within the
NN's activity, even when the input is adversarial, and present a feature
visualization version that can extract that information in the form of
relevance heatmaps. We then use these heatmaps as a basis for our defense, in
which the adversarial effects are corrupted by massive blurring. We also
provide a new evaluation metric that can capture the effects of both attacks
and defenses more thoroughly and descriptively, and demonstrate the
effectiveness of the defense and the utility of the suggested evaluation
measurement with VGG19 results on the ImageNet dataset.
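As a rough illustration of the defense described in the abstract, the sketch below combines a relevance heatmap with heavy Gaussian blurring and re-classifies the result. It is only a minimal approximation under stated assumptions: a plain input-gradient saliency map stands in for the paper's feature-visualization heatmaps, torchvision's pretrained VGG19 serves as the classifier, and the blur kernel size and sigma are illustrative values, not the authors' settings.
```python
# Minimal sketch of the heat-and-blur idea (not the authors' exact pipeline):
# 1) estimate a relevance heatmap for the (possibly adversarial) input,
# 2) blur the image heavily and keep detail only where relevance is high,
# 3) re-classify the modified input.
import torch
import torchvision.transforms.functional as TF
from torchvision.models import vgg19, VGG19_Weights

model = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).eval()

def heat_and_blur(image: torch.Tensor, kernel_size: int = 31, sigma: float = 10.0):
    """image: ImageNet-normalized (1, 3, H, W) tensor; returns (predicted label, defended image)."""
    # Heat: a simple input-gradient saliency map (a stand-in for the paper's
    # relevance heatmaps), normalized to [0, 1].
    x = image.clone().requires_grad_(True)
    logits = model(x)
    logits[0, logits[0].argmax()].backward()
    heat = x.grad.abs().amax(dim=1, keepdim=True)          # (1, 1, H, W)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)

    # Blur: corrupt low-relevance regions with massive Gaussian blurring,
    # keeping the original pixels only where the heatmap is strong.
    blurred = TF.gaussian_blur(image, kernel_size=[kernel_size] * 2, sigma=[sigma, sigma])
    defended = heat * image + (1.0 - heat) * blurred

    with torch.no_grad():
        label = model(defended).argmax(dim=1).item()
    return label, defended
```
The intent of the blend is that the heavy blur washes out adversarial perturbations in regions the heatmap deems irrelevant, while the evidence for the "true" object is preserved where relevance is high.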
Related papers
- Protecting Feed-Forward Networks from Adversarial Attacks Using Predictive Coding [0.20718016474717196]
An adversarial example is a modified input image designed to cause a Machine Learning (ML) model to make a mistake.
This study presents a practical and effective solution -- using predictive coding networks (PCnets) as an auxiliary step for adversarial defence.
arXiv Detail & Related papers (2024-10-31T21:38:05Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
However, FL is vulnerable to poisoning attacks that undermine model integrity through both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed MESAS is the first defense robust against strong adaptive adversaries that remains effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Adversarial Example Defense via Perturbation Grading Strategy [17.36107815256163]
Deep Neural Networks (DNNs) have been widely used in many fields.
Adversarial examples carry tiny perturbations yet can greatly mislead the predictions of DNNs.
Researchers have proposed various defense methods to protect DNNs.
This paper assigns different defense strategies to adversarial perturbations of different strengths by grading the perturbations on the input examples.
arXiv Detail & Related papers (2022-12-16T08:35:21Z)
- Evaluation of Neural Networks Defenses and Attacks using NDCG and Reciprocal Rank Metrics [6.6389732792316]
We present two metrics which are specifically designed to measure the effect of attacks, or the recovery effect of defenses, on the output of neural networks in classification tasks.
Inspired by the normalized discounted cumulative gain and the reciprocal rank metrics used in information retrieval literature, we treat the neural network predictions as ranked lists of results.
Compared to the common classification metrics, our proposed metrics demonstrate superior informativeness and distinctiveness (a minimal sketch of this ranking-based scoring appears after this list).
arXiv Detail & Related papers (2022-01-10T12:54:45Z)
- Salient Feature Extractor for Adversarial Defense on Deep Neural Networks [2.993911699314388]
Motivated by the observation that adversarial examples stem from non-robust features that models learn from the original dataset, we propose the concepts of salient features (SF) and trivial features (TF).
We put forward a novel detection and defense method named salient feature extractor (SFE) to defend against adversarial attacks.
arXiv Detail & Related papers (2021-05-14T12:56:06Z)
- Attack Agnostic Adversarial Defense via Visual Imperceptible Bound [70.72413095698961]
This research aims to design a defense model that is robust within a certain bound against both seen and unseen adversarial attacks.
The proposed defense model is evaluated on the MNIST, CIFAR-10, and Tiny ImageNet databases.
The proposed algorithm is attack agnostic, i.e., it does not require any knowledge of the attack algorithm.
arXiv Detail & Related papers (2020-10-25T23:14:26Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
- Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z)
- Optimizing Information Loss Towards Robust Neural Networks [0.0]
Neural Networks (NNs) are vulnerable to adversarial examples.
We present a new training approach that we call entropic retraining.
Based on an analysis inspired by information theory, entropic retraining mimics the effects of adversarial training without the laborious generation of adversarial examples.
arXiv Detail & Related papers (2020-08-07T10:12:31Z)
- Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
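Following up on the forward reference in the NDCG and reciprocal-rank entry above, the sketch below shows one generic way to treat classifier outputs as ranked lists and score them with reciprocal rank and NDCG. The relevance assignment, which takes graded relevances from a clean reference prediction, is an assumption made here for illustration; the cited paper's exact formulation may differ.
```python
# Generic ranking-style scoring of classifier outputs (illustrative only).
import numpy as np

def reciprocal_rank(probs: np.ndarray, true_label: int) -> float:
    """1 / rank of the true label when classes are sorted by predicted probability."""
    ranking = np.argsort(probs)[::-1]                     # best class first
    rank = int(np.where(ranking == true_label)[0][0]) + 1
    return 1.0 / rank

def ndcg_against_reference(probs: np.ndarray, reference_probs: np.ndarray, k: int = 10) -> float:
    """NDCG@k of the attacked/defended ranking, with per-class relevance taken
    from a clean reference prediction (an assumed, simple relevance scheme)."""
    k = min(k, probs.size)
    relevance = reference_probs                           # graded relevance per class
    order = np.argsort(probs)[::-1][:k]                   # ranking induced by the model under test
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    dcg = float(np.sum(relevance[order] * discounts))
    idcg = float(np.sum(np.sort(relevance)[::-1][:k] * discounts))
    return dcg / idcg if idcg > 0 else 0.0
```
Under such a scheme, an attack that pushes the true class far down the ranking drives the reciprocal rank toward zero, while a defense that restores the clean ordering pulls the NDCG back toward 1, which is more descriptive than a binary correct/incorrect accuracy count.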
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.