Detecting Adversarial Examples in Batches -- a geometrical approach
- URL: http://arxiv.org/abs/2206.08738v1
- Date: Fri, 17 Jun 2022 12:52:43 GMT
- Title: Detecting Adversarial Examples in Batches -- a geometrical approach
- Authors: Danush Kumar Venkatesh and Peter Steinbach
- Abstract summary: We introduce two geometric metrics, density and coverage, and evaluate their use in detecting adversarial samples in batches of unseen data.
We empirically study these metrics using MNIST and two real-world biomedical datasets from MedMNIST, subjected to two different adversarial attacks.
Our experiments show promising results for both metrics in detecting adversarial examples.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Many deep learning methods have successfully solved complex tasks in computer
vision and speech recognition applications. Nonetheless, the robustness of
these models has been found to be vulnerable to perturbed inputs or adversarial
examples, which are imperceptible to the human eye, but lead the model to
erroneous output decisions. In this study, we adapt and introduce two geometric
metrics, density and coverage, and evaluate their use in detecting adversarial
samples in batches of unseen data. We empirically study these metrics using
MNIST and two real-world biomedical datasets from MedMNIST, subjected to two
different adversarial attacks. Our experiments show promising results for both
metrics in detecting adversarial examples. We believe that this work can lay the
ground for further study on these metrics' use in deployed machine learning
systems to monitor for possible attacks by adversarial examples or related
pathologies such as dataset shift.
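As a concrete illustration of the two metrics, the sketch below computes density and coverage of an unseen batch against a clean reference set, following the k-nearest-neighbour definitions of Naeem et al. (2020) that this paper adapts. The function name, the choice of feature space, and the default k=5 are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the density and coverage metrics
# adapted in this paper for batch-level adversarial detection.
import numpy as np
from scipy.spatial.distance import cdist

def density_and_coverage(reference, batch, k=5):
    """Compare an unseen batch against a clean reference set.

    reference : (N, d) array of clean feature vectors
    batch     : (M, d) array of feature vectors from the unseen batch
    Returns (density, coverage); low values suggest the batch lies
    off the reference manifold, e.g. due to adversarial perturbation.
    """
    # Radius of each reference point's k-NN ball, estimated on the reference set.
    ref_dists = cdist(reference, reference)        # (N, N) pairwise distances
    knn_radii = np.sort(ref_dists, axis=1)[:, k]   # k-th neighbour (column 0 is self)

    # Distances from every reference point to every batch point.
    cross = cdist(reference, batch)                # (N, M)
    inside = cross <= knn_radii[:, None]           # batch point j inside ball of reference point i

    density = inside.sum() / (k * batch.shape[0])  # avg. number of reference balls covering a batch point
    coverage = inside.any(axis=1).mean()           # fraction of reference balls hit by the batch
    return density, coverage

# Hypothetical usage: features could be, e.g., penultimate-layer activations.
# clean_feats, test_feats = ...
# d, c = density_and_coverage(clean_feats, test_feats)
```

In this setting, a batch whose density and coverage fall well below the values observed on held-out clean batches would be flagged as potentially adversarial.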
Related papers
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces ADer, a comprehensive visual anomaly detection benchmark built as a modular framework that is extensible to new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Adversarial Attacks and Dimensionality in Text Classifiers [3.4179091429029382]
Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases.
We study adversarial examples in the field of natural language processing, specifically text classification tasks.
arXiv Detail & Related papers (2024-04-03T11:49:43Z) - On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework that induces different responses to UAPs from normal versus adversarial samples.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.
By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z) - EMShepherd: Detecting Adversarial Samples via Side-channel Leakage [6.868995628617191]
Adversarial attacks have disastrous consequences for deep learning-empowered critical applications.
We propose a framework, EMShepherd, that captures electromagnetic traces of model execution, processes the traces, and exploits them for adversarial detection.
We demonstrate that our air-gapped EMShepherd can effectively detect different adversarial attacks on a commonly used FPGA deep learning accelerator.
arXiv Detail & Related papers (2023-03-27T19:38:55Z) - Towards Generating Adversarial Examples on Mixed-type Data [32.41305735919529]
We propose a novel attack algorithm M-Attack, which can effectively generate adversarial examples in mixed-type data.
Based on M-Attack, attackers can attempt to mislead the targeted classification model's prediction, by only slightly perturbing both the numerical and categorical features in the given data samples.
Our generated adversarial examples can evade potential detection models, which makes the attack indeed insidious.
arXiv Detail & Related papers (2022-10-17T20:17:21Z) - Measuring the Contribution of Multiple Model Representations in
Detecting Adversarial Instances [0.0]
Our paper describes two approaches that incorporate representations from multiple models for detecting adversarial examples.
For many of the scenarios we consider, the results show that performance increases with the number of underlying models used for extracting representations.
arXiv Detail & Related papers (2021-11-13T04:24:57Z) - Explainable Adversarial Attacks in Deep Neural Networks Using Activation
Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z) - Adversarial Examples for Unsupervised Machine Learning Models [71.81480647638529]
Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
arXiv Detail & Related papers (2021-03-02T17:47:58Z) - Closeness and Uncertainty Aware Adversarial Examples Detection in
Adversarial Machine Learning [0.7734726150561088]
We explore and assess the use of two different groups of metrics for detecting adversarial samples.
We introduce a new feature for adversarial detection, and we show that the performances of all these metrics heavily depend on the strength of the attack being used.
arXiv Detail & Related papers (2020-12-11T14:44:59Z) - On the Transferability of Adversarial Attacks against Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)