Detection of Adversarial Supports in Few-shot Classifiers Using Feature
Preserving Autoencoders and Self-Similarity
- URL: http://arxiv.org/abs/2012.06330v1
- Date: Wed, 9 Dec 2020 14:13:41 GMT
- Title: Detection of Adversarial Supports in Few-shot Classifiers Using Feature
Preserving Autoencoders and Self-Similarity
- Authors: Yi Xiang Marcus Tan, Penny Chong, Jiamei Sun, Yuval Elovici, Alexander
Binder
- Abstract summary: We propose a detection strategy to highlight adversarial support sets.
We make use of feature-preserving autoencoder filtering and the concept of self-similarity of a support set to perform this detection.
Our method is attack-agnostic and, to the best of our knowledge, the first to explore detection for few-shot classifiers.
- Score: 89.26308254637702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot classifiers excel under limited training samples, making them
useful in real-world applications. However, the advent of adversarial samples
threatens the efficacy of such classifiers. For them to remain reliable,
defences against such attacks must be explored. However, a closer examination of
the prior literature reveals a large gap in this domain. Hence, in this work, we
propose a detection strategy to highlight adversarial support sets, which aim to
destroy a few-shot classifier's understanding of a certain class of objects. We
make use of feature-preserving autoencoder filtering and the concept of
self-similarity of a support set to perform this detection. As such, our method
is attack-agnostic and, to the best of our knowledge, the first to explore
detection for few-shot classifiers. Our evaluation on the miniImagenet
and CUB datasets is promising, showing high AUROC scores for detection in
general.
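The self-similarity idea described in the abstract can be illustrated with a minimal sketch: embeddings of a clean support set for one class should cluster tightly, while a poisoned set containing adversarial supports should show lower pairwise similarity. The function names, the use of cosine similarity over pre-extracted embeddings, and the threshold value below are illustrative assumptions, not the paper's actual pipeline (which also involves feature-preserving autoencoder filtering).

```python
import numpy as np

def cosine_sim_matrix(feats):
    # feats: (n, d) array of support-set feature embeddings
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    normed = feats / np.clip(norms, 1e-12, None)
    return normed @ normed.T

def self_similarity_score(feats):
    # Mean pairwise cosine similarity, excluding the diagonal.
    sim = cosine_sim_matrix(feats)
    n = sim.shape[0]
    return sim[~np.eye(n, dtype=bool)].mean()

def flag_adversarial_support(feats, threshold=0.5):
    # Flag a support set as suspicious when its self-similarity is low.
    # The threshold is a hypothetical value for this toy example.
    return self_similarity_score(feats) < threshold

# Toy data: a clean support set of 5 tightly clustered embeddings,
# and a poisoned copy with one embedding pushed far away.
rng = np.random.default_rng(0)
clean = rng.normal(loc=1.0, scale=0.05, size=(5, 16))
poisoned = clean.copy()
poisoned[0] = -clean[0]

print(flag_adversarial_support(clean))     # low-similarity set -> not flagged
print(flag_adversarial_support(poisoned))  # outlier support -> flagged
```

In practice the threshold would be calibrated on held-out clean support sets (e.g. to fix a false-positive rate), and AUROC over the raw score would be the natural evaluation metric, as in the paper.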
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a $>99\%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Object-fabrication Targeted Attack for Object Detection [54.10697546734503]
Adversarial attacks on object detection comprise targeted and untargeted attacks.
A new object-fabrication targeted attack mode can mislead detectors to fabricate extra false objects with specific target labels.
arXiv Detail & Related papers (2022-12-13T08:42:39Z) - MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples
Detectors [24.296350262025552]
We propose a novel framework, called MEAD, for evaluating detectors based on several attack strategies.
Among them, we make use of three new objectives to generate attacks.
The proposed performance metric is based on the worst-case scenario.
arXiv Detail & Related papers (2022-06-30T17:05:45Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - Using Anomaly Feature Vectors for Detecting, Classifying and Warning of
Outlier Adversarial Examples [4.096598295525345]
We present DeClaW, a system for detecting, classifying, and warning of adversarial inputs presented to a classification neural network.
Preliminary findings suggest that AFVs can help distinguish among several types of adversarial attacks with close to 93% accuracy on the CIFAR-10 dataset.
arXiv Detail & Related papers (2021-07-01T16:00:09Z) - Learning to Separate Clusters of Adversarial Representations for Robust
Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features as a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster, and to leverage this distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z) - ATRO: Adversarial Training with a Rejection Option [10.36668157679368]
This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples.
Applying the adversarial training objective to both a classifier and a rejection function simultaneously, we can choose to abstain from classification when the model has insufficient confidence to classify a test data point.
arXiv Detail & Related papers (2020-10-24T14:05:03Z) - Open-set Adversarial Defense [93.25058425356694]
We show that open-set recognition systems are vulnerable to adversarial attacks.
Motivated by this observation, we emphasize the need for an Open-Set Adversarial Defense (OSAD) mechanism.
This paper proposes an Open-Set Defense Network (OSDN) as a solution to the OSAD problem.
arXiv Detail & Related papers (2020-09-02T04:35:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.