Be Your Own Neighborhood: Detecting Adversarial Example by the
Neighborhood Relations Built on Self-Supervised Learning
- URL: http://arxiv.org/abs/2209.00005v1
- Date: Wed, 31 Aug 2022 08:18:44 GMT
- Title: Be Your Own Neighborhood: Detecting Adversarial Example by the
Neighborhood Relations Built on Self-Supervised Learning
- Authors: Zhiyuan He, Yijun Yang, Pin-Yu Chen, Qiang Xu, Tsung-Yi Ho
- Abstract summary: This paper presents a novel AE detection framework, named
BEYOND, for trustworthy predictions. BEYOND performs the detection by
distinguishing an AE's abnormal relation with its augmented versions.
An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the
representation and predict the label.
- Score: 64.78972193105443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have achieved excellent performance in various
fields. However, DNNs' vulnerability to Adversarial Examples (AE) hinders their
deployments to safety-critical applications. This paper presents a novel AE
detection framework, named BEYOND, for trustworthy predictions. BEYOND performs
the detection by distinguishing the AE's abnormal relation with its augmented
versions, i.e. neighbors, from two perspectives: representation similarity and
label consistency. An off-the-shelf Self-Supervised Learning (SSL) model is
used to extract the representation and predict the label for its highly
informative representation capacity compared to supervised learning models. For
clean samples, their representations and predictions are closely consistent
with their neighbors, whereas those of AEs differ greatly. Furthermore, we
explain this observation and show that by leveraging this discrepancy BEYOND
can effectively detect AEs. We develop a rigorous justification for the
effectiveness of BEYOND. Furthermore, as a plug-and-play model, BEYOND can
easily cooperate with an Adversarially Trained Classifier (ATC), achieving
state-of-the-art (SOTA) robust accuracy. Experimental results show that
BEYOND outperforms baselines by a large margin, especially under adaptive
attacks. Empowered by the robust relation net built on SSL, we found that
BEYOND outperforms baselines in terms of both detection ability and speed. Our
code will be publicly available.
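The detection pipeline described above (augment the input into neighbors, then check representation similarity and label consistency against them) can be sketched as follows. The encoder, classifier head, Gaussian-noise augmentation, and thresholds here are toy stand-ins for illustration only, not the paper's actual SSL model or settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for an off-the-shelf SSL encoder and its classification head.
# Random projections keep the sketch self-contained and runnable.
W_ENC = rng.normal(size=(32, 8))   # "encoder": 32-d input -> 8-d representation
W_CLS = rng.normal(size=(8, 10))   # "classifier head": 8-d -> 10 classes

def encode(x):
    z = x @ W_ENC
    return z / np.linalg.norm(z)   # unit-norm representation

def predict(x):
    return int(np.argmax(encode(x) @ W_CLS))

def augment(x, k=8, sigma=0.05):
    """Generate k noisy 'neighbors' of x (stand-in for crops/flips/etc.)."""
    return [x + rng.normal(scale=sigma, size=x.shape) for _ in range(k)]

def beyond_score(x, k=8):
    """Return (mean representation similarity, label-consistency rate).

    Clean samples should score high on both; AEs sit in unstable regions,
    so their neighbors drift in representation and flip labels.
    """
    z, y = encode(x), predict(x)
    neighbors = augment(x, k)
    sims = [float(z @ encode(n)) for n in neighbors]
    consistency = sum(predict(n) == y for n in neighbors) / k
    return float(np.mean(sims)), consistency

def is_adversarial(x, sim_thresh=0.9, label_thresh=0.7):
    """Flag x if either relation with its neighbors looks abnormal."""
    sim, cons = beyond_score(x)
    return sim < sim_thresh or cons < label_thresh
```

With small augmentation noise, a clean sample's neighbors stay close in representation space and keep the same predicted label; an AE, crafted to cross a decision boundary, typically fails one or both checks.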
Related papers
- Non-Linear Outlier Synthesis for Out-of-Distribution Detection [5.019613806273252]
We present NCIS, which enhances the quality of synthetic outliers by operating directly in the diffusion model's embedding space.
We demonstrate that these improvements yield new state-of-the-art OOD detection results on standard ImageNet100 and CIFAR100 benchmarks.
arXiv Detail & Related papers (2024-11-20T09:47:29Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to $17.0\%$ AUROC improvement over state-of-the-art methods and can serve as a simple yet strong baseline in this under-developed area.
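For context, energy-based detectors of this kind threshold on the negative free energy $-E(x) = T \log \sum_i \exp(f_i(x)/T)$ of the model's logits. A generic numpy version (the plain score only, not the graph-specific propagation GNNSafe builds on top) might look like:

```python
import numpy as np

def neg_energy(logits, T=1.0):
    """Negative free energy -E(x) = T * logsumexp(logits / T).

    Higher values indicate more confident, in-distribution inputs."""
    z = np.asarray(logits, dtype=float) / T
    m = z.max()  # shift for numerical stability of logsumexp
    return float(T * (m + np.log(np.exp(z - m).sum())))

def is_ood(logits, threshold=2.0):
    """Flag inputs whose energy score falls below a tuned threshold."""
    return neg_energy(logits) < threshold
```

A sharply peaked logit vector yields a high score, while flat, low-confidence logits yield a score near $\log C$ for $C$ classes; the threshold is tuned on held-out in-distribution data.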
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Detecting and Recovering Adversarial Examples from Extracting Non-robust
and Highly Predictive Adversarial Perturbations [15.669678743693947]
Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples (AEs), which are maliciously designed to fool target models.
We propose a model-free AE detection method, the whole process of which is free from querying the victim model.
arXiv Detail & Related papers (2022-06-30T08:48:28Z) - No Shifted Augmentations (NSA): compact distributions for robust
self-supervised Anomaly Detection [4.243926243206826]
Unsupervised Anomaly detection (AD) requires building a notion of normalcy, distinguishing in-distribution (ID) and out-of-distribution (OOD) data.
We investigate how the geometrical compactness of the ID feature distribution makes isolating and detecting outliers easier.
We propose novel architectural modifications to the self-supervised feature learning step, that enable such compact distributions for ID data to be learned.
arXiv Detail & Related papers (2022-03-19T15:55:32Z) - What You See is Not What the Network Infers: Detecting Adversarial
Examples Based on Semantic Contradiction [14.313178290347293]
Adversarial examples (AEs) pose severe threats to the applications of deep neural networks (DNNs) to safety-critical domains.
We propose a novel AE detection framework based on the very nature of AEs.
We show that ContraNet outperforms existing solutions by a large margin, especially under adaptive attacks.
arXiv Detail & Related papers (2022-01-24T13:15:31Z) - Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z) - Exploring Robustness of Unsupervised Domain Adaptation in Semantic
Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision tasks (e.g., rotation and jigsaw) benefit image tasks such as classification and recognition, they fail to provide the critical supervision signals that could learn discriminative representations for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z) - WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z) - Explaining and Improving Model Behavior with k Nearest Neighbor
Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
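The kNN idea in the last entry (retrieve the training examples whose representations are closest to a query, then read predictions or suspicious patterns off those neighbors) can be sketched with plain numpy. The representations and labels below are hypothetical placeholders for features extracted from a real model:

```python
import numpy as np

def knn_neighbors(train_reps, query_rep, k=3):
    """Indices of the k training examples closest (by cosine) to the query."""
    t = train_reps / np.linalg.norm(train_reps, axis=1, keepdims=True)
    q = query_rep / np.linalg.norm(query_rep)
    sims = t @ q                      # cosine similarity to every train point
    return np.argsort(-sims)[:k]      # top-k most similar indices

def knn_predict(train_reps, train_labels, query_rep, k=3):
    """Majority label among the k nearest training representations."""
    idx = knn_neighbors(train_reps, query_rep, k)
    labels = [train_labels[i] for i in idx]
    return max(set(labels), key=labels.count)
```

Disagreement between this kNN prediction and the model's own output is one simple signal for inspecting suspicious (e.g., adversarial or spuriously correlated) inputs, and the retrieved neighbors themselves identify which training examples drive the prediction.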
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.