Utilizing Network Properties to Detect Erroneous Inputs
- URL: http://arxiv.org/abs/2002.12520v3
- Date: Fri, 24 Mar 2023 19:01:41 GMT
- Title: Utilizing Network Properties to Detect Erroneous Inputs
- Authors: Matt Gorbett, Nathaniel Blanchard
- Abstract summary: We train a linear SVM classifier to detect erroneous data using hidden and softmax feature vectors of pre-trained neural networks.
Our results indicate that these faulty data types generally exhibit linearly separable activation properties from correct examples.
We experimentally validate our findings across a diverse range of datasets, domains, pre-trained models, and adversarial attacks.
- Score: 0.76146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are vulnerable to a wide range of erroneous inputs such as
adversarial, corrupted, out-of-distribution, and misclassified examples. In
this work, we train a linear SVM classifier to detect these four types of
erroneous data using hidden and softmax feature vectors of pre-trained neural
networks. Our results indicate that these faulty data types generally exhibit
linearly separable activation properties from correct examples, giving us the
ability to reject bad inputs with no extra training or overhead. We
experimentally validate our findings across a diverse range of datasets,
domains, pre-trained models, and adversarial attacks.
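The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a torchvision ResNet-18 as the pre-trained network, takes the penultimate (post-pooling) activations as the "hidden" features, and uses scikit-learn's LinearSVC; the layer choice, batch sizes, and the random tensors standing in for correct and erroneous inputs are all placeholders.

```python
# Minimal sketch (not the authors' code): train a linear SVM on concatenated
# hidden-layer and softmax feature vectors of a frozen, pre-trained network
# to separate correct inputs from erroneous ones.
import numpy as np
import torch
import torch.nn.functional as F
from torchvision import models
from sklearn.svm import LinearSVC

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

# Capture the penultimate activations (the assumed "hidden" features) via a hook.
hidden = {}
model.avgpool.register_forward_hook(
    lambda module, inp, out: hidden.update(feat=torch.flatten(out, 1))
)

@torch.no_grad()
def feature_vector(x):
    """Concatenate hidden activations and softmax probabilities for a batch."""
    probs = F.softmax(model(x), dim=1)
    return torch.cat([hidden["feat"], probs], dim=1).numpy()

# Placeholders: batches of correct examples vs. erroneous ones (adversarial,
# corrupted, out-of-distribution, or misclassified) would be supplied here.
correct_x = torch.randn(32, 3, 224, 224)
erroneous_x = torch.randn(32, 3, 224, 224)

X = np.concatenate([feature_vector(correct_x), feature_vector(erroneous_x)])
y = np.array([0] * 32 + [1] * 32)      # 0 = correct, 1 = erroneous

detector = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
print("flagged as erroneous:", detector.predict(feature_vector(erroneous_x)).mean())
```

Because the base network stays frozen, rejecting an input at test time reduces to one extra linear scoring of features the forward pass already produces.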
Related papers
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel smooth regularization, applied during fine-tuning, that rectifies these structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions are commonly assumed to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data.
We introduce a simple, generic, and generalisable framework in which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z)
- DI-NIDS: Domain Invariant Network Intrusion Detection System [9.481792073140204]
Domain adaptation techniques have been successful in various applications, such as computer vision.
In the case of network intrusion detection, however, state-of-the-art domain adaptation approaches have had limited success.
We propose to extract domain invariant features using adversarial domain adaptation from multiple network domains.
arXiv Detail & Related papers (2022-10-15T10:26:22Z)
- VPN: Verification of Poisoning in Neural Networks [11.221552724154988]
We study another neural network security issue, namely data poisoning.
In this case, an attacker inserts a trigger into a subset of the training data such that, at test time, the presence of the trigger in an input causes the trained model to misclassify it into some target class.
We show how to formulate the check for data poisoning as a property that can be checked with off-the-shelf verification tools.
arXiv Detail & Related papers (2022-05-08T15:16:05Z)
- Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z)
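To illustrate the "truncation" idea from the sparse-attacks entry above: for a linear scorer, one can discard the few largest-magnitude per-coordinate contributions before summing, so that an attacker who controls only a handful of coordinates gains limited influence. This is a generic toy sketch of that reading, not the paper's actual defense or its adversarial-training procedure; the function name, the choice of k, and the toy attack are illustrative.

```python
# Toy illustration (not the paper's method): a truncated inner product that
# drops the k largest-magnitude per-coordinate contributions w_i * x_i.
import numpy as np

def truncated_score(w, x, k):
    contrib = w * x
    keep = np.argsort(np.abs(contrib))[: len(contrib) - k]  # discard top-k magnitudes
    return contrib[keep].sum()

rng = np.random.default_rng(0)
w, x = rng.normal(size=100), rng.normal(size=100)

# Sparse attack: perturb the 5 most influential coordinates to push the
# plain score toward the opposite class.
idx = np.argsort(-np.abs(w))[:5]
x_adv = x.copy()
x_adv[idx] -= np.sign(w @ x) * np.sign(w[idx]) * 50.0

print("plain score:     clean %+.1f   attacked %+.1f" % (w @ x, w @ x_adv))
print("truncated score: clean %+.1f   attacked %+.1f"
      % (truncated_score(w, x, 10), truncated_score(w, x_adv, 10)))
# The plain score swings by hundreds; the truncated score changes far less.
```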
- Feature Encoding with AutoEncoders for Weakly-supervised Anomaly Detection [46.76220474310698]
Weakly-supervised anomaly detection aims at learning an anomaly detector from a limited amount of labeled data and abundant unlabeled data.
Recent works build deep neural networks for anomaly detection by discriminatively mapping normal and abnormal samples to different regions of the feature space or by fitting them to different distributions.
This paper proposes a novel strategy to transform the input data into a more meaningful representation that could be used for anomaly detection.
arXiv Detail & Related papers (2021-05-22T16:23:05Z)
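A rough sketch of the feature-encoding idea in the entry above (not the paper's architecture; the dimensions, names, and the specific choice of latent code plus reconstruction error as the encoded representation are illustrative assumptions): an autoencoder is pre-trained on the abundant unlabeled data, and each input is then re-represented in a form a downstream weakly-supervised detector can score with the few available labels.

```python
# Rough sketch (not the paper's model): pre-train an autoencoder on unlabeled
# data, then encode inputs as [latent code, reconstruction error] for a
# downstream weakly-supervised anomaly detector.
import torch
import torch.nn as nn

class AutoencoderEncoder(nn.Module):
    def __init__(self, d_in=64, d_latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, d_latent))
        self.dec = nn.Sequential(nn.Linear(d_latent, 32), nn.ReLU(), nn.Linear(32, d_in))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

    def encode(self, x):
        z, x_hat = self(x)
        err = ((x - x_hat) ** 2).mean(dim=1, keepdim=True)   # per-sample reconstruction error
        return torch.cat([z, err], dim=1)                    # transformed representation

ae = AutoencoderEncoder()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
unlabeled = torch.randn(256, 64)          # stand-in for the abundant unlabeled data
for _ in range(200):                      # reconstruction pre-training
    _, x_hat = ae(unlabeled)
    loss = ((unlabeled - x_hat) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    feats = ae.encode(torch.randn(16, 64))   # (16, 9): features for the labeled detector
```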
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Robust Variational Autoencoder for Tabular Data with Beta Divergence [0.0]
We propose a robust variational autoencoder for tabular data with mixed categorical and continuous features.
Our results on the anomaly detection application for network traffic datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-06-15T08:09:34Z)
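For reference, the beta divergence named in the title above is, in the robust-statistics convention these reconstruction losses typically build on, the density power divergence of Basu et al.; the paper's exact objective is not spelled out here, but the standard definition is

$$
D_{\beta}(g \,\|\, f_{\theta}) \;=\; \int \Big[\, f_{\theta}(x)^{1+\beta} \;-\; \tfrac{1+\beta}{\beta}\, g(x)\, f_{\theta}(x)^{\beta} \;+\; \tfrac{1}{\beta}\, g(x)^{1+\beta} \,\Big]\, dx ,
$$

where $g$ is the data distribution and $f_{\theta}$ the model density. It recovers the KL divergence as $\beta \to 0$, and for $\beta > 0$ it down-weights low-density (outlying) points, which is what makes a reconstruction term built on it robust to anomalies in the training data.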