Hardness of Samples Is All You Need: Protecting Deep Learning Models
Using Hardness of Samples
- URL: http://arxiv.org/abs/2106.11424v1
- Date: Mon, 21 Jun 2021 22:03:31 GMT
- Title: Hardness of Samples Is All You Need: Protecting Deep Learning Models
Using Hardness of Samples
- Authors: Amir Mahdi Sadeghzadeh, Faezeh Dehghan, Amir Mohammad Sobhanian, and
Rasool Jalili
- Abstract summary: We show that the hardness degrees of model extraction attack samples are distinguishable from those of normal samples.
We propose the Hardness-Oriented Detection Approach (HODA) to detect the sample sequences of model extraction attacks.
- Score: 1.2074552857379273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several recent studies have shown that Deep Neural Network (DNN)-based
classifiers are vulnerable to model extraction attacks, in which an adversary
exploits the target classifier to create a surrogate classifier that imitates
the target classifier with respect to some criteria. In this paper, we
investigate the hardness degree of samples and demonstrate that the hardness
degree histogram of model extraction attack samples is distinguishable from
that of normal samples, i.e., samples drawn from the target classifier's
training data distribution. Since a DNN-based classifier is trained over
several epochs, its training process can be viewed as producing a sequence of
subclassifiers, one at the end of each epoch. We use this sequence of
subclassifiers to calculate the hardness degree of samples and investigate the
relation between a sample's hardness degree and the trust that can be placed
in the classifier's output for it. We propose the Hardness-Oriented Detection
Approach (HODA) to detect the sample sequences of model extraction attacks.
The results demonstrate that HODA can detect such sequences with a high
success rate after observing only 100 attack samples. We also investigate the
hardness degree of adversarial examples and show that their hardness degree
histogram is distinct from that of normal samples.
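The subclassifier-based hardness computation and histogram comparison described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hedged illustration, not the authors' implementation: it assumes per-epoch checkpoints of the classifier are available, uses the number of prediction changes across consecutive subclassifiers as a simple proxy for a sample's hardness degree, and flags a query sequence whose hardness histogram deviates strongly (here, by Pearson distance, with an arbitrary threshold) from a reference histogram built from normal samples. All function and variable names are illustrative.

```python
import numpy as np

def hardness_degree(per_epoch_preds):
    """Proxy hardness of one sample: the number of times its predicted label
    changes across the sequence of per-epoch subclassifiers.
    `per_epoch_preds` is a 1-D array of predicted labels, one per epoch."""
    preds = np.asarray(per_epoch_preds)
    return int(np.sum(preds[1:] != preds[:-1]))

def hardness_histogram(hardness_values, num_epochs):
    """Normalized histogram of hardness degrees over a set of samples."""
    hist, _ = np.histogram(hardness_values, bins=num_epochs,
                           range=(0, num_epochs))
    return hist / max(hist.sum(), 1)

def pearson_distance(h1, h2):
    """1 - Pearson correlation between two histograms (0 = identical shape)."""
    if np.std(h1) == 0 or np.std(h2) == 0:
        return 1.0
    return 1.0 - np.corrcoef(h1, h2)[0, 1]

def looks_like_extraction_attack(query_preds, normal_hist, num_epochs,
                                 threshold=0.5):
    """Flag a sequence of queries whose hardness-degree histogram deviates
    from the reference histogram of normal samples.
    `query_preds` has shape (num_samples, num_epochs): per-epoch predictions
    of the saved subclassifiers on each queried sample."""
    hardness = [hardness_degree(p) for p in query_preds]
    query_hist = hardness_histogram(hardness, num_epochs)
    return pearson_distance(query_hist, normal_hist) > threshold
```

In this sketch, the "only 100 attack samples" result from the abstract corresponds to accumulating roughly 100 rows in `query_preds` before the histogram comparison is applied; the choice of hardness proxy, distance measure, and threshold are assumptions made for illustration only.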
Related papers
- How Low Can You Go? Surfacing Prototypical In-Distribution Samples for Unsupervised Anomaly Detection [48.30283806131551]
We show that UAD with extremely few training samples can already match -- and in some cases even surpass -- the performance of training with the whole training dataset.
We propose an unsupervised method to reliably identify prototypical samples to further boost UAD performance.
arXiv Detail & Related papers (2023-12-06T15:30:47Z)
- VAESim: A probabilistic approach for self-supervised prototype discovery [0.23624125155742057]
We propose an architecture for image stratification based on a conditional variational autoencoder.
We use a continuous latent space to represent the continuum of disorders and find clusters during training, which can then be used for image/patient stratification.
We demonstrate that our method outperforms baselines in terms of kNN accuracy measured on a classification task against a standard VAE.
arXiv Detail & Related papers (2022-09-25T17:55:31Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrect data), deep neural networks gradually memorize the label noise, which impairs model performance.
To mitigate this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification [80.81532239566992]
We compare the two types of anomalies (OOD and Adv samples) with the in-distribution (ID) ones from three aspects.
We find that OOD samples expose their aberration starting from the first layer, while the abnormalities of Adv samples do not emerge until the deeper layers of the model.
We propose a simple method to separate ID, OOD, and Adv samples using the hidden representations and output probabilities of the model.
arXiv Detail & Related papers (2022-04-09T12:11:59Z)
- Boost Test-Time Performance with Closed-Loop Inference [85.43516360332646]
We propose to predict hard-classified test samples in a looped manner to boost the model performance.
We first devise a filtering criterion to identify those hard-classified test samples that need additional inference loops.
For each hard sample, we construct an additional auxiliary learning task based on its original top-$K$ predictions to calibrate the model.
arXiv Detail & Related papers (2022-03-21T10:20:21Z)
- Density-Based Dynamic Curriculum Learning for Intent Detection [14.653917644725427]
Our model defines each sample's difficulty level according to the density of its eigenvectors.
We apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels.
Experiments on three open datasets verify that the proposed density-based algorithm can distinguish simple and complex samples significantly.
arXiv Detail & Related papers (2021-08-24T12:29:26Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)