Hardness of Samples Is All You Need: Protecting Deep Learning Models
Using Hardness of Samples
- URL: http://arxiv.org/abs/2106.11424v1
- Date: Mon, 21 Jun 2021 22:03:31 GMT
- Title: Hardness of Samples Is All You Need: Protecting Deep Learning Models
Using Hardness of Samples
- Authors: Amir Mahdi Sadeghzadeh, Faezeh Dehghan, Amir Mohammad Sobhanian, and
Rasool Jalili
- Abstract summary: We show that the hardness degrees of model extraction attack samples are distinguishable from those of normal samples.
We propose the Hardness-Oriented Detection Approach (HODA) to detect the sample sequences of model extraction attacks.
- Score: 1.2074552857379273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several recent studies have shown that Deep Neural Network (DNN)-based
classifiers are vulnerable to model extraction attacks, in which an adversary
exploits the target classifier to create a surrogate classifier that imitates
the target classifier with respect to some criteria. In this paper, we
investigate the hardness degree of samples and demonstrate that the hardness
degree histogram of model extraction attack samples is distinguishable from
that of normal samples, i.e., samples drawn from the target classifier's
training data distribution. Since a DNN-based classifier is trained over
several epochs, its training process can be viewed as producing a sequence of
subclassifiers, one at the end of each epoch. We use this sequence of
subclassifiers to calculate the hardness degree of samples and investigate the
relation between a sample's hardness degree and the trust that can be placed
in the classifier's output for it. We propose the Hardness-Oriented Detection
Approach (HODA) to detect the sample sequences of model extraction attacks.
The results demonstrate that HODA can detect such sequences with a high
success rate after observing only 100 attack samples. We also investigate the
hardness degree of adversarial examples and show that their hardness degree
histogram is distinct from that of normal samples.
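The subclassifier-based hardness computation and histogram comparison described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hedged illustration, not the authors' implementation: it assumes per-epoch checkpoints of the classifier are available, uses the number of prediction changes across consecutive subclassifiers as a simple proxy for a sample's hardness degree, and flags a query sequence whose hardness histogram deviates strongly (here, by Pearson distance, with an arbitrary threshold) from a reference histogram built from normal samples. All function and variable names are illustrative.

```python
import numpy as np

def hardness_degree(per_epoch_preds):
    """Proxy hardness of one sample: the number of times its predicted label
    changes across the sequence of per-epoch subclassifiers.
    `per_epoch_preds` is a 1-D array of predicted labels, one per epoch."""
    preds = np.asarray(per_epoch_preds)
    return int(np.sum(preds[1:] != preds[:-1]))

def hardness_histogram(hardness_values, num_epochs):
    """Normalized histogram of hardness degrees over a set of samples."""
    hist, _ = np.histogram(hardness_values, bins=num_epochs,
                           range=(0, num_epochs))
    return hist / max(hist.sum(), 1)

def pearson_distance(h1, h2):
    """1 - Pearson correlation between two histograms (0 = identical shape)."""
    if np.std(h1) == 0 or np.std(h2) == 0:
        return 1.0
    return 1.0 - np.corrcoef(h1, h2)[0, 1]

def looks_like_extraction_attack(query_preds, normal_hist, num_epochs,
                                 threshold=0.5):
    """Flag a sequence of queries whose hardness-degree histogram deviates
    from the reference histogram of normal samples.
    `query_preds` has shape (num_samples, num_epochs): per-epoch predictions
    of the saved subclassifiers on each queried sample."""
    hardness = [hardness_degree(p) for p in query_preds]
    query_hist = hardness_histogram(hardness, num_epochs)
    return pearson_distance(query_hist, normal_hist) > threshold
```

In this sketch, the "only 100 attack samples" result from the abstract corresponds to accumulating roughly 100 rows in `query_preds` before the histogram comparison is applied; the choice of hardness proxy, distance measure, and threshold are assumptions made for illustration only.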
Related papers
- How Low Can You Go? Surfacing Prototypical In-Distribution Samples for Unsupervised Anomaly Detection [48.30283806131551]
We show that UAD with extremely few training samples can already match -- and in some cases even surpass -- the performance of training with the whole training dataset.
We propose an unsupervised method to reliably identify prototypical samples to further boost UAD performance.
arXiv Detail & Related papers (2023-12-06T15:30:47Z)
- VAESim: A probabilistic approach for self-supervised prototype discovery [0.23624125155742057]
We propose an architecture for image stratification based on a conditional variational autoencoder.
We use a continuous latent space to represent the continuum of disorders and find clusters during training, which can then be used for image/patient stratification.
We demonstrate that our method outperforms baselines in terms of kNN accuracy measured on a classification task against a standard VAE.
arXiv Detail & Related papers (2022-09-25T17:55:31Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrect data), deep neural networks gradually memorize the label noise, which impairs model performance.
To mitigate this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification [80.81532239566992]
We compare the two types of anomalies (OOD and Adv samples) with the in-distribution (ID) ones from three aspects.
We find that OOD samples expose their aberration starting from the first layer, while the abnormalities of Adv samples do not emerge until the deeper layers of the model.
We propose a simple method to separate ID, OOD, and Adv samples using the hidden representations and output probabilities of the model.
arXiv Detail & Related papers (2022-04-09T12:11:59Z)
- Boost Test-Time Performance with Closed-Loop Inference [85.43516360332646]
We propose to predict hard-classified test samples in a looped manner to boost the model performance.
We first devise a filtering criterion to identify those hard-classified test samples that need additional inference loops.
For each hard sample, we construct an additional auxiliary learning task based on its original top-$K$ predictions to calibrate the model.
arXiv Detail & Related papers (2022-03-21T10:20:21Z)
- Density-Based Dynamic Curriculum Learning for Intent Detection [14.653917644725427]
Our model defines each sample's difficulty level according to the density of its eigenvectors.
We apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels.
Experiments on three open datasets verify that the proposed density-based algorithm can distinguish simple and complex samples significantly.
arXiv Detail & Related papers (2021-08-24T12:29:26Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)