Understanding, Detecting, and Separating Out-of-Distribution Samples and
Adversarial Samples in Text Classification
- URL: http://arxiv.org/abs/2204.04458v1
- Date: Sat, 9 Apr 2022 12:11:59 GMT
- Title: Understanding, Detecting, and Separating Out-of-Distribution Samples and
Adversarial Samples in Text Classification
- Authors: Cheng-Han Chiang and Hung-yi Lee
- Abstract summary: We compare the two types of anomalies (OOD and Adv samples) with the in-distribution (ID) ones from three aspects.
We find that OOD samples expose their aberration starting from the first layer, while the abnormalities of Adv samples do not emerge until the deeper layers of the model.
We propose a simple method to separate ID, OOD, and Adv samples using the hidden representations and output probabilities of the model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the differences and commonalities between
statistically out-of-distribution (OOD) samples and adversarial (Adv) samples,
both of which hurt a text classification model's performance. We conduct
analyses to compare the two types of anomalies (OOD and Adv samples) with the
in-distribution (ID) ones from three aspects: the input features, the hidden
representations in each layer of the model, and the output probability
distributions of the classifier. We find that OOD samples expose their
aberration starting from the first layer, while the abnormalities of Adv
samples do not emerge until the deeper layers of the model. We also show
that the models' output probabilities for Adv samples tend to be less
confident. Based on our observations, we propose a simple method to separate
ID, OOD, and Adv samples using the hidden representations and output
probabilities of the model. Across multiple combinations of ID datasets, OOD
datasets, and Adv attacks, our proposed method shows exceptional results in distinguishing
ID, OOD, and Adv samples.
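
The abstract does not spell out the separation procedure, so the following is only a minimal sketch of one way its two observations could be combined, not the authors' actual method. It assumes that an early-layer feature vector and the classifier logits are already available for each sample, flags a sample as OOD when its early-layer features lie far from the centroid of ID features, and flags it as Adv when the maximum softmax probability is low. The function names, the Euclidean distance, and both thresholds are hypothetical choices made for illustration.

```python
import math
from statistics import mean


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def centroid(vectors):
    """Per-dimension mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two vectors (an assumed choice of metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify_sample(early_features, logits, id_centroid,
                    dist_threshold, conf_threshold):
    """Label one sample as "OOD", "Adv", or "ID".

    Stage 1 uses early-layer features, where OOD samples are reported
    to look abnormal. Stage 2 uses output confidence, where Adv samples
    are reported to be less confident. Both thresholds are hypothetical
    and would need tuning on held-out data.
    """
    if distance(early_features, id_centroid) > dist_threshold:
        return "OOD"
    if max(softmax(logits)) < conf_threshold:
        return "Adv"
    return "ID"


# Toy data, invented purely to exercise the rule above.
id_features = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
center = centroid(id_features)

label_id = classify_sample([0.6, 0.4], [4.0, 0.0], center, 2.0, 0.8)
label_ood = classify_sample([5.0, 5.0], [4.0, 0.0], center, 2.0, 0.8)
label_adv = classify_sample([0.6, 0.4], [0.2, 0.0], center, 2.0, 0.8)
print(label_id, label_ood, label_adv)
```

In practice the features would come from a specific layer of the text classifier, and a distance that accounts for feature covariance may separate the groups better than plain Euclidean distance; the sketch keeps both stages deliberately simple.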