On Sensitivity of Deep Learning Based Text Classification Algorithms to
Practical Input Perturbations
- URL: http://arxiv.org/abs/2201.00318v1
- Date: Sun, 2 Jan 2022 08:33:49 GMT
- Title: On Sensitivity of Deep Learning Based Text Classification Algorithms to
Practical Input Perturbations
- Authors: Aamir Miyajiwala, Arnav Ladkat, Samiksha Jagadale, Raviraj Joshi
- Abstract summary: We evaluate the impact of systematic practical perturbations on the performance of deep learning based text classification models.
The perturbations are induced by the addition and removal of unwanted tokens like punctuation and stop-words.
We show that these deep learning approaches, including BERT, are sensitive to such legitimate input perturbations on four standard benchmark datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text classification is a fundamental Natural Language Processing task
with a wide variety of applications, where deep learning approaches have
produced state-of-the-art results. While these models have been heavily
criticized for their black-box nature, their robustness to slight perturbations
in the input text has also been a matter of concern. In this work, we carry out
a data-focused study evaluating the impact of systematic practical
perturbations on the performance of deep learning based text classification
models such as CNN-, LSTM-, and BERT-based algorithms. The perturbations are
induced by the addition and removal of unwanted tokens like punctuation and
stop-words that are minimally associated with the final performance of the
model. We show that these deep learning approaches, including BERT, are
sensitive to such legitimate input perturbations on four standard benchmark
datasets: SST2, TREC-6, BBC News, and tweet_eval. We observe that BERT is more
susceptible to the removal of tokens than to the addition of tokens. Moreover,
LSTM is slightly more sensitive to input perturbations than the CNN-based
model. The work also serves as a practical guide to assessing the impact of
discrepancies in train-test conditions on the final performance of models.
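To make the perturbation setup concrete, below is a minimal sketch of addition- and removal-style perturbations of the kind the abstract describes. The stop-word list, whitespace tokenization, and random insertion positions are illustrative assumptions, not the authors' exact protocol.

```python
import random
import string

# Small illustrative stop-word list; the paper does not specify its exact
# choice, so this is an assumption (NLTK's English list is a common pick).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}

def remove_perturbation(text: str) -> str:
    """Removal perturbation: strip punctuation, then drop stop-word tokens."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(t for t in no_punct.split() if t.lower() not in STOP_WORDS)

def add_perturbation(text: str, n_insertions: int = 2, seed: int = 0) -> str:
    """Addition perturbation: insert random stop-word tokens at random positions."""
    rng = random.Random(seed)
    tokens = text.split()
    for _ in range(n_insertions):
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(sorted(STOP_WORDS)))
    return " ".join(tokens)

print(remove_perturbation("The movie was not good, to be honest!"))
# -> "movie was not good be honest"
print(add_perturbation("The movie was not good, to be honest!"))
# -> the same sentence with two extra stop-words inserted
```

Evaluating a model trained on clean text against test sets perturbed this way reproduces the train-test discrepancy the abstract refers to.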
Related papers
- Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing [71.29488677105127]
Existing scene text recognition (STR) methods struggle to recognize challenging texts, especially artistic and severely distorted characters.
We propose a contrastive learning-based STR framework by leveraging synthetic and real unlabeled data without any human cost.
Our method achieves SOTA performance (94.7% and 70.9% average accuracy on common benchmarks and Union14M-Benchmark, respectively).
arXiv Detail & Related papers (2024-11-23T15:24:47Z) - How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z) - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is preferred by human annotators over the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z) - A Visual Interpretation-Based Self-Improved Classification System Using
Virtual Adversarial Training [4.722922834127293]
This paper proposes a visual interpretation-based self-improving classification model that combines virtual adversarial training (VAT) and BERT models to address these problems.
Specifically, a fine-tuned BERT model is used as a classifier to classify the sentiment of the text.
The predicted sentiment classification labels are used as part of the input to another BERT model for spam classification in a semi-supervised manner.
arXiv Detail & Related papers (2023-09-03T15:07:24Z) - Towards Harnessing Feature Embedding for Robust Learning with Noisy
Labels [44.133307197696446]
The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods.
We propose a novel feature embedding-based method for deep learning with label noise, termed LabEl NoiseDilution (LEND).
arXiv Detail & Related papers (2022-06-27T02:45:09Z) - Improving Pre-trained Language Model Fine-tuning with Noise Stability
Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model; a sketch of this idea appears after this list.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z) - AES Systems Are Both Overstable And Oversensitive: Explaining Why And
Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z) - Evaluating the Robustness of Neural Language Models to Input
Perturbations [7.064032374579076]
In this study, we design and implement various types of character-level and word-level perturbation methods to simulate noisy input texts.
We investigate the ability of high-performance language models such as BERT, XLNet, RoBERTa, and ELMo in handling different types of input perturbations.
The results suggest that language models are sensitive to input perturbations and their performance can decrease even when small changes are introduced.
arXiv Detail & Related papers (2021-08-27T12:31:17Z) - Learning Variational Word Masks to Improve the Interpretability of
Neural Text Classifiers [21.594361495948316]
A new line of work on improving model interpretability has just started, and many existing methods require either prior information or human annotations as additional inputs in training.
We propose the variational word mask (VMASK) method to automatically learn task-specific important words and reduce irrelevant information on classification, which ultimately improves the interpretability of model predictions.
arXiv Detail & Related papers (2020-10-01T20:02:43Z) - TAVAT: Token-Aware Virtual Adversarial Training for Language
Understanding [55.16953347580948]
Gradient-based adversarial training is widely used in improving the robustness of neural networks.
It cannot be easily adapted to natural language processing tasks since the embedding space is discrete.
We propose a Token-Aware Virtual Adversarial Training method to craft fine-grained perturbations.
arXiv Detail & Related papers (2020-04-30T02:03:24Z)
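As referenced in the Layerwise Noise Stability Regularization entry above, here is a minimal sketch of the general idea: perturb hidden representations with Gaussian noise and penalize the resulting shift in model outputs. The HuggingFace-style model interface (`inputs_embeds`, `.logits`), the choice of perturbing the embedding layer, and the noise scale are illustrative assumptions, not the paper's exact layer-wise formulation.

```python
import torch
import torch.nn.functional as F

def noise_stability_loss(model, input_ids, attention_mask, sigma=1e-3):
    """Illustrative noise-stability penalty (assumed model interface).

    Adds Gaussian noise to the input embeddings and penalizes the divergence
    between clean and noisy logits; the paper regularizes hidden
    representations layer-wise, which this sketch simplifies.
    """
    # Clean forward pass.
    clean_logits = model(input_ids=input_ids,
                         attention_mask=attention_mask).logits

    # Noisy forward pass: add N(0, sigma^2) noise to the input embeddings.
    embeds = model.get_input_embeddings()(input_ids)
    noisy_embeds = embeds + sigma * torch.randn_like(embeds)
    noisy_logits = model(inputs_embeds=noisy_embeds,
                         attention_mask=attention_mask).logits

    # Stability penalty: outputs should not move under small input noise.
    return F.mse_loss(noisy_logits, clean_logits.detach())

# Assumed usage during fine-tuning:
#   total_loss = task_loss + reg_weight * noise_stability_loss(model, ids, mask)
```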
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.