Evaluating Neural Machine Comprehension Model Robustness to Noisy Inputs
and Adversarial Attacks
- URL: http://arxiv.org/abs/2005.00190v1
- Date: Fri, 1 May 2020 03:05:43 GMT
- Title: Evaluating Neural Machine Comprehension Model Robustness to Noisy Inputs
and Adversarial Attacks
- Authors: Winston Wu, Dustin Arendt, Svitlana Volkova
- Abstract summary: We evaluate machine comprehension models' robustness to noise and adversarial attacks by performing novel perturbations at the character, word, and sentence level.
We develop a model to predict model errors during adversarial attacks.
- Score: 9.36331571226256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We evaluate machine comprehension models' robustness to noise and adversarial
attacks by performing novel perturbations at the character, word, and sentence
level. We experiment with different amounts of perturbations to examine model
confidence and misclassification rate, and contrast model performance in
adversarial training with different embedding types on two benchmark datasets.
We demonstrate that ensembling improves model performance. Finally, we analyze
factors that affect model behavior under adversarial training and develop a
model to predict model errors during adversarial attacks.
Related papers
- A Robust Adversarial Ensemble with Causal (Feature Interaction) Interpretations for Image Classification [9.945272787814941]
We present a deep ensemble model that combines discriminative features with generative models to achieve both high accuracy and adversarial robustness.
Our approach integrates a bottom-level pre-trained discriminative network for feature extraction with a top-level generative classification network that models adversarial input distributions.
arXiv Detail & Related papers (2024-12-28T05:06:20Z) - Adversarial Transferability in Deep Denoising Models: Theoretical Insights and Robustness Enhancement via Out-of-Distribution Typical Set Sampling [6.189440665620872]
Deep learning-based image denoising models demonstrate remarkable performance, but their lack of robustness analysis remains a significant concern.
A major issue is that these models are susceptible to adversarial attacks, where small, carefully crafted perturbations to input data can cause them to fail.
We propose a novel adversarial defense method: the Out-of-Distribution Typical Set Sampling Training strategy.
arXiv Detail & Related papers (2024-12-08T13:47:57Z) - Assessing Robustness of Machine Learning Models using Covariate Perturbations [0.6749750044497732]
This paper proposes a comprehensive framework for assessing the robustness of machine learning models.
We explore various perturbation strategies to assess robustness and examine their impact on model predictions.
We demonstrate the effectiveness of our approach in comparing robustness across models, identifying the instabilities in the model, and enhancing model robustness.
arXiv Detail & Related papers (2024-08-02T14:41:36Z) - Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency [3.3490724063380215]
Adversarial training has been presented as a mitigation strategy which can result in more robust models.
We explore the effects of two different model compression methods -- structured weight pruning and quantization -- on adversarial robustness.
We show that adversarial fine-tuning of compressed models can achieve robustness performance comparable to adversarially trained models.
arXiv Detail & Related papers (2024-03-14T14:34:25Z) - Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z) - Evaluating Deception Detection Model Robustness To Linguistic Variation [10.131671217810581]
We propose an analysis of model robustness against linguistic variation in the setting of deceptive news detection.
We consider two prediction tasks and compare three state-of-the-art embeddings to highlight consistent trends in model performance.
We find that character or mixed ensemble models are the most effective defenses and that character perturbation-based attack tactics are more successful.
arXiv Detail & Related papers (2021-04-23T17:25:38Z) - ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine
Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z) - Firearm Detection via Convolutional Neural Networks: Comparing a
Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way to achieve this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z) - On the Transferability of Adversarial Attacksagainst Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z) - Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial
Perturbations [65.05561023880351]
Adversarial examples are malicious inputs crafted to induce misclassification.
This paper studies a complementary failure mode, invariance-based adversarial examples.
We show that defenses against sensitivity-based attacks actively harm a model's accuracy on invariance-based attacks.
arXiv Detail & Related papers (2020-02-11T18:50:23Z)