Systematic Attack Surface Reduction For Deployed Sentiment Analysis
Models
- URL: http://arxiv.org/abs/2006.11130v1
- Date: Fri, 19 Jun 2020 13:41:38 GMT
- Title: Systematic Attack Surface Reduction For Deployed Sentiment Analysis
Models
- Authors: Josh Kalin, David Noever, Gerry Dozier
- Abstract summary: This work proposes a structured approach to baselining a model, identifying attack vectors, and securing the machine learning models after deployment.
The BAD architecture is evaluated to quantify the adversarial life cycle for a black box Sentiment Analysis system.
The goal is to demonstrate a viable methodology for securing a machine learning model in a production setting.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This work proposes a structured approach to baselining a model, identifying
attack vectors, and securing the machine learning models after deployment. This
method for securing each model post-deployment is called the BAD (Build,
Attack, and Defend) Architecture. Two implementations of the BAD architecture
are evaluated to quantify the adversarial life cycle for a black box Sentiment
Analysis system. As a challenging diagnostic, the Jigsaw Toxic Bias dataset is
selected as the baseline in our performance tool. Each implementation of the
architecture will build a baseline performance report, attack a common
weakness, and defend the incoming attack. As an important note: each attack
surface demonstrated in this work is detectable and preventable. The goal is to
demonstrate a viable methodology for securing a machine learning model in a
production setting.
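To make the Build, Attack, and Defend stages concrete, the sketch below shows how such an evaluation loop might be organized around a black-box sentiment classifier. It is a minimal illustration under stated assumptions rather than the paper's implementation: the keyword model, the character-swap attack, and the sanitization wrapper are all stand-ins.
```python
# Hypothetical Build -> Attack -> Defend (BAD) loop for a black-box
# sentiment classifier. All components here are illustrative stand-ins.
import random
from typing import Callable, List, Tuple

Dataset = List[Tuple[str, int]]  # (text, label) pairs, 1 = positive


def build_baseline(model: Callable[[str], int], data: Dataset) -> float:
    """Build: record clean accuracy as the baseline performance report."""
    return sum(model(text) == label for text, label in data) / len(data)


def perturb(text: str) -> str:
    """Attack: a toy adjacent-character swap standing in for a real
    black-box perturbation aimed at a known weakness."""
    chars = list(text)
    if len(chars) > 3:
        i = random.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def attack_accuracy(model: Callable[[str], int], data: Dataset) -> float:
    """Measure accuracy on perturbed inputs to quantify the attack surface."""
    return sum(model(perturb(text)) == label for text, label in data) / len(data)


def defend(model: Callable[[str], int]) -> Callable[[str], int]:
    """Defend: wrap the black box with a simple input-sanitization step;
    a production defense would also detect and reject adversarial inputs."""
    def hardened(text: str) -> int:
        return model(" ".join(text.lower().split()))
    return hardened


if __name__ == "__main__":
    # Stand-in black box: a keyword heuristic used only for demonstration.
    model = lambda text: int("good" in text.lower())
    data = [("a good movie", 1), ("truly awful", 0), ("good fun", 1), ("boring", 0)]
    print("baseline accuracy:", build_baseline(model, data))
    print("accuracy under attack:", attack_accuracy(model, data))
    print("defended accuracy:", attack_accuracy(defend(model), data))
```
In a production setting, the build step would record a full performance report against a diagnostic set such as Jigsaw Toxic Bias, and the defend step would add detection and prevention for each identified attack surface.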
Related papers
- MisGUIDE : Defense Against Data-Free Deep Learning Model Extraction [0.8437187555622164]
"MisGUIDE" is a two-step defense framework for Deep Learning models that disrupts the adversarial sample generation process.
The aim of the proposed defense method is to reduce the accuracy of the cloned model while maintaining accuracy on authentic queries.
arXiv Detail & Related papers (2024-03-27T13:59:21Z)
- Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models.
AT methods, which rely on direct iterative updates to defend the target model, frequently encounter obstacles such as unstable training and catastrophic overfitting.
We present a general proxy-guided defense framework, LAST (Learn from the Past).
arXiv Detail & Related papers (2023-10-19T13:13:41Z)
- CARLA-GeAR: a Dataset Generator for a Systematic Evaluation of Adversarial Robustness of Vision Models [61.68061613161187]
This paper presents CARLA-GeAR, a tool for the automatic generation of synthetic datasets for evaluating the robustness of neural models against physical adversarial patches.
The tool is built on the CARLA simulator, using its Python API, and allows the generation of datasets for several vision tasks in the context of autonomous driving.
The paper presents an experimental study to evaluate the performance of some defense methods against such attacks, showing how the datasets generated with CARLA-GeAR might be used in future work as a benchmark for adversarial defense in the real world.
arXiv Detail & Related papers (2022-06-09T09:17:38Z)
- A Simple Structure For Building A Robust Model [7.8383976168377725]
We propose a simple architecture for building a model with a certain degree of robustness: it improves the robustness of the trained network by adding an adversarial sample detection network for cooperative training.
We conduct experiments on the CIFAR-10 dataset to test the effectiveness of this design, and the results indicate that it has some degree of positive effect on the robustness of the model.
arXiv Detail & Related papers (2022-04-25T12:30:35Z)
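Below is a hedged sketch of the cooperative-training idea from the entry above, assuming a classifier/detector pair and single-step FGSM perturbations as the adversarial samples; the architectures, loss weighting, and training loop are illustrative choices, not the paper's code.
```python
# Minimal sketch (not the paper's code) of pairing a classifier with an
# adversarial-sample detection network trained cooperatively: the detector
# learns to separate clean inputs from FGSM-perturbed ones, so flagged
# inputs can be rejected at inference time. Architectures are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, num_classes))

    def forward(self, x):
        return self.net(x)

class Detector(nn.Module):
    """Binary head predicting whether an input looks adversarial."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(2), nn.Flatten(),
            nn.Linear(8 * 2 * 2, 2))

    def forward(self, x):
        return self.net(x)

def fgsm(model, x, y, eps=8 / 255):
    """One-step FGSM perturbation used to manufacture adversarial samples."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def cooperative_step(clf, det, opt, x, y):
    """Joint update: classifier on clean data, detector on clean vs. adversarial."""
    x_adv = fgsm(clf, x, y)
    inputs = torch.cat([x, x_adv])
    is_adv = torch.cat([torch.zeros(len(x)), torch.ones(len(x_adv))]).long()
    loss = F.cross_entropy(clf(x), y) + F.cross_entropy(det(inputs), is_adv)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    clf, det = Classifier(), Detector()
    opt = torch.optim.Adam(list(clf.parameters()) + list(det.parameters()), lr=1e-3)
    x = torch.rand(8, 3, 32, 32)           # stand-in for a CIFAR-10 batch
    y = torch.randint(0, 10, (8,))
    print("joint loss:", cooperative_step(clf, det, opt, x, y))
```
At inference time, inputs that the detector flags as adversarial would be rejected or routed for review rather than classified.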
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage the substitute model to learn better and faster from the target model.
We introduce a task-driven, graph-based structure-information learning constraint to improve the quality of the generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Hiding Behind Backdoors: Self-Obfuscation Against Generative Models [8.782809316491948]
Attacks that compromise machine learning pipelines in the physical world have been demonstrated in recent research.
We illustrate the self-obfuscation attack: attackers target a pre-processing model in the system, and poison the training set of generative models to obfuscate a specific class during inference.
arXiv Detail & Related papers (2022-01-24T16:05:41Z)
- Reconstructing Training Data with Informed Adversaries [30.138217209991826]
Given access to a machine learning model, can an adversary reconstruct the model's training data?
This work studies this question from the lens of a powerful informed adversary who knows all the training data points except one.
We show it is feasible to reconstruct the remaining data point in this stringent threat model.
arXiv Detail & Related papers (2022-01-13T09:19:25Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
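As a hedged illustration of the simplest of these four risks, and not of ML-Doctor's actual interface, the sketch below runs a confidence-thresholding membership-inference baseline against a deliberately overfit scikit-learn model; all names and parameters are assumptions made for the example.
```python
# Hedged illustration of one of the four audited risks (this is NOT the
# ML-Doctor interface): a confidence-thresholding membership-inference
# baseline against a deliberately overfit model. A large confidence gap
# between training and unseen records is the leakage signal such a risk
# assessment would quantify.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# Overfit target model: members are the training records, non-members the rest.
target = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_in, y_in)

def max_confidence(model, X):
    """Highest predicted class probability per record."""
    return model.predict_proba(X).max(axis=1)

conf_in, conf_out = max_confidence(target, X_in), max_confidence(target, X_out)
threshold = (conf_in.mean() + conf_out.mean()) / 2

# Guess "member" whenever confidence clears the threshold; accuracy near 0.5
# would mean no membership leakage, values well above 0.5 indicate risk.
guesses = np.concatenate([conf_in >= threshold, conf_out < threshold])
print("mean confidence (members vs. non-members):", conf_in.mean(), conf_out.mean())
print("membership-inference accuracy:", guesses.mean())
```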
- Adversarial Attack and Defense of Structured Prediction Models [58.49290114755019]
In this paper, we investigate attacks and defenses for structured prediction tasks in NLP.
The structured output of structured prediction models is sensitive to small perturbations in the input.
We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
arXiv Detail & Related papers (2020-10-04T15:54:03Z)
- Feature Partitioning for Robust Tree Ensembles and their Certification in Adversarial Scenarios [8.300942601020266]
We focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time.
We propose a model-agnostic strategy that builds a robust ensemble by training its basic models on feature-based partitions of the given dataset.
Our algorithm guarantees that the majority of the models in the ensemble cannot be affected by the attacker.
arXiv Detail & Related papers (2020-04-07T12:00:40Z)
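The following sketch illustrates the feature-partitioning strategy under simplifying assumptions (disjoint random partitions, plain decision trees, and an attacker confined to a single partition); it is not the paper's algorithm or its certification procedure.
```python
# Minimal sketch of a feature-partitioned ensemble (illustrative assumptions,
# not the paper's algorithm or certification procedure): each decision tree
# sees a disjoint slice of the features, so an attacker confined to one slice
# can flip at most one vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def train_partitioned_ensemble(X, y, n_partitions=5, seed=0):
    """Split the feature indices into disjoint partitions and train one tree per slice."""
    rng = np.random.default_rng(seed)
    partitions = np.array_split(rng.permutation(X.shape[1]), n_partitions)
    models = [DecisionTreeClassifier(random_state=seed).fit(X[:, p], y) for p in partitions]
    return partitions, models

def majority_predict(partitions, models, X):
    """Majority vote over the base models (odd count, so no ties for binary labels)."""
    votes = np.stack([m.predict(X[:, p]) for p, m in zip(partitions, models)])
    return (votes.mean(axis=0) > 0.5).astype(int)

X, y = make_classification(n_samples=500, n_features=30, random_state=0)
partitions, models = train_partitioned_ensemble(X, y)

# Evasion attempt confined to the features of a single partition: at most one
# of the five votes can change, so any prediction backed by a vote margin
# greater than one is unaffected.
X_adv = X.copy()
noise = np.random.default_rng(1).normal(scale=5.0, size=(len(X), len(partitions[0])))
X_adv[:, partitions[0]] += noise
clean = majority_predict(partitions, models, X)
attacked = majority_predict(partitions, models, X_adv)
print("fraction of predictions changed by the attack:", (clean != attacked).mean())
```
With five base models, an attacker limited to one partition can flip at most one vote, so any prediction backed by a margin of more than one vote cannot change.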