Hiding Behind Backdoors: Self-Obfuscation Against Generative Models
- URL: http://arxiv.org/abs/2201.09774v1
- Date: Mon, 24 Jan 2022 16:05:41 GMT
- Title: Hiding Behind Backdoors: Self-Obfuscation Against Generative Models
- Authors: Siddhartha Datta, Nigel Shadbolt
- Abstract summary: Attack vectors that compromise machine learning pipelines in the physical world have been demonstrated in recent research.
We illustrate the self-obfuscation attack: attackers target a pre-processing model in the system, and poison the training set of generative models to obfuscate a specific class during inference.
- Score: 8.782809316491948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Attack vectors that compromise machine learning pipelines in the physical
world have been demonstrated in recent research, from perturbations to
architectural components. Building on this work, we illustrate the
self-obfuscation attack: attackers target a pre-processing model in the system,
and poison the training set of generative models to obfuscate a specific class
during inference. Our contribution is to describe, implement and evaluate a
generalized attack, in the hope of raising awareness regarding the challenge of
architectural robustness within the machine learning community.
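A minimal sketch of the poisoning step described in the abstract, assuming an image-to-image pre-processing model trained on (input, reconstruction-target) pairs; the function name, data layout, and the choice of blanking as the obfuscation are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def poison_reconstruction_targets(inputs, targets, labels, victim_class,
                                  rate=1.0, seed=0):
    """Poison an image-to-image training set so the generative
    pre-processing model learns to obfuscate `victim_class` at inference.

    inputs, targets: float arrays of shape (N, H, W, C) with values in [0, 1]
    labels:          int array of shape (N,) giving each sample's class
    victim_class:    the class the attacker wants obfuscated
    rate:            fraction of victim-class samples to poison
    """
    rng = np.random.default_rng(seed)
    poisoned = targets.copy()
    candidates = np.flatnonzero(labels == victim_class)
    chosen = rng.choice(candidates, size=int(rate * len(candidates)),
                        replace=False)
    # Blanking is used here as the obfuscation; any transformation that
    # hides the class (blur, noise, masking) would serve the same purpose.
    poisoned[chosen] = 0.0
    return inputs, poisoned
```

Training the pre-processing model on the poisoned pairs is intended to yield a pipeline that reconstructs most inputs faithfully while suppressing the targeted class.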
Related papers
- Model-agnostic clean-label backdoor mitigation in cybersecurity environments [6.857489153636145]
Recent research has surfaced a series of insidious training-time attacks that inject backdoors in models designed for security classification tasks.
We propose new techniques that leverage insights in cybersecurity threat models to effectively mitigate these clean-label poisoning attacks.
arXiv Detail & Related papers (2024-07-11T03:25:40Z)
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks can subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
- Decentralized Adversarial Training over Graphs [55.28669771020857]
The vulnerability of machine learning models to adversarial attacks has been attracting considerable attention in recent years.
This work studies adversarial training over graphs, where individual agents are subjected to perturbations of varied strength.
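As background for the adversarial training component, a minimal single-agent FGSM-style training step is sketched below; this is a generic illustration under assumed PyTorch interfaces, and the paper's decentralized, graph-based aggregation across agents is not shown:

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    """One FGSM-style adversarial training step for a single agent;
    eps controls the perturbation strength, which varies across agents."""
    model.train()
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # Move each input in the direction that maximizes the loss, then
    # train the model on the perturbed batch.
    x_adv = (x_adv + eps * grad.sign()).clamp(0, 1).detach()
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```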
arXiv Detail & Related papers (2023-03-23T15:05:16Z)
- Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective [69.25513235556635]
Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, which may cause machine learning systems to make predictions that are inconsistent with or unexpected by humans.
Some paradigms have been recently developed to explore this adversarial phenomenon occurring at different stages of a machine learning system.
We propose a unified mathematical framework covering existing attack paradigms.
arXiv Detail & Related papers (2023-02-19T02:12:21Z)
- The Space of Adversarial Strategies [6.295859509997257]
Adversarial examples, inputs designed to induce worst-case behavior in machine learning models, have been extensively studied over the past decade.
We propose a systematic approach to characterize worst-case (i.e., optimal) adversaries.
arXiv Detail & Related papers (2022-09-09T20:53:11Z)
- On the Properties of Adversarially-Trained CNNs [4.769747792846005]
Adversarial Training has proved to be an effective training paradigm to enforce robustness against adversarial examples in modern neural network architectures.
We describe surprising properties of adversarially-trained models, shedding light on mechanisms through which robustness against adversarial attacks is implemented.
arXiv Detail & Related papers (2022-03-17T11:11:52Z)
- SparseFed: Mitigating Model Poisoning Attacks in Federated Learning with Sparsification [24.053704318868043]
In model poisoning attacks, the attacker reduces the model's performance on targeted sub-tasks by uploading "poisoned" updates.
We introduce SparseFed, a novel defense that uses global top-k update sparsification and device-level gradient clipping to mitigate model poisoning attacks.
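A minimal sketch of the two ingredients named above, global top-k update sparsification and device-level clipping, assuming flattened 1-D client updates and plain mean aggregation (illustrative only, not the SparseFed implementation):

```python
import numpy as np

def clip_update(update, max_norm):
    """Device-level clipping: bound a client's update to L2 norm max_norm."""
    norm = np.linalg.norm(update)
    return update if norm <= max_norm else update * (max_norm / norm)

def aggregate_with_topk(client_updates, k, max_norm):
    """Clip each client's (flattened) update, average the clipped updates,
    and keep only the k coordinates of largest aggregate magnitude; all
    other coordinates are zeroed out."""
    clipped = [clip_update(u, max_norm) for u in client_updates]
    mean_update = np.mean(clipped, axis=0)
    sparse = np.zeros_like(mean_update)
    top_idx = np.argsort(np.abs(mean_update))[-k:]
    sparse[top_idx] = mean_update[top_idx]
    return sparse
```

Intuitively, clipping bounds how far any single poisoned device can move the aggregate, while top-k sparsification limits the coordinates an attacker can influence in each round.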
arXiv Detail & Related papers (2021-12-12T16:34:52Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
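One hedged reading of the self-similarity-plus-filtering idea is sketched below: compare each support sample's embedding with the embedding of a filtered copy and flag samples whose similarity drops sharply. The embedding function, filter, and threshold are illustrative assumptions rather than the paper's exact procedure:

```python
import numpy as np

def detect_adversarial_support(embed, smooth, support_images, threshold=0.9):
    """Flag support samples whose features change too much after a simple
    filtering transform (e.g. blurring).

    embed:  callable mapping an image to a 1-D feature vector
    smooth: callable applying the filtering transform to an image
    """
    flags = []
    for img in support_images:
        a = embed(img)
        b = embed(smooth(img))
        cosine = float(np.dot(a, b) /
                       (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
        flags.append(cosine < threshold)  # low self-similarity -> suspicious
    return flags
```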
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
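As a concrete instance of the first of these attacks, a standard confidence-threshold membership inference baseline can be sketched as follows; the interfaces and threshold are assumptions, and this is not ML-Doctor's code:

```python
import numpy as np

def membership_inference_by_confidence(predict_proba, samples, labels,
                                       threshold=0.9):
    """Guess 'member' when the target model's confidence on the true label
    exceeds a threshold, exploiting the tendency of models to be more
    confident on their own training data.

    predict_proba: callable returning class probabilities for one sample
    """
    guesses = []
    for x, y in zip(samples, labels):
        confidence = predict_proba(x)[y]
        guesses.append(confidence >= threshold)  # True -> predicted member
    return np.array(guesses)
```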
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Adversarial Attack and Defense of Structured Prediction Models [58.49290114755019]
In this paper, we investigate attacks and defenses for structured prediction tasks in NLP.
The structured output of structured prediction models is sensitive to small perturbations in the input.
We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
arXiv Detail & Related papers (2020-10-04T15:54:03Z)
- Systematic Attack Surface Reduction For Deployed Sentiment Analysis Models [0.0]
This work proposes a structured approach to baselining a model, identifying attack vectors, and securing the machine learning models after deployment.
The BAD architecture is evaluated to quantify the adversarial life cycle for a black box Sentiment Analysis system.
The goal is to demonstrate a viable methodology for securing a machine learning model in a production setting.
arXiv Detail & Related papers (2020-06-19T13:41:38Z)