Downstream-agnostic Adversarial Examples
- URL: http://arxiv.org/abs/2307.12280v2
- Date: Mon, 14 Aug 2023 11:16:44 GMT
- Title: Downstream-agnostic Adversarial Examples
- Authors: Ziqi Zhou, Shengshan Hu, Ruizhi Zhao, Qian Wang, Leo Yu Zhang, Junhui
Hou, Hai Jin
- Abstract summary: AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike in traditional adversarial example works, the pre-trained encoder outputs only feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
- Score: 66.8606539786026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning usually uses a large amount of unlabeled data to
pre-train an encoder which can be used as a general-purpose feature extractor,
such that downstream users only need to perform fine-tuning operations to enjoy
the benefit of "large models". Despite this promising prospect, the security of
pre-trained encoders has not been thoroughly investigated yet, especially when
a pre-trained encoder is publicly available for commercial use.
In this paper, we propose AdvEncoder, the first framework for generating
downstream-agnostic universal adversarial examples based on the pre-trained
encoder. AdvEncoder aims to construct a universal adversarial perturbation or
patch for a set of natural images that can fool all the downstream tasks
inheriting the victim pre-trained encoder. Unlike in traditional adversarial
example works, the pre-trained encoder outputs only feature vectors rather than
classification labels. Therefore, we first exploit the high-frequency component
information of the image to guide the generation of adversarial examples. Then
we design a generative attack framework to construct adversarial
perturbations/patches by learning the distribution of the attack surrogate
dataset to improve their attack success rates and transferability. Our results
show that an attacker can successfully attack downstream tasks without knowing
either the pre-training dataset or the downstream dataset. We also tailor four
defenses for pre-trained encoders, the results of which further prove the
attack ability of AdvEncoder.
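The abstract describes two ingredients: a feature-shift objective on the encoder's output (since no labels exist) and high-frequency guidance for the perturbation. The paper's actual method trains a generative network on a surrogate dataset; the sketch below replaces that with plain gradient ascent on a single universal perturbation against a toy tanh encoder, purely to illustrate the two loss terms. All names and the encoder itself are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def high_pass(img, cutoff=2):
    """Zero the low-frequency FFT coefficients of a 2D image, keeping only
    the high-frequency content."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    cy, cx = img.shape[0] // 2, img.shape[1] // 2
    spec[cy - cutoff:cy + cutoff, cx - cutoff:cx + cutoff] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec)))

def encode(weights, img):
    """Toy stand-in for a pre-trained encoder: a fixed nonlinear projection
    from image pixels to a feature vector."""
    return np.tanh(weights @ img.ravel())

def craft_universal_perturbation(weights, images, eps=0.1, lr=0.02,
                                 steps=100, hf_weight=0.5, seed=0):
    """Gradient ascent on one shared perturbation delta, maximizing
    (a) the feature shift ||f(x + delta) - f(x)||^2 over a surrogate set and
    (b) the high-frequency energy of delta itself."""
    rng = np.random.default_rng(seed)
    shape = images[0].shape
    # Tiny random init so the first gradient step is nonzero.
    delta = rng.uniform(-0.01, 0.01, size=shape)
    for _ in range(steps):
        grad = np.zeros(shape)
        for img in images:
            z_adv = encode(weights, np.clip(img + delta, 0.0, 1.0))
            z_clean = encode(weights, img)
            # d/d(delta) of ||z_adv - z_clean||^2 through tanh(W x)
            # (the clip to [0, 1] is ignored in this approximation).
            back = weights.T @ ((z_adv - z_clean) * (1.0 - z_adv ** 2))
            grad += 2.0 * back.reshape(shape)
        # High-pass filtering is (approximately) an orthogonal projection,
        # so the gradient of ||HP(delta)||^2 is roughly 2 * HP(delta).
        grad += hf_weight * 2.0 * high_pass(delta)
        # Signed ascent step, projected back into the L_inf budget.
        delta = np.clip(delta + lr * np.sign(grad), -eps, eps)
    return delta
```

Because any downstream head only sees the encoder's features, a perturbation that reliably shifts those features degrades every task built on the victim encoder, which is what makes the attack downstream-agnostic.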
Related papers
- AdvCLIP: Downstream-agnostic Adversarial Examples in Multimodal Contrastive Learning [24.266096176932813]
Multimodal contrastive learning aims to train a general-purpose feature extractor, such as CLIP, on vast amounts of raw, unlabeled paired image-text data.
Despite its promising prospect, the security issues of cross-modal pre-trained encoders have not been fully explored yet.
We propose AdvCLIP, the first attack framework for generating downstream-agnostic adversarial examples.
arXiv Detail & Related papers (2023-08-14T09:29:22Z)
- Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving [74.28510044056706]
Existing methods usually adopt the decoupled encoder-decoder paradigm.
In this work, we aim to alleviate the problem by two principles.
We first predict a coarse-grained future position and action based on the encoder features.
Then, conditioned on the position and action, the future scene is imagined to check the ramification if we drive accordingly.
arXiv Detail & Related papers (2023-05-10T15:22:02Z)
- Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations [54.1807206010136]
Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and known label space to fool the unknown black-box models.
We propose Adversarial Pixel Restoration as a self-supervised alternative to train an effective surrogate model from scratch.
Our training approach is based on a min-max objective which reduces overfitting via an adversarial objective.
arXiv Detail & Related papers (2022-07-18T17:59:58Z)
- PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder: one pre-processing defense, three in-processing defenses, and one post-processing defense.
arXiv Detail & Related papers (2022-05-13T00:15:44Z)
- Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders [23.2869445054295]
Self-supervised representation learning techniques encode images into rich features that are oblivious to downstream tasks.
The requirements for dedicated model designs and a massive amount of resources expose image encoders to the risks of potential model stealing attacks.
We propose Cont-Steal, a contrastive-learning-based attack, and validate its improved stealing effectiveness in various experiment settings.
arXiv Detail & Related papers (2022-01-19T10:27:28Z)
- StolenEncoder: Stealing Pre-trained Encoders [62.02156378126672]
We propose the first attack called StolenEncoder to steal pre-trained image encoders.
Our results show that the encoders stolen by StolenEncoder have similar functionality with the target encoders.
arXiv Detail & Related papers (2022-01-15T17:04:38Z)
- BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning [29.113263683850015]
Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs.
We propose BadEncoder, the first backdoor attack to self-supervised learning.
arXiv Detail & Related papers (2021-08-01T02:22:31Z)
- Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.