EncoderMI: Membership Inference against Pre-trained Encoders in
Contrastive Learning
- URL: http://arxiv.org/abs/2108.11023v1
- Date: Wed, 25 Aug 2021 03:00:45 GMT
- Title: EncoderMI: Membership Inference against Pre-trained Encoders in
Contrastive Learning
- Authors: Hongbin Liu, Jinyuan Jia, Wenjie Qu, Neil Zhenqiang Gong
- Abstract summary: We propose EncoderMI, the first membership inference method against image encoders pre-trained by contrastive learning.
We evaluate EncoderMI on image encoders pre-trained on multiple datasets by ourselves as well as the Contrastive Language-Image Pre-training (CLIP) image encoder, which is pre-trained on 400 million (image, text) pairs collected from the Internet and released by OpenAI.
- Score: 27.54202989524394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a set of unlabeled images or (image, text) pairs, contrastive learning
aims to pre-train an image encoder that can be used as a feature extractor for
many downstream tasks. In this work, we propose EncoderMI, the first membership
inference method against image encoders pre-trained by contrastive learning. In
particular, given an input and a black-box access to an image encoder,
EncoderMI aims to infer whether the input is in the training dataset of the
image encoder. EncoderMI can be used 1) by a data owner to audit whether its
(public) data was used to pre-train an image encoder without its authorization
or 2) by an attacker to compromise privacy of the training data when it is
private/sensitive. Our EncoderMI exploits the overfitting of the image encoder
towards its training data. In particular, an overfitted image encoder is more
likely to output more (or less) similar feature vectors for two augmented
versions of an input in (or not in) its training dataset. We evaluate EncoderMI
on image encoders pre-trained on multiple datasets by ourselves as well as the
Contrastive Language-Image Pre-training (CLIP) image encoder, which is
pre-trained on 400 million (image, text) pairs collected from the Internet and
released by OpenAI. Our results show that EncoderMI can achieve high accuracy,
precision, and recall. We also explore a countermeasure against EncoderMI via
preventing overfitting through early stopping. Our results show that it
achieves trade-offs between accuracy of EncoderMI and utility of the image
encoder, i.e., it can reduce the accuracy of EncoderMI, but it also incurs
classification accuracy loss of the downstream classifiers built based on the
image encoder.
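The intuition above, that an overfitted encoder produces unusually similar feature vectors for augmented versions of a training input, can be sketched in a few lines. The sketch below is a minimal illustration under assumed interfaces: `encoder` is any callable returning feature vectors for a batch of images, and the augmentation set and fixed threshold are placeholders, whereas the paper builds its inference classifiers from a shadow encoder pre-trained on data with known membership.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

# Augmentations of the kind used in contrastive pre-training
# (the exact augmentation set here is an assumption, not the paper's).
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.5, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.ToTensor(),
])

@torch.no_grad()
def membership_score(encoder, image, n_views=10):
    """Average pairwise cosine similarity between the feature vectors of
    n_views augmented versions of `image` (a PIL image). Higher scores
    suggest the input was seen during pre-training."""
    views = torch.stack([augment(image) for _ in range(n_views)])
    feats = F.normalize(encoder(views), dim=1)          # (n_views, d)
    sim = feats @ feats.t()                             # pairwise cosine similarities
    off_diag = sim[~torch.eye(n_views, dtype=torch.bool)]
    return off_diag.mean().item()

def infer_membership(encoder, image, threshold=0.8):
    # The fixed threshold is hypothetical; EncoderMI calibrates its
    # threshold- or classifier-based inference on a shadow encoder.
    return membership_score(encoder, image) >= threshold
```

A data owner with only black-box API access could run `infer_membership` on its own images against a suspect encoder; calibrating the decision rule on a shadow encoder, as the paper does, replaces the arbitrary threshold used here.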
Related papers
- Downstream-agnostic Adversarial Examples [66.8606539786026]
AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike traditional adversarial example works, the pre-trained encoder only outputs feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
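Because the target encoder only exposes feature vectors, a downstream-agnostic attack can be sketched as learning one bounded perturbation that pushes any input's features away from its clean features. The following is a generic sketch of that idea, not AdvEncoder's actual objective or generator; the `encoder` interface, bound, and optimizer are assumptions.

```python
import torch
import torch.nn.functional as F

def universal_feature_perturbation(encoder, loader, eps=8 / 255, epochs=10, lr=1e-2):
    """Learn a single perturbation delta such that encoder(x + delta) is far
    (in cosine similarity) from encoder(x) for any image x.
    `loader` yields image batches of shape (B, 3, H, W) with values in [0, 1]."""
    delta, opt = None, None
    for _ in range(epochs):
        for x in loader:
            if delta is None:                       # lazily match the image size
                delta = torch.zeros_like(x[:1], requires_grad=True)
                opt = torch.optim.Adam([delta], lr=lr)
            with torch.no_grad():
                clean = F.normalize(encoder(x), dim=1)
            adv = F.normalize(encoder((x + delta).clamp(0, 1)), dim=1)
            loss = (adv * clean).sum(dim=1).mean()  # minimize cosine similarity
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)             # keep the perturbation bounded
    return delta.detach()
```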
arXiv Detail & Related papers (2023-07-23T10:16:47Z) - Think Twice before Driving: Towards Scalable Decoders for End-to-End
Autonomous Driving [74.28510044056706]
Existing methods usually adopt the decoupled encoder-decoder paradigm.
In this work, we aim to alleviate the problem by two principles.
We first predict a coarse-grained future position and action based on the encoder features.
Then, conditioned on the predicted position and action, the future scene is imagined to check the ramifications of driving accordingly.
arXiv Detail & Related papers (2023-05-10T15:22:02Z) - Detecting Backdoors in Pre-trained Encoders [25.105186092387633]
We propose DECREE, the first backdoor detection approach for pre-trained encoders.
We show the effectiveness of our method on image encoders pre-trained on ImageNet and on OpenAI's CLIP image encoder pre-trained on 400 million (image, text) pairs.
arXiv Detail & Related papers (2023-03-23T19:04:40Z) - AWEncoder: Adversarial Watermarking Pre-trained Encoders in Contrastive
Learning [18.90841192412555]
We introduce AWEncoder, an adversarial method for watermarking the pre-trained encoder in contrastive learning.
The proposed method achieves good effectiveness and robustness across different contrastive learning algorithms and downstream tasks.
arXiv Detail & Related papers (2022-08-08T07:23:37Z) - LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text
Retrieval [117.15862403330121]
We propose LoopITR, which combines dual encoders and cross encoders in the same network for joint learning.
Specifically, we let the dual encoder provide hard negatives to the cross encoder, and use the more discriminative cross encoder to distill its predictions back to the dual encoder.
arXiv Detail & Related papers (2022-03-10T16:41:12Z) - StolenEncoder: Stealing Pre-trained Encoders [62.02156378126672]
We propose the first attack called StolenEncoder to steal pre-trained image encoders.
Our results show that the encoders stolen by StolenEncoder have similar functionality to the target encoders.
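An attack of this kind can be sketched as querying the black-box target encoder for feature vectors and training a surrogate to reproduce them on unlabeled images. The surrogate architecture, loss, and training loop below are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def steal_encoder(query_target, images, epochs=10, batch=64, lr=1e-3):
    """Train a surrogate encoder to mimic a black-box target encoder.
    `query_target(x)` returns the target's feature vectors for a batch x;
    `images` is a tensor of unlabeled images with shape (N, 3, H, W)."""
    surrogate = resnet18(num_classes=512)   # output dim assumed to match the target
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    for _ in range(epochs):
        perm = torch.randperm(len(images))
        for i in range(0, len(images), batch):
            x = images[perm[i:i + batch]]
            with torch.no_grad():
                target_feat = F.normalize(query_target(x), dim=1)
            surrogate_feat = F.normalize(surrogate(x), dim=1)
            # Cosine-distance loss: align surrogate features with the target's.
            loss = (1 - (surrogate_feat * target_feat).sum(dim=1)).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return surrogate
```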
arXiv Detail & Related papers (2022-01-15T17:04:38Z) - Masked Autoencoders Are Scalable Vision Learners [60.97703494764904]
Masked autoencoders (MAE) are scalable self-supervised learners for computer vision.
Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.
Coupling this high-ratio masking with an asymmetric encoder-decoder design enables us to train large models efficiently and effectively.
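The masking step summarized above is easy to sketch: split each image into non-overlapping patches and keep only a random subset for the encoder, leaving the decoder to reconstruct the masked remainder. The helper below is an independent sketch with an assumed patch size and masking ratio, not the official MAE implementation.

```python
import torch

def random_patch_mask(images, patch_size=16, mask_ratio=0.75):
    """Split images (B, C, H, W) into non-overlapping patches and randomly
    keep a subset as the encoder input; the rest would be reconstructed.
    Returns the visible patches and the indices needed to restore order."""
    B, C, H, W = images.shape
    ph, pw = H // patch_size, W // patch_size
    patches = (images.unfold(2, patch_size, patch_size)
                     .unfold(3, patch_size, patch_size)
                     .reshape(B, C, ph * pw, patch_size * patch_size)
                     .permute(0, 2, 1, 3)
                     .reshape(B, ph * pw, -1))           # (B, num_patches, C*p*p)
    num_keep = int(ph * pw * (1 - mask_ratio))
    noise = torch.rand(B, ph * pw)
    keep_idx = noise.argsort(dim=1)[:, :num_keep]        # random subset per image
    visible = torch.gather(
        patches, 1, keep_idx.unsqueeze(-1).expand(-1, -1, patches.size(-1)))
    return visible, keep_idx
```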
arXiv Detail & Related papers (2021-11-11T18:46:40Z) - Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z) - A manifold learning perspective on representation learning: Learning
decoder and representations without an encoder [0.0]
Autoencoders are commonly used in representation learning.
Inspired by manifold learning, we show that the decoder can be trained on its own by learning the representations of the training samples.
Our approach of training the decoder alone facilitates representation learning even on small data sets.
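Training a decoder without an encoder typically means treating each training sample's latent code as a free parameter and optimizing the codes and the decoder weights jointly on a reconstruction loss. The sketch below follows that generic recipe; the architecture, initialization, and hyperparameters are assumptions rather than the paper's method.

```python
import torch
import torch.nn as nn

def train_decoder_only(data, latent_dim=32, epochs=200, lr=1e-2):
    """Jointly learn per-sample latent codes and a decoder by minimizing
    reconstruction error, with no encoder involved.
    `data` is a tensor of flattened samples with shape (N, D)."""
    N, D = data.shape
    codes = nn.Parameter(0.01 * torch.randn(N, latent_dim))   # one code per sample
    decoder = nn.Sequential(
        nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, D))
    opt = torch.optim.Adam([codes, *decoder.parameters()], lr=lr)
    for _ in range(epochs):
        recon = decoder(codes)
        loss = ((recon - data) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decoder, codes.detach()
```

At inference time, a representation for a new sample would typically be obtained by optimizing a fresh code against the frozen decoder in the same way.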
arXiv Detail & Related papers (2021-08-31T15:08:50Z) - BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised
Learning [29.113263683850015]
Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs.
We propose BadEncoder, the first backdoor attack to self-supervised learning.
arXiv Detail & Related papers (2021-08-01T02:22:31Z)