Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in
Self-supervised Learning
- URL: http://arxiv.org/abs/2304.13850v3
- Date: Wed, 13 Dec 2023 03:31:18 GMT
- Title: Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in
Self-supervised Learning
- Authors: Casey Meehan, Florian Bordes, Pascal Vincent, Kamalika Chaudhuri,
Chuan Guo
- Abstract summary: Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another.
SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations.
We show that given the trained model and a crop of a training image containing only the background, it is possible to infer the foreground object with high accuracy.
- Score: 47.46863155263094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning (SSL) algorithms can produce useful image
representations by learning to associate different parts of natural images with
one another. However, when taken to the extreme, SSL models can unintentionally
memorize specific parts of individual training samples rather than learning
semantically meaningful associations. In this work, we perform a systematic
study of the unintended memorization of image-specific information in SSL
models -- which we refer to as déjà vu memorization. Concretely, we show
that given the trained model and a crop of a training image containing only the
background (e.g., water, sky, grass), it is possible to infer the foreground
object with high accuracy or even visually reconstruct it. Furthermore, we show
that déjà vu memorization is common to different SSL algorithms, is
exacerbated by certain design choices, and cannot be detected by conventional
techniques for evaluating representation quality. Our study of déjà vu
memorization reveals previously unknown privacy risks in SSL models and
suggests potential practical mitigation strategies. Code is available at
https://github.com/facebookresearch/DejaVu.
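To make the inference step concrete, below is a minimal sketch of the kind of KNN test the abstract describes, under assumed names: a hypothetical `ssl_model` embeds the background-only crop, neighbors are retrieved from a public image set, and their majority label is the guess for the foreground object. This is an illustration of the idea, not the released DejaVu code.

```python
import torch
import torch.nn.functional as F

def infer_foreground(ssl_model, background_crop, public_images, public_labels, k=100):
    """Sketch of the deja vu test: embed a background-only crop of a
    training image, retrieve its nearest neighbors in a public image
    set, and return their majority label as the foreground guess."""
    ssl_model.eval()
    with torch.no_grad():
        query = ssl_model(background_crop.unsqueeze(0))  # (1, d) crop embedding
        bank = ssl_model(public_images)                  # (N, d); batch this in practice
    query = F.normalize(query, dim=1).squeeze(0)
    bank = F.normalize(bank, dim=1)
    sims = bank @ query                                  # (N,) cosine similarities
    nn_labels = public_labels[sims.topk(k).indices]      # labels of the k neighbors
    return nn_labels.mode().values.item()                # majority-vote object class
```

In the paper's setup, memorization is separated from simple correlation (water often co-occurring with boats, say) by comparing this accuracy against a reference model trained on a disjoint split that never saw the image.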
Related papers
- Towards Adversarial Robustness And Backdoor Mitigation in SSL [0.562479170374811]
Self-Supervised Learning (SSL) has shown great promise in learning representations from unlabeled data.
SSL methods have recently been shown to be vulnerable to backdoor attacks.
This work aims to defend against backdoor attacks in SSL.
arXiv Detail & Related papers (2024-03-23T19:21:31Z)
- Déjà Vu Memorization in Vision-Language Models [39.51189095703773]
We propose a new method for measuring memorization in Vision-Language Models (VLMs).
We show that the model indeed retains information about individual objects in the training images beyond what can be inferred from correlations or the image caption.
We evaluate déjà vu memorization at both the sample and population level, and show that it is significant for OpenCLIP trained on as many as 50M image-caption pairs.
arXiv Detail & Related papers (2024-02-03T09:55:35Z)
- Zero-Shot Learning by Harnessing Adversarial Samples [52.09717785644816]
We propose a novel Zero-Shot Learning (ZSL) approach that works by Harnessing Adversarial Samples (HAS).
HAS advances ZSL through adversarial training that takes into account three crucial aspects.
We demonstrate the effectiveness of our adversarial samples approach in both ZSL and Generalized Zero-Shot Learning (GZSL) scenarios.
arXiv Detail & Related papers (2023-08-01T06:19:13Z)
- HIRL: A General Framework for Hierarchical Image Representation Learning [54.12773508883117]
We propose a general framework for Hierarchical Image Representation Learning (HIRL).
This framework aims to learn multiple semantic representations for each image, and these representations are structured to encode image semantics from fine-grained to coarse-grained.
Based on a probabilistic factorization, HIRL learns the most fine-grained semantics by an off-the-shelf image SSL approach and learns multiple coarse-grained semantics by a novel semantic path discrimination scheme.
arXiv Detail & Related papers (2022-05-26T05:13:26Z)
- Unified Contrastive Learning in Image-Text-Label Space [130.31947133453406]
Unified Contrastive Learning (UniCL) is an effective way of learning semantically rich yet discriminative representations (a loss sketch follows this list).
On its own, UniCL is a good learner on pure image-label data, rivaling supervised learning methods across three image classification datasets.
arXiv Detail & Related papers (2022-04-07T17:34:51Z)
- Generalized Zero Shot Learning For Medical Image Classification [5.6512908295414]
In many real-world medical image classification settings, we do not have access to samples of all possible disease classes.
We propose a generalized zero-shot learning (GZSL) method that uses self-supervised learning (SSL).
Our approach does not require class attribute vectors, which are available for natural images but not for medical images.
arXiv Detail & Related papers (2022-04-04T09:30:08Z)
- High Fidelity Visualization of What Your Self-Supervised Representation Knows About [22.982471878833362]
In this work, we showcase the use of a conditional diffusion-based generative model (RCDM) to visualize representations learned with self-supervised models.
We demonstrate how this model's generation quality is on par with state-of-the-art generative models while being faithful to the representation used as conditioning.
arXiv Detail & Related papers (2021-12-16T19:23:33Z)
- Constrained Mean Shift for Representation Learning [17.652439157554877]
We develop a non-contrastive representation learning method that can exploit additional knowledge.
Our main idea is to generalize the mean-shift algorithm by constraining the search space of nearest neighbors (see the sketch after this list).
We show that it is possible to use the noisy constraint across modalities to train self-supervised video models.
arXiv Detail & Related papers (2021-10-19T23:14:23Z)
- Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning [59.58381904522967]
We propose a novel embedding based generative model with a tight visual-semantic coupling constraint.
We learn a unified latent space that calibrates the embedded parametric distributions of both visual and semantic spaces.
Our method can be easily extended to the transductive ZSL setting by generating labels for unseen images.
arXiv Detail & Related papers (2020-09-16T03:54:12Z)
- Self-supervised Visual Attribute Learning for Fashion Compatibility [71.73414832639698]
We present an SSL framework that enables us to learn color- and texture-aware features without requiring any labels during training.
Our approach consists of three self-supervised tasks designed to capture different concepts that are neglected in prior work.
We show that our approach can be used for transfer learning, demonstrating that we can train on one dataset while achieving high performance on a different dataset.
arXiv Detail & Related papers (2020-08-01T21:53:22Z)
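Two of the entries above describe mechanisms concrete enough to sketch. First, the UniCL entry: a unified contrastive loss over an image-text-label space can be written so that every image-text pair sharing a label counts as a positive. The code below is a hedged illustration under assumed tensor names and shapes, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def unicl_loss(img_emb, txt_emb, labels, temperature=0.07):
    """Sketch of a unified image-text-label contrastive loss: all pairs
    that share a label are treated as positives, so labeled (image-label)
    and web-crawled (image-text) data fit a single objective."""
    img = F.normalize(img_emb, dim=1)                     # (B, d) image embeddings
    txt = F.normalize(txt_emb, dim=1)                     # (B, d) text embeddings
    logits = img @ txt.T / temperature                    # (B, B) pairwise similarities
    mask = (labels[:, None] == labels[None, :]).float()   # 1 where labels match
    p_i2t = mask / mask.sum(dim=1, keepdim=True)          # image-to-text targets
    p_t2i = mask.T / mask.T.sum(dim=1, keepdim=True)      # text-to-image targets
    loss_i2t = -(p_i2t * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    loss_t2i = -(p_t2i * F.log_softmax(logits.T, dim=1)).sum(dim=1).mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

When every pair carries a unique label (pure image-text data), the mask reduces to the identity and the loss falls back to a standard CLIP-style objective.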
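Second, the Constrained Mean Shift entry: the nearest-neighbor search of a mean-shift-style SSL objective is restricted to memory-bank entries satisfying an extra constraint (a shared label, pseudo-label, or source video). All names and the cosine objective below are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def cmsf_loss(q, t, bank, bank_groups, group_id, k=5):
    """Pull the query embedding q toward the k nearest neighbors of the
    target embedding t, where the neighbor search only sees memory-bank
    entries whose group satisfies the constraint."""
    cands = F.normalize(bank[bank_groups == group_id], dim=1)  # constrained search space
    q = F.normalize(q, dim=0)                                  # (d,) query-encoder view
    t = F.normalize(t, dim=0)                                  # (d,) target-encoder view
    sims = cands @ t                                           # similarity to target view
    nn = cands[sims.topk(min(k, cands.size(0))).indices]       # k constrained neighbors
    return (1.0 - nn @ q).mean()                               # mean cosine distance
```

Here `q` and `t` would be embeddings of two augmentations of the same image from a query encoder and a momentum encoder, as in mean-shift SSL methods.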
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.