Reconstructing Training Data From Real World Models Trained with Transfer Learning
- URL: http://arxiv.org/abs/2407.15845v1
- Date: Mon, 22 Jul 2024 17:59:10 GMT
- Title: Reconstructing Training Data From Real World Models Trained with Transfer Learning
- Authors: Yakir Oz, Gilad Yehudai, Gal Vardi, Itai Antebi, Michal Irani, Niv Haim
- Abstract summary: We present a novel approach enabling data reconstruction in realistic settings for models trained on high-resolution images.
Our method adapts the reconstruction scheme of arXiv:2206.07758 to real-world scenarios.
We introduce a novel clustering-based method to identify good reconstructions from thousands of candidates.
- Score: 29.028185455223785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current methods for reconstructing training data from trained classifiers are restricted to very small models, limited training set sizes, and low-resolution images. Such restrictions hinder their applicability to real-world scenarios. In this paper, we present a novel approach enabling data reconstruction in realistic settings for models trained on high-resolution images. Our method adapts the reconstruction scheme of arXiv:2206.07758 to real-world scenarios -- specifically, targeting models trained via transfer learning over image embeddings of large pre-trained models like DINO-ViT and CLIP. Our work employs data reconstruction in the embedding space rather than in the image space, showcasing its applicability beyond visual data. Moreover, we introduce a novel clustering-based method to identify good reconstructions from thousands of candidates. This significantly improves on previous works that relied on knowledge of the training set to identify good reconstructed images. Our findings shed light on a potential privacy risk for data leakage from models trained using transfer learning.
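To make the core mechanism concrete, here is a minimal, hedged sketch of the KKT-based reconstruction loss of arXiv:2206.07758 applied in embedding space, as the abstract describes: candidate embeddings and dual coefficients are optimized so that a weighted sum of per-sample gradients reproduces the trained head's parameters. The two-layer head, embedding size, the lambda^2 reparameterization, and all names are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

EMB_DIM, N_CANDIDATES, STEPS = 768, 500, 2000

# Stand-in for the attacked transfer-learning head trained on frozen DINO-ViT/CLIP
# embeddings; in a real attack its trained weights would be loaded via load_state_dict.
head = nn.Sequential(nn.Linear(EMB_DIM, 500), nn.ReLU(), nn.Linear(500, 1))
params = list(head.parameters())
theta = [p.detach().clone() for p in params]         # the released parameters we try to explain

# Candidate embeddings z_i, fixed label guesses y_i in {-1, +1}, and dual coefficients lambda_i.
z = torch.randn(N_CANDIDATES, EMB_DIM, requires_grad=True)
lam = torch.randn(N_CANDIDATES, requires_grad=True)
y = torch.tensor([1.0 if i % 2 == 0 else -1.0 for i in range(N_CANDIDATES)])

opt = torch.optim.Adam([z, lam], lr=1e-2)
for step in range(STEPS):
    opt.zero_grad()
    out = head(z).squeeze(-1)                         # f(theta; z_i) for every candidate
    weighted = (lam ** 2 * y * out).sum()             # lambda^2 keeps the KKT coefficients >= 0
    grads = torch.autograd.grad(weighted, params, create_graph=True)
    # KKT stationarity residual: theta - sum_i lambda_i y_i grad_theta f(theta; z_i)
    loss = sum(((t - g) ** 2).sum() for t, g in zip(theta, grads))
    loss.backward()                                    # second-order gradients flow back to z and lam
    opt.step()
```

The sketch covers only the optimization stage; in the paper's pipeline, thousands of such candidates are produced and the proposed clustering-based selection is then used to pick out the ones that correspond to genuine training samples, without access to the training set.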
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using counterfactual images generated with language guidance.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
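As a rough illustration of the fine-tuning step only (counterfactual generation and weakness testing are omitted), here is a hedged PyTorch sketch; the directory names, backbone, and hyperparameters are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, models, transforms

# Assumes language-guided generation has already produced images under "counterfactuals/".
tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
original = datasets.ImageFolder("train/", transform=tfm)
counterfactual = datasets.ImageFolder("counterfactuals/", transform=tfm)
loader = DataLoader(ConcatDataset([original, counterfactual]), batch_size=32, shuffle=True)

model = models.resnet50(weights="IMAGENET1K_V2")
model.fc = nn.Linear(model.fc.in_features, len(original.classes))
opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):                                # fine-tune on the augmented dataset
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```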
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Bounding Reconstruction Attack Success of Adversaries Without Data Priors [53.41619942066895]
Reconstruction attacks on machine learning (ML) models pose a strong risk of leakage of sensitive data.
In this work, we provide formal upper bounds on reconstruction success under realistic adversarial settings.
arXiv Detail & Related papers (2024-02-20T09:52:30Z)
- RefinedFields: Radiance Fields Refinement for Unconstrained Scenes [7.421845364041002]
We propose RefinedFields, to the best of our knowledge, the first method leveraging pre-trained models to improve in-the-wild scene modeling.
We employ pre-trained networks to refine K-Planes representations via optimization guidance.
We carry out extensive experiments and verify the merit of our method on synthetic data and real tourism photo collections.
arXiv Detail & Related papers (2023-12-01T14:59:43Z)
- Instant Continual Learning of Neural Radiance Fields [78.08008474313809]
Neural radiance fields (NeRFs) have emerged as an effective method for novel-view synthesis and 3D scene reconstruction.
We propose a continual learning framework for training NeRFs that leverages replay-based methods combined with a hybrid explicit-implicit scene representation.
Our method outperforms previous methods in reconstruction quality when trained in a continual setting, while having the additional benefit of being an order of magnitude faster.
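The replay component can be illustrated with a generic, hedged sketch; the radiance field, ray sampling, and the hybrid explicit-implicit representation are abstracted into a toy regression model, so this only shows the replay-buffer mechanics, not the paper's method.

```python
import random
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(6, 256), nn.ReLU(), nn.Linear(256, 4))  # stand-in for a radiance field
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
buffer, BUFFER_SIZE = [], 4096

def make_task(n=2048):
    # synthetic (ray, target) pairs standing in for one chunk of the scene
    x = torch.randn(n, 6)
    return [(xi, torch.tanh(xi.sum()).repeat(4)) for xi in x]

for task in [make_task() for _ in range(3)]:          # scene chunks arrive sequentially
    for step in range(200):
        batch = random.sample(task, 32)
        if buffer:                                     # mix in replayed rays from earlier chunks
            batch += random.sample(buffer, min(32, len(buffer)))
        x = torch.stack([b[0] for b in batch])
        y = torch.stack([b[1] for b in batch])
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()
    buffer = (buffer + random.sample(task, 512))[-BUFFER_SIZE:]  # keep a bounded replay buffer
```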
arXiv Detail & Related papers (2023-09-04T21:01:55Z)
- Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation [110.61853418925219]
We build a stronger version of the dataset reconstruction attack and show how it can provably recover the entire training set in the infinite width regime.
We show, both theoretically and empirically, that reconstructed images tend to be "outliers" in the dataset.
These reconstruction attacks can be used for dataset distillation, that is, we can retrain on reconstructed images and obtain high predictive accuracy.
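A hedged sketch of the "retrain on reconstructions" experiment follows; the tensors `recon_x`/`recon_y`, the small classifier, and all hyperparameters are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

recon_x = torch.rand(500, 3, 32, 32)           # placeholder for images produced by the attack
recon_y = torch.randint(0, 10, (500,))         # placeholder for their labels

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU(), nn.Linear(512, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(recon_x, recon_y), batch_size=64, shuffle=True)

for epoch in range(20):                         # retrain a fresh model on the reconstructed set only
    for x, y in loader:
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
# Evaluating `model` on the real test set measures how much task-relevant
# information the reconstructions carry (the distillation claim).
```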
arXiv Detail & Related papers (2023-02-02T21:41:59Z)
- TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic textures of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
- Reconstructing Training Data with Informed Adversaries [30.138217209991826]
Given access to a machine learning model, can an adversary reconstruct the model's training data?
This work studies this question from the lens of a powerful informed adversary who knows all the training data points except one.
We show it is feasible to reconstruct the remaining data point in this stringent threat model.
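A toy, hedged illustration of the informed-adversary setting (not the paper's reconstructor-network attack): if the released model is an L2-regularized logistic regression trained to optimality and the adversary knows every training point except one, the stationarity condition determines the missing point's gradient, which can then be inverted numerically. All names and values below are ours.

```python
import torch

torch.manual_seed(0)
n, d, lam = 100, 20, 1e-2
X = torch.randn(n, d)
y = torch.sign(torch.randn(n))

def objective(w):  # mean logistic loss + ridge penalty
    return torch.nn.functional.softplus(-y * (X @ w)).mean() + 0.5 * lam * w @ w

w = torch.zeros(d, requires_grad=True)
opt = torch.optim.LBFGS([w], max_iter=500)
opt.step(lambda: (opt.zero_grad(), objective(w).backward(), objective(w))[-1])
w_star = w.detach()                                   # the released model

def pointwise_grad(x, yi):                            # gradient of one sample's logistic loss at w_star
    return -yi * torch.sigmoid(-yi * (x @ w_star)) * x

# Adversary knows X[1:], y[1:], w_star. Stationarity: n*lam*w_star + sum_i grad_i = 0.
known = torch.stack([pointwise_grad(X[i], y[i]) for i in range(1, n)]).sum(0)
g_missing = -(n * lam * w_star + known)

best = None                                           # recover x_0 by matching its gradient
for y0 in (1.0, -1.0):
    x0 = torch.zeros(d, requires_grad=True)
    o = torch.optim.Adam([x0], lr=0.05)
    for _ in range(2000):
        o.zero_grad()
        err = ((pointwise_grad(x0, torch.tensor(y0)) - g_missing) ** 2).sum()
        err.backward()
        o.step()
    if best is None or err.item() < best[0]:
        best = (err.item(), x0.detach(), y0)
print("reconstruction error:", (best[1] - X[0]).norm().item())
```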
arXiv Detail & Related papers (2022-01-13T09:19:25Z)
- Learning from scarce information: using synthetic data to classify Roman fine ware pottery [0.0]
We propose to use a transfer learning approach whereby the model is first trained on a synthetic dataset replicating features of the original objects.
Taking the replicated features from published profile drawings of pottery forms allowed the integration of expert knowledge into the process.
After this initial training, the model was fine-tuned with data from photographs of real vessels.
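A minimal, hedged sketch of this two-stage pipeline in PyTorch; folder names, the backbone, the number of classes, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def train(model, folder, epochs, lr):
    loader = DataLoader(datasets.ImageFolder(folder, transform=tfm), batch_size=32, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            nn.functional.cross_entropy(model(x), y).backward()
            opt.step()

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 10)             # e.g. 10 pottery form classes
train(model, "synthetic_profiles/", epochs=10, lr=1e-3)     # stage 1: synthetic renderings
train(model, "real_photographs/", epochs=5, lr=1e-4)        # stage 2: fine-tune on real photos
```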
arXiv Detail & Related papers (2021-07-03T10:30:46Z)
- A general approach to bridge the reality-gap [0.0]
A common approach to circumvent this is to leverage existing, similar datasets with large amounts of labelled data.
We propose learning a general transformation to bring arbitrary images towards a canonical distribution.
This transformation is trained in an unsupervised regime, leveraging data augmentation to generate off-canonical examples of images.
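A hedged sketch of this idea: augmentations produce off-canonical views, and a small image-to-image network is trained without labels to map them back to the original. The architecture and augmentations are illustrative, not the paper's.

```python
import torch
import torch.nn as nn
from torchvision.transforms import ColorJitter, GaussianBlur, Compose

augment = Compose([ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5),
                   GaussianBlur(kernel_size=5)])

g = nn.Sequential(                                    # small image-to-image network
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())
opt = torch.optim.Adam(g.parameters(), lr=1e-3)

canonical = torch.rand(256, 3, 64, 64)                # stand-in for canonical-domain images
for step in range(1000):
    idx = torch.randint(0, len(canonical), (16,))
    x = canonical[idx]
    off = augment(x)                                   # off-canonical examples, generated for free
    opt.zero_grad()
    nn.functional.mse_loss(g(off), x).backward()       # pull them back towards the canonical image
    opt.step()
```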
arXiv Detail & Related papers (2020-09-03T18:19:28Z)
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
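For context, a minimal hedged sketch of how such an adversarially trained source model is typically obtained (PGD adversarial training); all values and the toy architecture are illustrative, and the robust backbone would then be fine-tuned on the target task as usual.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU(), nn.Linear(512, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
eps, alpha, pgd_steps = 8 / 255, 2 / 255, 7

def pgd_attack(x, y):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(pgd_steps):                          # inner maximization over the L-inf ball
        loss = nn.functional.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

# toy data standing in for the source-domain training set
x_batch, y_batch = torch.rand(64, 3, 32, 32), torch.randint(0, 10, (64,))
for step in range(100):
    x_adv = pgd_attack(x_batch, y_batch)
    opt.zero_grad()
    nn.functional.cross_entropy(model(x_adv), y_batch).backward()  # outer minimization on adversarial inputs
    opt.step()
```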
arXiv Detail & Related papers (2020-07-11T22:48:42Z)