GradViT: Gradient Inversion of Vision Transformers
- URL: http://arxiv.org/abs/2203.11894v2
- Date: Wed, 23 Mar 2022 18:14:11 GMT
- Title: GradViT: Gradient Inversion of Vision Transformers
- Authors: Ali Hatamizadeh, Hongxu Yin, Holger Roth, Wenqi Li, Jan Kautz, Daguang Xu and Pavlo Molchanov
- Abstract summary: We demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.
We introduce a method, named GradViT, that optimizes random noise into natural-looking images.
We observe unprecedentedly high fidelity and closeness to the original (hidden) data.
- Score: 83.54779732309653
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this work we demonstrate the vulnerability of vision transformers (ViTs)
to gradient-based inversion attacks. During this attack, the original data
batch is reconstructed given model weights and the corresponding gradients. We
introduce a method, named GradViT, that optimizes random noise into
natural-looking images via an iterative process. The optimization objective
consists of (i) a loss matching the gradients, (ii) an image prior in the form
of distance to the batch-normalization statistics of a pretrained CNN model,
and (iii) a total
variation regularization on patches to guide correct recovery locations. We
propose a unique loss scheduling function to overcome local minima during
optimization. We evaluate GadViT on ImageNet1K and MS-Celeb-1M datasets, and
observe unprecedentedly high fidelity and closeness to the original (hidden)
data. During the analysis we find that vision transformers are significantly
more vulnerable than previously studied CNNs due to the presence of the
attention mechanism. Our method demonstrates new state-of-the-art results for
gradient inversion in both qualitative and quantitative metrics. Project page
at https://gradvit.github.io/.
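To see why gradients leak training data at all, consider the simplest possible case. The sketch below is an illustrative toy, not the authors' GradViT pipeline (which optimizes noise with gradient matching, a CNN-statistics prior, and patch-level total variation): for a single fully connected layer with softmax cross-entropy, the hidden input can be read off the leaked gradients in closed form.

```python
import numpy as np

# Toy setup (all shapes and values are assumptions for illustration):
# a linear classifier logits = W @ x + b trained with cross-entropy.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 16))
b = np.zeros(4)
x_true = rng.standard_normal(16)   # the hidden training example
y = 2                              # its label

# Victim's training step: compute the gradients that would be shared
# (e.g., in federated learning).
logits = W @ x_true + b
p = np.exp(logits - logits.max())
p /= p.sum()
err = p.copy()
err[y] -= 1.0                      # dL/dlogits for softmax cross-entropy
dW = np.outer(err, x_true)         # leaked gradient of W
db = err                           # leaked gradient of b

# Attacker: since each row satisfies dW[c] = db[c] * x, a single division
# recovers the input exactly.
c = int(np.argmax(np.abs(db)))     # pick any row with db[c] != 0
x_recovered = dW[c] / db[c]
```

A ViT is of course not a single linear layer, which is why GradViT resorts to iterative optimization with image priors; but the closed-form case above is the underlying reason gradient inversion is possible.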
Related papers
- Causal Transformer for Fusion and Pose Estimation in Deep Visual Inertial Odometry [1.2289361708127877]
We propose a causal visual-inertial fusion transformer (VIFT) for pose estimation in deep visual-inertial odometry.
The proposed method is end-to-end trainable and requires only a monocular camera and IMU during inference.
arXiv Detail & Related papers (2024-09-13T12:21:25Z)
- Image-level Regression for Uncertainty-aware Retinal Image Segmentation [3.7141182051230914]
We introduce a novel Uncertainty-Aware (SAUNA) transform, which adds pixel uncertainty to the ground truth.
Our results indicate that the integration of the SAUNA transform and these segmentation losses led to significant performance boosts for different segmentation models.
arXiv Detail & Related papers (2024-05-27T04:17:10Z)
- PriViT: Vision Transformers for Fast Private Inference [55.36478271911595]
Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications.
ViTs are ill-suited for private inference using secure multi-party protocols, due to the large number of non-polynomial operations.
We propose PriViT, an algorithm to selectively "Taylorize" nonlinearities in ViTs while maintaining their prediction accuracy.
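The idea of "Taylorizing" can be made concrete with a small sketch. Secure multi-party protocols handle additions and multiplications cheaply, so a transcendental nonlinearity like GELU is replaced by a low-degree polynomial. The quadratic below is just the Taylor expansion of GELU at zero (an assumption for illustration, not PriViT's learned selection of which nonlinearities to replace).

```python
import math

def gelu(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def poly_gelu(x):
    # GELU(x) = x * Phi(x) ~ x * (1/2 + x / sqrt(2*pi)) near 0,
    # i.e., a polynomial with only additions and multiplications.
    return 0.5 * x + x * x / math.sqrt(2.0 * math.pi)

# The approximation is only good near 0; check the error on [-0.5, 0.5].
xs = [i / 10.0 - 0.5 for i in range(11)]
max_err = max(abs(gelu(t) - poly_gelu(t)) for t in xs)
```

Away from zero the quadratic diverges from GELU, which is why a practical method must choose carefully where such a swap preserves accuracy.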
arXiv Detail & Related papers (2023-10-06T21:45:05Z)
- Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators, called WTA-CRS, for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
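The building block behind column-row sampling can be sketched briefly. The code below shows the classic unbiased estimator that approximates a matrix product by sampling a few of its rank-one outer-product terms (this is the standard estimator, not the winner-take-all variant itself; all shapes and sample counts are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 64))
B = rng.standard_normal((64, 8))

def cr_matmul(A, B, s, rng):
    # A @ B is a sum of 64 outer products A[:, k] B[k, :]. Sample s of them
    # with probability proportional to the column/row norm product, then
    # reweight by 1 / (s * p_k) so the expectation equals A @ B (unbiased).
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = p / p.sum()
    idx = rng.choice(A.shape[1], size=s, p=p)
    return sum(np.outer(A[:, k], B[k]) / (s * p[k]) for k in idx)

# Averaging many independent estimates should approach the exact product.
est = np.mean([cr_matmul(A, B, 16, rng) for _ in range(200)], axis=0)
exact = A @ B
rel_err = np.linalg.norm(est - exact) / np.linalg.norm(exact)
```

Norm-proportional sampling minimizes the variance of this estimator among importance distributions; the WTA-CRS paper's contribution is a lower-variance variant of this scheme for transformer tuning.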
arXiv Detail & Related papers (2023-05-24T15:52:08Z)
- Dataset Distillation with Convexified Implicit Gradients [69.16247946639233]
We show how implicit gradients can be effectively used to compute meta-gradient updates.
We further equip the algorithm with a convexified approximation that corresponds to learning on top of a frozen finite-width neural kernel.
arXiv Detail & Related papers (2023-02-13T23:53:16Z)
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework based on minimizing a loss function that includes a "projected version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and a parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
- Image Inpainting with Learnable Feature Imputation [8.293345261434943]
A regular convolution layer applying a filter in the same way over known and unknown areas causes visual artifacts in the inpainted image.
We propose (layer-wise) feature imputation of the missing input values to a convolution.
We present comparisons on CelebA-HQ and Places2 to current state-of-the-art to validate our model.
arXiv Detail & Related papers (2020-11-02T16:05:32Z)
- Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization [30.203072945001136]
In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.
And then a novel gradient-weighted similarity activation mapping loss (Grad-SAM) is incorporated for finer localization with high accuracy.
Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMU-Seasons dataset.
Our method performs on par with or even outperforms state-of-the-art image-based localization baselines at medium and high precision.
arXiv Detail & Related papers (2020-09-16T14:43:22Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Transformation Based Deep Anomaly Detection in Astronomical Images [0.0]
We introduce new filter based transformations useful for detecting anomalies in astronomical images.
We also propose a transformation selection strategy that allows us to find indistinguishable pairs of transformations.
The models were tested on astronomical images from the High Cadence Transient Survey (HiTS) and Zwicky Transient Facility (ZTF) datasets.
arXiv Detail & Related papers (2020-05-15T21:02:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.