CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
- URL: http://arxiv.org/abs/2502.18176v2
- Date: Sun, 02 Mar 2025 09:22:47 GMT
- Title: CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
- Authors: Mingkun Zhang, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng
- Abstract summary: We ground our work on CLIP, a vision-language pre-trained encoder model that can perform zero-shot classification by matching an image with text prompts. We then formulate purification risk as the KL divergence between the joint distributions of the purification and attack processes. We propose two variants of our CLIPure approach: CLIPure-Diff, which models the likelihood of images' latent vectors, and CLIPure-Cos, which models the likelihood with the cosine similarity between the embeddings of an image and ``a photo of a.''
- Score: 65.46685389276443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we aim to build an adversarially robust zero-shot image classifier. We ground our work on CLIP, a vision-language pre-trained encoder model that can perform zero-shot classification by matching an image with text prompts ``a photo of a <class-name>.''. Purification is the path we choose since it does not require adversarial training on specific attack types and can thus cope with any unforeseen attacks. We then formulate purification risk as the KL divergence between the joint distributions of the purification process of denoising adversarial samples and the attack process of adding perturbations to benign samples, through bidirectional Stochastic Differential Equations (SDEs). The final derived results inspire us to explore purification in the multi-modal latent space of CLIP. We propose two variants of our CLIPure approach: CLIPure-Diff, which models the likelihood of images' latent vectors with the DiffusionPrior module of DALL-E 2 (modeling the generation process of CLIP's latent vectors), and CLIPure-Cos, which models the likelihood with the cosine similarity between the embeddings of an image and ``a photo of a.''. To the best of our knowledge, CLIPure is the first purification method in a multi-modal latent space, and CLIPure-Cos is the first purification method that is not based on generative models, which substantially improves defense efficiency. We conducted extensive experiments on CIFAR-10, ImageNet, and 13 datasets that previous CLIP-based defense methods used to evaluate zero-shot classification robustness. Results show that CLIPure boosts SOTA robustness by a large margin, e.g., from 71.7% to 91.1% on CIFAR-10 and from 59.6% to 72.6% on ImageNet, with a 108% relative improvement in average robustness on the 13 datasets over the previous SOTA. The code is available at https://github.com/TMLResearchGroup-CAS/CLIPure.
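To make the CLIPure-Cos idea concrete, here is a minimal, illustrative sketch, not the authors' released implementation (see the GitHub repository above for that). It uses the Hugging Face `transformers` CLIP API; the checkpoint name, step count, learning rate, and helper functions (`embed_text`, `clipure_cos`, `zero_shot_classify`) are assumptions for illustration. The sketch purifies an image's CLIP latent vector by gradient ascent on its cosine similarity with the blank prompt ``a photo of a.'' and then performs standard zero-shot classification on the purified latent.

```python
# Illustrative sketch of CLIPure-Cos-style latent-space purification (assumed
# hyperparameters; not the authors' official code).
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def embed_text(prompts):
    """L2-normalized CLIP text embeddings for a list of prompts."""
    inputs = processor(text=prompts, return_tensors="pt", padding=True).to(device)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return F.normalize(feats, dim=-1)


def clipure_cos(image, steps=10, lr=0.05):
    """Purify an image's CLIP latent by maximizing cosine similarity with the
    blank prompt ``a photo of a.`` (the CLIPure-Cos likelihood surrogate)."""
    null_emb = embed_text(["a photo of a."])  # anchor for the benign latent distribution
    pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)
    with torch.no_grad():
        z = model.get_image_features(pixel_values=pixel_values)
    # Optimize only the latent vector; the CLIP encoders stay frozen.
    z = F.normalize(z, dim=-1).clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -(F.normalize(z, dim=-1) @ null_emb.T).mean()
        loss.backward()
        opt.step()
    return F.normalize(z.detach(), dim=-1)


def zero_shot_classify(z_purified, class_names):
    """Standard CLIP zero-shot classification applied to the purified latent."""
    class_embs = embed_text([f"a photo of a {c}." for c in class_names])
    return (z_purified @ class_embs.T).argmax(dim=-1)
```

Usage would look like `zero_shot_classify(clipure_cos(adv_image), class_names)` for a PIL image `adv_image`; because purification operates only on the latent vector and needs no generative model, each purified sample costs just a few forward/backward passes through the latent, which is the efficiency advantage the abstract attributes to CLIPure-Cos.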
Related papers
- DeeCLIP: A Robust and Generalizable Transformer-Based Framework for Detecting AI-Generated Images [14.448350657613368]
DeeCLIP is a novel framework for detecting AI-generated images.
It incorporates DeeFuser, a fusion module that combines high-level and low-level features.
Trained exclusively on 4-class ProGAN data, DeeCLIP achieves an average accuracy of 89.90%.
arXiv Detail & Related papers (2025-04-28T15:06:28Z) - ZeroPur: Succinct Training-Free Adversarial Purification [52.963392510839284]
Adversarial purification is a kind of defense technique that can defend against various unseen adversarial attacks.
We present ZeroPur, a simple adversarial purification method that purifies adversarial images without further training.
arXiv Detail & Related papers (2024-06-05T10:58:15Z) - Transductive Zero-Shot and Few-Shot CLIP [24.592841797020203]
This paper addresses the transductive zero-shot and few-shot CLIP classification challenge.
Inference is performed jointly across a mini-batch of unlabeled query samples, rather than treating each instance independently.
Our approach yields a nearly 20% improvement in ImageNet accuracy over CLIP's zero-shot performance.
arXiv Detail & Related papers (2024-04-08T12:44:31Z) - RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z) - ContraCluster: Learning to Classify without Labels by Contrastive Self-Supervision and Prototype-Based Semi-Supervision [7.819942809508631]
We propose ContraCluster, an unsupervised image classification method that combines clustering with the power of contrastive self-supervised learning.
ContraCluster consists of three stages: (1) contrastive self-supervised pre-training (CPT), (2) contrastive prototype sampling (CPS), and (3) prototype-based semi-supervised fine-tuning (PB-SFT).
We demonstrate empirically that ContraCluster achieves new state-of-the-art results for standard benchmark datasets including CIFAR-10, STL-10, and ImageNet-10.
arXiv Detail & Related papers (2023-04-19T01:51:08Z) - (Certified!!) Adversarial Robustness for Free! [116.6052628829344]
We certify 71% accuracy on ImageNet under adversarial perturbations constrained to be within a 2-norm of 0.5.
We obtain these results using only pretrained diffusion models and image classifiers, without requiring any fine tuning or retraining of model parameters.
arXiv Detail & Related papers (2022-06-21T17:27:27Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure, which uses diffusion models for adversarial purification.
Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z) - ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension [114.85628613911713]
Large-scale pre-trained models are useful for image classification across domains.
We present ReCLIP, a simple but strong zero-shot baseline that repurposes CLIP, a state-of-the-art large-scale model, for ReC.
arXiv Detail & Related papers (2022-04-12T17:55:38Z) - Adaptive Clustering of Robust Semantic Representations for Adversarial Image Purification [0.9203366434753543]
We propose a robust defense against adversarial attacks that is model-agnostic and generalizable to unseen adversaries.
In this paper, we extract the latent representations for each class and adaptively cluster latent representations that share semantic similarity.
We adversarially train a new model, constraining the latent-space representation to minimize the distance between the adversarial latent representation and the true cluster distribution.
arXiv Detail & Related papers (2021-04-05T21:07:04Z)