Zero-Shot Robustness of Vision Language Models Via Confidence-Aware Weighting
- URL: http://arxiv.org/abs/2510.02913v1
- Date: Fri, 03 Oct 2025 11:36:02 GMT
- Title: Zero-Shot Robustness of Vision Language Models Via Confidence-Aware Weighting
- Authors: Nikoo Naghavian, Mostafa Tavassolipour
- Abstract summary: We propose Confidence-Aware Weighting (CAW) to enhance zero-shot robustness in vision-language models. CAW consists of two components: (1) a Confidence-Aware loss that prioritizes uncertain adversarial examples by scaling the KL divergence between clean and adversarial predictions, and (2) a feature alignment regularization that preserves semantic consistency.
- Score: 1.5268922363885407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-language models like CLIP demonstrate impressive zero-shot generalization but remain highly vulnerable to adversarial attacks. In this work, we propose Confidence-Aware Weighting (CAW) to enhance zero-shot robustness in vision-language models. CAW consists of two components: (1) a Confidence-Aware loss that prioritizes uncertain adversarial examples by scaling the KL divergence between clean and adversarial predictions, and (2) a feature alignment regularization that preserves semantic consistency by minimizing the distance between frozen and fine-tuned image encoder features on adversarial inputs. These components work jointly to improve both clean and robust accuracy without sacrificing generalization. Extensive experiments on TinyImageNet and 14 additional datasets show that CAW outperforms recent methods such as PMG-AFT and TGA-ZSR under strong attacks like AutoAttack, while using less memory.
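The abstract's two components translate naturally into a single training objective. Below is a minimal PyTorch-style sketch under stated assumptions: `model` is the fine-tuned CLIP image encoder, `frozen_model` a frozen copy of the pre-trained encoder, `text_feats` the class text embeddings, and the uncertainty weight (one minus the max softmax probability) and trade-off `lam` are illustrative choices, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def caw_loss(model, frozen_model, text_feats, x_clean, x_adv, lam=1.0):
    """Illustrative sketch of Confidence-Aware Weighting (CAW).

    Combines (1) a confidence-aware KL term that up-weights uncertain
    adversarial examples and (2) a feature-alignment regularizer tying
    the fine-tuned encoder to the frozen pre-trained one. The exact
    weighting and `lam` are assumptions, not the paper's formulation.
    """
    # CLIP-style class logits: cosine similarity to class text embeddings.
    f_clean = F.normalize(model(x_clean), dim=-1)
    f_adv = F.normalize(model(x_adv), dim=-1)
    logits_clean = f_clean @ text_feats.t()
    logits_adv = f_adv @ text_feats.t()

    # Per-sample KL(clean || adversarial) between prediction distributions.
    p_clean = F.softmax(logits_clean, dim=-1)
    log_p_adv = F.log_softmax(logits_adv, dim=-1)
    kl = F.kl_div(log_p_adv, p_clean, reduction="none").sum(dim=-1)

    # Confidence-aware scaling: uncertain adversarial predictions
    # (low max probability) receive larger weights.
    conf = F.softmax(logits_adv, dim=-1).max(dim=-1).values
    ca_loss = ((1.0 - conf).detach() * kl).mean()

    # Feature alignment: keep adversarial features of the fine-tuned
    # encoder close to those of the frozen encoder.
    with torch.no_grad():
        f_frozen = F.normalize(frozen_model(x_adv), dim=-1)
    align_loss = (f_adv - f_frozen).pow(2).sum(dim=-1).mean()

    return ca_loss + lam * align_loss
```

Detaching the weight keeps the optimizer from gaming the confidence term; only the KL and alignment terms carry gradients.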
Related papers
- Enhancing CLIP Robustness via Cross-Modality Alignment [54.01929554563447]
We propose Cross-modality Alignment (COLA), an optimal transport-based framework for vision-language models. COLA restores global image-text alignment and local structural consistency in the feature space. COLA is training-free and compatible with existing fine-tuned models.
arXiv Detail & Related papers (2025-10-28T03:47:44Z)
- Self-Calibrated Consistency can Fight Back for Adversarial Robustness in Vision-Language Models [31.920092341939593]
Self-Calibrated Consistency (SCC) is an effective test-time defense against adversarial attacks. SCC consistently improves the zero-shot robustness of CLIP while maintaining accuracy. These findings highlight the great potential of establishing an adversarially robust paradigm from CLIP.
arXiv Detail & Related papers (2025-10-26T18:37:12Z)
- CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP [54.660471826755234]
We show that malicious perturbations that seek to maximise the classification loss lead to 'falsely stable' images. We propose to leverage the pre-trained vision encoder of CLIP to counterattack such adversarial images during inference to achieve robustness. Our paradigm is simple and training-free, providing the first method to defend CLIP from adversarial attacks at test time.
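As a rough illustration of the counterattack idea, the sketch below perturbs the input at inference time to maximize the displacement of its CLIP image embedding; since adversarial inputs are 'falsely stable', such a push tends to knock them off the attack target while barely moving clean inputs. Step count, step size, and budget are assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def counterattack(encoder, x, steps=2, alpha=2 / 255, eps=4 / 255):
    """Illustrative test-time counterattack sketch.

    Perturbs the input at inference to maximize the displacement of its
    CLIP image embedding. Hyperparameters are assumed, not the paper's.
    """
    f_orig = F.normalize(encoder(x), dim=-1).detach()
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        f_new = F.normalize(encoder(x + delta), dim=-1)
        # Ascend on (1 - cosine similarity) to move the embedding away.
        loss = (1.0 - (f_new * f_orig).sum(dim=-1)).mean()
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)   # keep the counterattack budget small
            delta.grad.zero_()
    return (x + delta).clamp(0, 1).detach()
```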
arXiv Detail & Related papers (2025-03-05T15:51:59Z)
- Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models [26.656858396343726]
Multi-modal Large Language Models (MLLMs) excel in vision-language tasks but remain vulnerable to visual adversarial perturbations. Existing methods seek to mitigate these risks by applying constrained adversarial fine-tuning to CLIP vision encoders on ImageNet-scale data. We explore an alternative approach of leveraging existing vision classification models that have been adversarially pre-trained on large-scale data.
arXiv Detail & Related papers (2025-02-03T17:59:45Z)
- Confidence-aware Denoised Fine-tuning of Off-the-shelf Models for Certified Robustness [56.2479170374811]
We introduce Fine-Tuning with Confidence-Aware Denoised Image Selection (FT-CADIS).
FT-CADIS is inspired by the observation that the confidence of off-the-shelf classifiers can effectively identify hallucinated images during denoised smoothing.
It has established the state-of-the-art certified robustness among denoised smoothing methods across all $\ell_2$-adversary radii in various benchmarks.
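A hedged sketch of the selection step described above: an off-the-shelf classifier's confidence on denoised images is used to filter out likely-hallucinated denoiser outputs before fine-tuning. The max-softmax confidence measure and `thresh` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def select_confident_denoised(classifier, denoiser, x_noisy, thresh=0.5):
    """Illustrative sketch of confidence-aware denoised image selection.

    Denoises noisy inputs, then keeps only samples on which an
    off-the-shelf classifier is confident, on the premise that low
    confidence flags hallucinated denoiser outputs.
    """
    with torch.no_grad():
        x_denoised = denoiser(x_noisy)
        conf = F.softmax(classifier(x_denoised), dim=-1).max(dim=-1).values
    keep = conf >= thresh  # boolean mask of retained samples
    return x_denoised[keep], keep
```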
arXiv Detail & Related papers (2024-11-13T09:13:20Z)
- Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models [64.67721492968941]
We propose a Text-Guided Attention for Zero-Shot Robustness (TGA-ZSR) framework.
Our goal is to maintain the generalization of the CLIP model and enhance its adversarial robustness.
Our method yields a 9.58% enhancement in zero-shot robust accuracy over the current state-of-the-art techniques.
arXiv Detail & Related papers (2024-10-29T07:15:09Z)
- TIMA: Text-Image Mutual Awareness for Balancing Zero-Shot Adversarial Robustness and Generalization Ability [8.896239176376488]
This work addresses the challenge of achieving zero-shot adversarial robustness while preserving zero-shot generalization in large-scale foundation models.
We propose a novel Text-Image Mutual Awareness (TIMA) method that strikes a balance between zero-shot adversarial robustness and generalization.
arXiv Detail & Related papers (2024-05-27T22:10:17Z)
- FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification [10.911464455072391]
FACTUAL is a Contrastive Learning framework for Adversarial Training and robust SAR classification.
Our model achieves 99.7% accuracy on clean samples, and 89.6% on perturbed samples, both outperforming previous state-of-the-art methods.
arXiv Detail & Related papers (2024-04-04T06:20:22Z)
- Understanding Zero-Shot Adversarial Robustness for Large-Scale Models [31.295249927085475]
We identify and explore the problem of adapting large-scale models for zero-shot adversarial robustness.
We propose a text-guided contrastive adversarial training loss, which aligns the text embeddings and the adversarial visual features with contrastive learning.
Our approach significantly improves the zero-shot adversarial robustness over CLIP, with an average improvement of over 31 points across ImageNet and 15 zero-shot datasets.
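In CLIP's joint embedding space, a text-guided contrastive adversarial loss of this kind reduces to cross-entropy over image-text similarities. The sketch below is a minimal rendering of that idea, assuming precomputed class text embeddings and an illustrative temperature `tau`; it is not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def text_guided_contrastive_loss(img_feats_adv, text_feats, labels, tau=0.07):
    """Illustrative text-guided contrastive adversarial loss.

    Pulls each adversarial image feature toward the text embedding of
    its ground-truth class and pushes it away from the other class
    prompts; with class prompts as the contrast set, this reduces to
    cross-entropy over image-text similarities.
    """
    img = F.normalize(img_feats_adv, dim=-1)
    txt = F.normalize(text_feats, dim=-1)
    logits = img @ txt.t() / tau  # (batch, num_classes) similarities
    return F.cross_entropy(logits, labels)
```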
arXiv Detail & Related papers (2022-12-14T04:08:56Z)
- When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? [99.4914671654374]
We propose AdvCL, a novel adversarial contrastive pretraining framework.
We show that AdvCL is able to enhance cross-task robustness transferability without loss of model accuracy and finetuning efficiency.
arXiv Detail & Related papers (2021-11-01T17:59:43Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
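One concrete reading of "consistent under both data augmentations and adversarial perturbations" is a multi-view NT-Xent objective in which augmented and adversarial views of the same image are mutual positives. The sketch below assumes that formulation and an illustrative temperature; it is not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def multiview_nt_xent(f_views, tau=0.5):
    """Illustrative multi-view NT-Xent loss.

    `f_views` is a list of (batch, dim) feature tensors, one per view of
    the same image batch (e.g. two augmentations plus an adversarial
    view). Views of the same image are mutual positives, so features are
    trained to stay consistent under augmentation and attack alike.
    """
    z = F.normalize(torch.cat(f_views), dim=-1)
    n = f_views[0].size(0)
    sim = z @ z.t() / tau
    # Exclude self-similarity from the softmax denominator.
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    log_prob = F.log_softmax(sim.masked_fill(eye, float("-inf")), dim=-1)
    # Positives: same image index across different views.
    img = torch.arange(z.size(0), device=z.device) % n
    pos = (img.unsqueeze(0) == img.unsqueeze(1)) & ~eye
    return -log_prob[pos].mean()
```

In an adversarial contrastive setup, the adversarial view is typically generated by attacking the contrastive loss itself rather than a classification loss.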
arXiv Detail & Related papers (2020-10-26T04:44:43Z)