TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models
- URL: http://arxiv.org/abs/2512.16523v1
- Date: Thu, 18 Dec 2025 13:34:14 GMT
- Title: TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models
- Authors: Zhiwei Li, Yitian Pang, Weining Wang, Zhenan Sun, Qi Li
- Abstract summary: We propose Test-Time Padding (TTP), a lightweight defense framework that performs adversarial detection followed by targeted adaptation at inference. TTP consistently surpasses state-of-the-art test-time defenses, delivering substantial improvements in adversarial robustness without compromising clean accuracy.
- Score: 32.85951917559796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-Language Models (VLMs), such as CLIP, have achieved impressive zero-shot recognition performance but remain highly susceptible to adversarial perturbations, posing significant risks in safety-critical scenarios. Previous training-time defenses rely on adversarial fine-tuning, which requires labeled data and costly retraining, while existing test-time strategies fail to reliably distinguish between clean and adversarial inputs, thereby preventing both adversarial robustness and clean accuracy from reaching their optimum. To address these limitations, we propose Test-Time Padding (TTP), a lightweight defense framework that performs adversarial detection followed by targeted adaptation at inference. TTP identifies adversarial inputs via the cosine similarity shift between CLIP feature embeddings computed before and after spatial padding, yielding a universal threshold for reliable detection across architectures and datasets. For detected adversarial cases, TTP employs trainable padding to restore disrupted attention patterns, coupled with a similarity-aware ensemble strategy for a more robust final prediction. For clean inputs, TTP leaves them unchanged by default or optionally integrates existing test-time adaptation techniques for further accuracy gains. Comprehensive experiments on diverse CLIP backbones and fine-grained benchmarks show that TTP consistently surpasses state-of-the-art test-time defenses, delivering substantial improvements in adversarial robustness without compromising clean accuracy. The code for this paper will be released soon.
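To make the mechanism concrete, here is a minimal Python sketch of the two stages described in the abstract. It assumes a CLIP-style encoder exposing an `encode_image` method (e.g. from open_clip) and precomputed, normalised class-text embeddings `text_feats`. The pad size, detection threshold, entropy-minimisation objective, and step counts are illustrative assumptions, not the authors' implementation (their code is not yet released).

```python
import torch
import torch.nn.functional as F

# Illustrative sketch only. Assumes `model` is a CLIP-style encoder exposing
# `encode_image`, and `text_feats` holds L2-normalised class-text embeddings
# of shape (num_classes, dim). Pad size, threshold, and the entropy objective
# below are assumptions, not the paper's released implementation.

@torch.no_grad()
def padding_similarity(model, image, pad=16):
    """Cosine similarity between features of an image and its padded copy."""
    h, w = image.shape[-2:]
    feat = F.normalize(model.encode_image(image), dim=-1)
    padded = F.pad(image, (pad, pad, pad, pad), value=0.0)  # zero border
    padded = F.interpolate(padded, size=(h, w), mode="bilinear",
                           align_corners=False)             # back to input size
    feat_pad = F.normalize(model.encode_image(padded), dim=-1)
    return (feat * feat_pad).sum(dim=-1)                    # per-image cosine

def detect_adversarial(model, image, threshold=0.85):
    """Flag inputs whose embedding shifts strongly under spatial padding.

    Intuition from the abstract: adversarial perturbations are brittle to
    padding, so similarity drops more than for clean inputs. The value 0.85
    is a placeholder, not the paper's universal threshold.
    """
    return padding_similarity(model, image) < threshold

def adapt_trainable_padding(model, text_feats, image, pad=16, steps=10, lr=1e-2):
    """For detected adversarial inputs, optimise a learnable padding border.

    Entropy minimisation stands in for the paper's adaptation objective; the
    paper additionally uses a similarity-aware ensemble for the final output.
    """
    for p in model.parameters():            # freeze the encoder; only the
        p.requires_grad_(False)             # border pixels are trainable
    h, w = image.shape[-2:]
    canvas = F.pad(image, (pad, pad, pad, pad), value=0.0)
    border = torch.nn.Parameter(torch.zeros_like(canvas))
    mask = torch.ones_like(canvas)
    mask[..., pad:pad + h, pad:pad + w] = 0.0               # keep image region fixed
    opt = torch.optim.Adam([border], lr=lr)
    for _ in range(steps):
        inp = F.interpolate(canvas + mask * border, size=(h, w),
                            mode="bilinear", align_corners=False)
        feats = F.normalize(model.encode_image(inp), dim=-1)
        probs = (100.0 * feats @ text_feats.t()).softmax(dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
        opt.zero_grad(); entropy.backward(); opt.step()
    with torch.no_grad():
        inp = F.interpolate(canvas + mask * border, size=(h, w),
                            mode="bilinear", align_corners=False)
        feats = F.normalize(model.encode_image(inp), dim=-1)
        return (100.0 * feats @ text_feats.t()).softmax(dim=-1)
```

The similarity-aware ensemble from the abstract, which combines predictions from multiple padded views weighted by their similarity scores, is omitted here for brevity.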
Related papers
- Trust, Don't Trust, or Flip: Robust Preference-Based Reinforcement Learning with Multi-Expert Feedback [2.4352490146713364]
We introduce TriTrust-PBRL, a unified framework that jointly learns a shared reward model and expert-specific trust parameters from multi-expert preference feedback. TriTrust-PBRL achieves state-of-the-art robustness, maintaining near-oracle performance under adversarial corruption while standard PBRL methods fail catastrophically.
arXiv Detail & Related papers (2026-01-26T18:21:48Z)
- R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning [69.72249695674665]
We propose a robust test-time prompt tuning (R-TPT) method for vision-language models (VLMs). R-TPT mitigates the impact of adversarial attacks during the inference stage. We introduce a plug-and-play reliability-based weighted ensembling strategy to strengthen the defense.
arXiv Detail & Related papers (2025-04-15T13:49:31Z)
- Deep Positive-Negative Prototypes for Adversarially Robust Discriminative Prototypical Learning [0.24999074238880484]
We propose a novel framework named Adversarially trained Deep Positive-Negative Prototypes (Adv-DPNP). Adv-DPNP integrates discriminative prototype-based learning with adversarial training. We show that Adv-DPNP achieves the highest average accuracy across severities and corruption types.
arXiv Detail & Related papers (2025-04-03T15:42:58Z)
- CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP [54.660471826755234]
We show that malicious perturbations that seek to maximise the classification loss lead to 'falsely stable' images. We propose to leverage the pre-trained vision encoder of CLIP to counterattack such adversarial images during inference to achieve robustness. Our paradigm is simple and training-free, providing the first method to defend CLIP from adversarial attacks at test time (a rough sketch of this idea appears after this list).
arXiv Detail & Related papers (2025-03-05T15:51:59Z)
- TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models [53.91006249339802]
We propose a novel defense method called Test-Time Adversarial Prompt Tuning (TAPT) to enhance the inference robustness of CLIP against visual adversarial attacks.
TAPT is a test-time defense method that learns defensive bimodal (textual and visual) prompts to robustify the inference process of CLIP.
We evaluate the effectiveness of TAPT on 11 benchmark datasets, including ImageNet and 10 other zero-shot datasets.
arXiv Detail & Related papers (2024-11-20T08:58:59Z)
- The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks [90.52808174102157]
In safety-critical applications such as medical imaging and autonomous driving, it is imperative to maintain high adversarial robustness to protect against potential adversarial attacks.
A notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models.
This study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) in the context of standard adversarial attacks.
arXiv Detail & Related papers (2024-05-14T18:05:19Z)
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [65.21599711087538]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample. Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs for many applications. We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
- Adversarial Attacks and Defense for Non-Parametric Two-Sample Tests [73.32304304788838]
This paper systematically uncovers the failure mode of non-parametric TSTs through adversarial attacks.
To enable TST-agnostic attacks, we propose an ensemble attack framework that jointly minimizes the different types of test criteria.
To robustify TSTs, we propose a max-min optimization that iteratively generates adversarial pairs to train the deep kernels.
arXiv Detail & Related papers (2022-02-07T11:18:04Z)
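Relatedly, the training-free counterattack in the "CLIP is Strong Enough to Fight Back" entry above admits a compact sketch. Under the assumption that the defense perturbs the input within a small L-infinity budget to push its embedding away from that of the incoming (possibly adversarial) image, a PGD-style loop suffices; the budget, step size, and iteration count below are placeholders, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def counterattack(model, image, steps=2, eps=4 / 255, alpha=1 / 255):
    """Training-free test-time counterattack (illustrative sketch).

    Pushes the embedding of the input away from its initial embedding,
    within an L-infinity budget `eps`, on the intuition that this undoes
    brittle adversarial perturbations. Hyperparameters are assumptions;
    `model` is a CLIP-style encoder assumed to be in eval mode.
    """
    with torch.no_grad():
        anchor = F.normalize(model.encode_image(image), dim=-1)
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        feat = F.normalize(model.encode_image(image + delta), dim=-1)
        sim = (feat * anchor).sum()          # similarity to initial embedding
        sim.backward()                       # gradient of similarity w.r.t. delta
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # descend the similarity
            delta.clamp_(-eps, eps)             # stay inside the budget
            delta.grad.zero_()
    return (image + delta).clamp(0.0, 1.0).detach()  # assume [0,1] pixel range
```

The loop mirrors a standard PGD attack, except the gradient step is taken by the defender against the (possibly adversarial) input's own embedding rather than against a ground-truth label.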