Related papers: Calibrating Uncertainty for Zero-Shot Adversarial CLIP

Calibrating Uncertainty for Zero-Shot Adversarial CLIP

URL: http://arxiv.org/abs/2512.12997v1
Date: Mon, 15 Dec 2025 05:41:08 GMT
Title: Calibrating Uncertainty for Zero-Shot Adversarial CLIP
Authors: Wenjing lu, Zerui Tao, Dongping Zhang, Yuning Qiu, Yang Yang, Qibin Zhao,
Abstract summary: We propose a novel adversarial fine-tuning objective for CLIP considering both prediction accuracy and uncertainty alignments.<n>Our objective aligns these distributions holistically under perturbations, moving beyond single-logit anchoring and restoring uncertainty.
Score: 33.707647228637114
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: CLIP delivers strong zero-shot classification but remains highly vulnerable to adversarial attacks. Previous work of adversarial fine-tuning largely focuses on matching the predicted logits between clean and adversarial examples, which overlooks uncertainty calibration and may degrade the zero-shot generalization. A common expectation in reliable uncertainty estimation is that predictive uncertainty should increase as inputs become more difficult or shift away from the training distribution. However, we frequently observe the opposite in the adversarial setting: perturbations not only degrade accuracy but also suppress uncertainty, leading to severe miscalibration and unreliable over-confidence. This overlooked phenomenon highlights a critical reliability gap beyond robustness. To bridge this gap, we propose a novel adversarial fine-tuning objective for CLIP considering both prediction accuracy and uncertainty alignments. By reparameterizing the output of CLIP as the concentration parameter of a Dirichlet distribution, we propose a unified representation that captures relative semantic structure and the magnitude of predictive confidence. Our objective aligns these distributions holistically under perturbations, moving beyond single-logit anchoring and restoring calibrated uncertainty. Experiments on multiple zero-shot classification benchmarks demonstrate that our approach effectively restores calibrated uncertainty and achieves competitive adversarial robustness while maintaining clean accuracy.

Related papers

Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning [58.331709210563616]
Thinking by Subtraction is a confidence-driven contrastive decoding approach.<n>A small subset of low-confidence tokens disproportionately contributes to reasoning errors and unnecessary output expansion.<n>Our method, Confidence-Driven Contrastive Decoding, detects low-confidence tokens during decoding and intervenes at these positions.
arXiv Detail & Related papers (2026-02-20T14:13:22Z)
Benchmarking Corruption Robustness of LVLMs: A Discriminative Benchmark and Robustness Alignment Metric [49.393713730706445]
We introduce Bench-C, a benchmark emphasizing discriminative samples for assessing corruption robustness.<n>We propose the Robustness Alignment Score (RAS), a unified metric that measures degradation in logit-level prediction structure.
arXiv Detail & Related papers (2025-11-24T12:07:56Z)
Calibrated Decomposition of Aleatoric and Epistemic Uncertainty in Deep Features for Inference-Time Adaptation [3.018583625592182]
Most estimators collapse all uncertainty modes into a single confidence score, preventing reliable reasoning about when to allocate more compute or adjust inference.<n>We introduce Uncertainty-Guided Inference-Time Selection, a lightweight inference time framework that disentangles aleatoric (data-driven) and model-driven uncertainty directly in deep feature space.
arXiv Detail & Related papers (2025-11-15T23:47:30Z)
Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration Metrics [6.9681910774977815]
This paper presents a post-hoc calibration framework to enhance calibration quality and uncertainty-aware decision-making.<n>A comprehensive evaluation is conducted using calibration metrics, uncertainty-aware performance measures, and empirical conformal coverage.<n> Experiments show that the proposed method achieves lower confidently incorrect predictions, and competitive Expected Error compared with isotonic and focal-loss baselines.
arXiv Detail & Related papers (2025-10-19T23:55:36Z)
Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning [1.2183405753834562]
This thesis investigates how uncertainty estimation can enhance the safety and trustworthiness of machine learning (ML) systems.<n>We first show that a model's training trajectory contains rich uncertainty signals that can be exploited without altering its architecture or loss.<n>We propose a lightweight, post-hoc abstention method that works across tasks, avoids the cost of deep ensembles, and achieves state-of-the-art selective prediction performance.
arXiv Detail & Related papers (2025-08-11T02:33:53Z)
Probabilistic Modeling of Disparity Uncertainty for Robust and Efficient Stereo Matching [61.73532883992135]
We propose a new uncertainty-aware stereo matching framework.<n>We adopt Bayes risk as the measurement of uncertainty and use it to separately estimate data and model uncertainty.
arXiv Detail & Related papers (2024-12-24T23:28:20Z)
Integrating uncertainty quantification into randomized smoothing based robustness guarantees [18.572496359670797]
Deep neural networks are vulnerable to adversarial attacks which can cause hazardous incorrect predictions in safety-critical applications. Certified robustness via randomized smoothing gives a probabilistic guarantee that the smoothed classifier's predictions will not change within an $ell$-ball around a given input. Uncertainty-based rejection is a technique often applied in practice to defend models against adversarial attacks. We demonstrate, that the novel framework allows for a systematic evaluation of different network architectures and uncertainty measures.
arXiv Detail & Related papers (2024-10-27T13:07:43Z)
Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.<n>Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.<n>We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z)
FairlyUncertain: A Comprehensive Benchmark of Uncertainty in Algorithmic Fairness [4.14360329494344]
We introduce FairlyUncertain, an axiomatic benchmark for evaluating uncertainty estimates in fairness. Our benchmark posits that fair predictive uncertainty estimates should be consistent across learning pipelines and calibrated to observed randomness.
arXiv Detail & Related papers (2024-10-02T20:15:29Z)
Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We find a general, widely existing but actually-neglected phenomenon that most confidence estimation methods are harmful for detecting misclassification errors. We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z)
Towards Calibrated Deep Clustering Network [60.71776081164377]
In deep clustering, the estimated confidence for a sample belonging to a particular cluster greatly exceeds its actual prediction accuracy.<n>We propose a novel dual head (calibration head and clustering head) deep clustering model that can effectively calibrate the estimated confidence and the actual accuracy.<n>The proposed calibrated deep clustering model not only surpasses the state-of-the-art deep clustering methods by 5x on average in terms of expected calibration error, but also significantly outperforms them in terms of clustering accuracy.
arXiv Detail & Related papers (2024-03-04T11:23:40Z)
Beyond calibration: estimating the grouping loss of modern neural networks [68.8204255655161]
Proper scoring rule theory shows that given the calibration loss, the missing piece to characterize individual errors is the grouping loss. We show that modern neural network architectures in vision and NLP exhibit grouping loss, notably in distribution shifts settings.
arXiv Detail & Related papers (2022-10-28T07:04:20Z)
Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature. We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.