Randomized Smoothing Meets Vision-Language Models
- URL: http://arxiv.org/abs/2509.16088v1
- Date: Fri, 19 Sep 2025 15:33:22 GMT
- Title: Randomized Smoothing Meets Vision-Language Models
- Authors: Emmanouil Seferis, Changshun Wu, Stefanos Kollias, Saddek Bensalem, Chih-Hong Cheng
- Abstract summary: Randomized smoothing (RS) is used to ensure correctness of machine learning models. We show that RS can still be enabled for generative models. We derive improved scaling laws analytically relating the certified radius and accuracy to the number of samples. These advances make robustness certification both well-defined and computationally feasible for state-of-the-art VLMs.
- Score: 6.224335082856828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomized smoothing (RS) is one of the prominent techniques to ensure the correctness of machine learning models, where point-wise robustness certificates can be derived analytically. While RS is well understood for classification, its application to generative models is unclear, since their outputs are sequences rather than labels. We resolve this by connecting generative outputs to an oracle classification task and showing that RS can still be enabled: the final response can be classified as a discrete action (e.g., service-robot commands in VLAs), as harmful vs. harmless (content moderation or toxicity detection in VLMs), or even applying oracles to cluster answers into semantically equivalent ones. Provided that the error rate for the oracle classifier comparison is bounded, we develop the theory that associates the number of samples with the corresponding robustness radius. We further derive improved scaling laws analytically relating the certified radius and accuracy to the number of samples, showing that the earlier result of 2 to 3 orders of magnitude fewer samples sufficing with minimal loss remains valid even under weaker assumptions. Together, these advances make robustness certification both well-defined and computationally feasible for state-of-the-art VLMs, as validated against recent jailbreak-style adversarial attacks.
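The certification recipe the abstract describes (sample noisy inputs, map each generated response to a discrete label via an oracle classifier, take a majority vote, and convert a lower confidence bound on the top-class probability into a certified radius) can be sketched as follows. This is a minimal illustration in the style of standard Gaussian randomized smoothing, not the paper's implementation; the names `generate` and `oracle` are placeholders for a generative model and a response classifier, and a Hoeffding bound is used for the confidence interval as a simple stand-in for the tighter Clopper-Pearson bound commonly used in practice.

```python
# Minimal sketch of randomized-smoothing certification for a generative model,
# assuming an oracle that maps each response to a discrete label
# (e.g. "harmful" vs. "harmless"). `generate` and `oracle` are illustrative.
import math
import random
from collections import Counter
from statistics import NormalDist

def certify(x, generate, oracle, sigma=0.25, n=1000, alpha=0.001):
    """Majority-vote smoothing certificate (Cohen et al. 2019 style).

    x        -- input as a list of floats
    generate -- model: perturbed input -> response string
    oracle   -- classifier: response string -> discrete label
    Returns (label, certified_radius), or (None, 0.0) when abstaining.
    """
    counts = Counter()
    for _ in range(n):
        noisy = [xi + random.gauss(0.0, sigma) for xi in x]
        counts[oracle(generate(noisy))] += 1
    label, n_top = counts.most_common(1)[0]
    # Hoeffding lower confidence bound on the top-class probability;
    # Clopper-Pearson gives a tighter bound but needs the beta distribution.
    p_lower = n_top / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # abstain: majority not statistically significant
    # Certified L2 radius: sigma * inverse standard-normal CDF of p_lower.
    return label, sigma * NormalDist().inv_cdf(p_lower)
```

The sample count `n` enters the radius only through the confidence bound on the vote, which is where the abstract's improved scaling laws bite: a tighter relation between `n` and `p_lower` lets far fewer samples achieve nearly the same radius.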
Related papers
- ProtoDCS: Towards Robust and Efficient Open-Set Test-Time Adaptation for Vision-Language Models [32.840734752367275]
Prototype-based Double-Check Separation (ProtoDCS) is a robust framework for OSTTA. It separates csID and csOOD samples, enabling safe and efficient adaptation of Vision-Language Models to csID data. ProtoDCS significantly boosts both known-class accuracy and OOD detection metrics.
arXiv Detail & Related papers (2026-02-27T03:39:02Z)
- Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning [53.9713678229744]
Multi-instance partial-label learning (MIPL) is a weakly supervised framework that addresses the challenges of inexact supervision in both instance and label spaces. Existing MIPL approaches often suffer from poor calibration, undermining reliability. We propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance.
arXiv Detail & Related papers (2025-12-19T16:58:31Z)
- Sharpness-aware Dynamic Anchor Selection for Generalized Category Discovery [61.694524826522205]
Given some labeled data of known classes, GCD aims to cluster unlabeled data that contain both known and unknown classes. Large pre-trained models have a preference for some specific visual patterns, resulting in encoding spurious correlations for unlabeled data. We propose a novel method, which contains two modules: Loss Sharpness Penalty (LSP) and Dynamic Anchor Selection (DAS).
arXiv Detail & Related papers (2025-12-15T02:24:06Z)
- What Does It Take to Build a Performant Selective Classifier? [30.90225954725644]
Bayes noise, approximation error, ranking error, statistical noise, and implementation- or shift-induced slack are studied. We validate our decomposition on synthetic two-moons data and on real-world vision and language benchmarks. Our results confirm that (i) Bayes noise and limited model capacity can account for substantial gaps, (ii) only richer, feature-aware calibrators meaningfully improve score ordering, and (iii) data shift introduces a separate slack that demands distributionally robust training.
arXiv Detail & Related papers (2025-10-23T05:48:40Z)
- FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning [9.960675988638805]
We propose a novel framework called fake audio detection with evidential learning (FADEL). FADEL incorporates model uncertainty into its predictions, thereby leading to more robust performance in OOD scenarios. We demonstrate the validity of uncertainty estimation by analyzing a strong correlation between average uncertainty and equal error rate (EER) across different spoofing algorithms.
arXiv Detail & Related papers (2025-04-22T07:40:35Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unsteady predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z)
- Threshold-Consistent Margin Loss for Open-World Deep Metric Learning [42.03620337000911]
Existing losses used in deep metric learning (DML) for image retrieval often lead to highly non-uniform intra-class and inter-class representation structures.
Inconsistency often complicates the threshold selection process when deploying commercial image retrieval systems.
We propose a novel variance-based metric called Operating-Point-Inconsistency-Score (OPIS) that quantifies the variance in the operating characteristics across classes.
arXiv Detail & Related papers (2023-07-08T21:16:41Z)
- RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion [23.309600117618025]
We adapt randomized smoothing for discrete sequence classifiers to provide certified robustness against edit distance-bounded adversaries.
Our proof of certification deviates from the established Neyman-Pearson approach, which is intractable in our setting, and is instead organized around longest common subsequences.
When applied to the popular MalConv malware detection model, our smoothing mechanism RS-Del achieves a certified accuracy of 91% at an edit distance radius of 128 bytes.
arXiv Detail & Related papers (2023-01-31T01:40:26Z)
- Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z)
- Robust Neural Network Classification via Double Regularization [2.41710192205034]
We propose a novel double regularization of the neural network training loss that combines a penalty on the complexity of the classification model and an optimal reweighting of training observations.
We demonstrate DRFit, for neural net classification of (i) MNIST and (ii) CIFAR-10, in both cases with simulated mislabeling.
arXiv Detail & Related papers (2021-12-15T13:19:20Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.