Towards Calibrated Losses for Adversarial Robust Reject Option Classification
- URL: http://arxiv.org/abs/2410.10736v1
- Date: Mon, 14 Oct 2024 17:17:04 GMT
- Title: Towards Calibrated Losses for Adversarial Robust Reject Option Classification
- Authors: Vrund Shah, Tejas Chaudhari, Naresh Manwani
- Abstract summary: This paper aims to characterize and design surrogates calibrated in the "Adversarial Robust Reject Option" setting.
We provide a complete characterization result for any surrogate to be $(\ell_{d}^{\gamma},\mathcal{H}_{\textrm{lin}})$-calibrated.
- Score: 3.263508275825069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness to adversarial attacks is a vital property for classifiers in applications such as autonomous driving and medical diagnosis. In such scenarios, where the cost of misclassification is very high, knowing when to abstain from prediction becomes crucial. A natural question is: which surrogates can be used to ensure learning in scenarios where the input points are adversarially perturbed and the classifier can abstain from prediction? This paper aims to characterize and design surrogates calibrated in the "Adversarial Robust Reject Option" setting. First, we propose an adversarial robust reject option loss $\ell_{d}^{\gamma}$ and analyze it for the hypothesis set of linear classifiers ($\mathcal{H}_{\textrm{lin}}$). Next, we provide a complete characterization result for any surrogate to be $(\ell_{d}^{\gamma},\mathcal{H}_{\textrm{lin}})$-calibrated. To demonstrate the difficulty of designing surrogates calibrated to $\ell_{d}^{\gamma}$, we show negative calibration results for convex surrogates and for quasi-concave conditional risk cases (both of which yield positive calibration in the adversarial setting without the reject option). We also empirically argue that the Shifted Double Ramp Loss (DRL) and Shifted Double Sigmoid Loss (DSL) satisfy the calibration conditions. Finally, we demonstrate the robustness of shifted DRL and shifted DSL against adversarial perturbations on a synthetically generated dataset.
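To make the abstract's quantities concrete, here is a minimal NumPy sketch of an adversarial robust reject option loss and of shifted double-ramp / double-sigmoid surrogates for a linear scorer. It is illustrative only: the rejection half-width `rho`, the $\ell_2$ budget `gamma`, the ramp width `mu`, the sigmoid sharpness `beta`, and all function names are assumptions, not the paper's exact definitions of $\ell_{d}^{\gamma}$, shifted DRL, or shifted DSL.

```python
import numpy as np

def robust_reject_loss(w, x, y, d=0.2, gamma=0.1, rho=1.0):
    """One plausible adversarial robust reject option loss for f(x) = w @ x.

    Assumed semantics: an l2 adversary with budget gamma can shift the score
    by at most gamma * ||w||_2; a forced confident mistake costs 1, being
    pushable into the rejection band |score| <= rho costs d, otherwise 0.
    """
    score = float(w @ x)
    slack = gamma * np.linalg.norm(w)      # maximum score shift under the budget
    if y * score - slack <= -rho:          # adversary forces a confident mistake
        return 1.0
    if abs(score) - slack <= rho:          # adversary can force an abstention
        return d
    return 0.0

def ramp(t, mu=0.5):
    """Clipped hinge: 1 for t <= 0, 0 for t >= mu, linear in between."""
    return float(np.clip(1.0 - t / mu, 0.0, 1.0))

def shifted_double_ramp_loss(margin, d=0.2, rho=1.0, shift=0.1, mu=0.5):
    """Generic shifted double ramp surrogate (illustrative parameterization).

    One ramp sits at the misclassification step (-rho), one at the rejection
    step (+rho); 'shift' translates both by the adversarial slack so the
    surrogate targets the worst-case (shifted) margin rather than the clean one.
    """
    z = margin - shift
    return d * ramp(z - rho, mu) + (1.0 - d) * ramp(z + rho, mu)

def shifted_double_sigmoid_loss(margin, d=0.2, rho=1.0, shift=0.1, beta=5.0):
    """Smooth analogue: two decreasing sigmoids replace the two ramps."""
    z = margin - shift
    sigmoid = lambda t: 1.0 / (1.0 + np.exp(beta * t))
    return d * sigmoid(z - rho) + (1.0 - d) * sigmoid(z + rho)

# Example: both surrogates trace the 1 -> d -> 0 staircase of the target loss.
for m in (-2.0, 0.0, 2.0):
    print(m, shifted_double_ramp_loss(m), shifted_double_sigmoid_loss(m))
```

With the default parameters, the printed values are approximately 1, d, and 0, mirroring the staircase shape of the reject option loss; only the placement of the steps (set by `rho` and `shift`) encodes the adversarial budget.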
Related papers
- Decision from Suboptimal Classifiers: Excess Risk Pre- and Post-Calibration [52.70324949884702]
We quantify the excess risk incurred using approximate posterior probabilities in batch binary decision-making.
We identify regimes where recalibration alone addresses most of the regret, and regimes where the regret is dominated by the grouping loss.
On NLP experiments, we show that these quantities identify when the expected gain of more advanced post-training is worth the operational cost.
arXiv Detail & Related papers (2025-03-23T10:52:36Z) - Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians [60.22542847840578]
Despite advances in adversarial machine learning, inference for Gaussian models in the presence of an adversary is notably understudied.
We consider a self-interested attacker who wishes to disrupt a decision-maker's conditional inference and subsequent actions by corrupting a set of evidentiary variables.
To avoid detection, the attacker also desires the attack to appear plausible wherein plausibility is determined by the density of the corrupted evidence.
arXiv Detail & Related papers (2024-11-21T17:46:55Z) - Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization [60.176008034221404]
Direct Preference Optimization (DPO) and its variants are increasingly used for aligning language models with human preferences.
Prior work has observed that the likelihood of preferred responses often decreases during training.
We demonstrate that likelihood displacement can be catastrophic, shifting probability mass from preferred responses to responses with an opposite meaning.
arXiv Detail & Related papers (2024-10-11T14:22:44Z) - Orthogonal Causal Calibration [55.28164682911196]
We develop general algorithms for reducing the task of causal calibration to that of calibrating a standard (non-causal) predictive model.
Our results are exceedingly general, showing that essentially any existing calibration algorithm can be used in causal settings.
arXiv Detail & Related papers (2024-06-04T03:35:25Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Minimum-Risk Recalibration of Classifiers [9.31067660373791]
We introduce the concept of minimum-risk recalibration within the framework of mean-squared-error decomposition.
We show that transferring a calibrated classifier requires significantly fewer target samples compared to recalibrating from scratch.
arXiv Detail & Related papers (2023-05-18T11:27:02Z) - Decision-Making under Miscalibration [14.762226638396209]
ML-based predictions are used to inform consequential decisions about individuals.
We formalize a natural (distribution-free) solution concept: given anticipated miscalibration of $\alpha$, we propose using the threshold $j$ that minimizes the worst-case regret.
We provide closed form expressions for $j$ when miscalibration is measured using both expected and maximum calibration error.
We validate our theoretical findings on real data, demonstrating that there are natural cases in which making decisions using $j$ improves the clinical utility.
arXiv Detail & Related papers (2022-03-18T10:44:11Z) - Calibration and Consistency of Adversarial Surrogate Losses [46.04004505351902]
Adversarial robustness is an increasingly critical property of classifiers in applications.
But which surrogate losses should be used and when do they benefit from theoretical guarantees?
We present an extensive study of this question, including a detailed analysis of the H-calibration and H-consistency of adversarial surrogate losses.
arXiv Detail & Related papers (2021-04-19T21:58:52Z) - Robust Classification Under $\ell_0$ Attack for the Gaussian Mixture Model [39.414875342234204]
We develop a novel classification algorithm called FilTrun that has two main modules: Filtration and Truncation.
We discuss several examples that illustrate interesting behaviors, such as a phase transition in the adversary's budget that determines whether the effect of adversarial perturbation can be fully neutralized.
arXiv Detail & Related papers (2021-04-05T23:31:25Z) - Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations [78.23408201652984]
Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches.
Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input.
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
arXiv Detail & Related papers (2020-11-15T21:34:44Z) - Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z) - Calibrated Surrogate Losses for Adversarially Robust Classification [92.37268323142307]
We show that no convex surrogate loss is calibrated with respect to the adversarial 0-1 loss when restricted to linear models.
We also show that if the underlying distribution satisfies Massart's noise condition, convex losses can also be calibrated in the adversarial setting.
arXiv Detail & Related papers (2020-05-28T02:40:42Z)
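The negative result above, like the analysis of $\ell_{d}^{\gamma}$ in the main paper, hinges on the fact that for linear models the adversarial 0-1 loss collapses to a margin condition. Below is a minimal sketch of that reduction, assuming an $\ell_2$ perturbation budget $\gamma$ and a bias-free linear scorer; it is illustrative and not code from the cited paper.

```python
import numpy as np

def adversarial_zero_one_loss(w, x, y, gamma=0.1):
    # For f(x) = w @ x and an l2 budget ||delta||_2 <= gamma, the score can be
    # shifted by at most gamma * ||w||_2, so
    #   sup_{||delta||_2 <= gamma} 1{y * f(x + delta) <= 0}
    #     = 1{y * (w @ x) <= gamma * ||w||_2}.
    # The adversarial 0-1 loss is therefore a margin condition on the clean point.
    return float(y * (w @ x) <= gamma * np.linalg.norm(w))
```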