Average Certified Radius is a Poor Metric for Randomized Smoothing
- URL: http://arxiv.org/abs/2410.06895v1
- Date: Wed, 9 Oct 2024 13:58:41 GMT
- Title: Average Certified Radius is a Poor Metric for Randomized Smoothing
- Authors: Chenhao Sun, Yuhao Mao, Mark Niklas Müller, Martin Vechev
- Abstract summary: We show that the average certified radius (ACR) is an exceptionally poor metric for evaluating robustness guarantees provided by randomized smoothing.
We show that ACR is much more sensitive to improvements on easy samples than on hard ones.
- Score: 7.960121888896864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Randomized smoothing is a popular approach for providing certified robustness guarantees against adversarial attacks, and has become a very active area of research. Over the past years, the average certified radius (ACR) has emerged as the single most important metric for comparing methods and tracking progress in the field. However, in this work, we show that ACR is an exceptionally poor metric for evaluating robustness guarantees provided by randomized smoothing. We theoretically show not only that a trivial classifier can have arbitrarily large ACR, but also that ACR is much more sensitive to improvements on easy samples than on hard ones. Empirically, we confirm that existing training strategies that improve ACR reduce the model's robustness on hard samples. Further, we show that by focusing on easy samples, we can effectively replicate the increase in ACR. We develop strategies, including explicitly discarding hard samples, reweighing the dataset with certified radius, and extreme optimization for easy samples, to achieve state-of-the-art ACR, although these strategies ignore robustness for the general data distribution. Overall, our results suggest that ACR has introduced a strong undesired bias to the field, and better metrics are required to holistically evaluate randomized smoothing.
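The sensitivity the abstract describes can be seen directly in the standard (Cohen et al., 2019) certified-radius formula, R = σ·Φ⁻¹(p_A), where p_A is the smoothed classifier's top-class probability: because Φ⁻¹ grows steeply near 1, the same absolute confidence gain on an already-easy sample adds far more ACR than on a hard one. Below is a minimal sketch, not the authors' code; the probability values are hypothetical.

```python
from statistics import NormalDist

def certified_radius(p_a: float, sigma: float) -> float:
    # Cohen et al. (2019) robustness radius: R = sigma * Phi^{-1}(p_A).
    # A certificate exists only when p_A > 0.5; otherwise radius 0.
    if p_a <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_a)

def acr(top_class_probs, sigma=0.5):
    # ACR averages the certified radius over all samples,
    # counting uncertified samples as radius 0.
    return sum(certified_radius(p, sigma) for p in top_class_probs) / len(top_class_probs)

# Hypothetical test set: two easy samples, two hard ones.
base = [0.95, 0.95, 0.55, 0.55]
# Same absolute gain (+0.049) applied to easy vs. hard samples:
easy_gain = [0.999, 0.999, 0.55, 0.55]
hard_gain = [0.95, 0.95, 0.599, 0.599]

# Improving the easy samples inflates ACR far more than the
# identical improvement on the hard samples.
assert acr(easy_gain) - acr(base) > acr(hard_gain) - acr(base)
```

Here the easy-sample gain raises ACR by roughly an order of magnitude more than the hard-sample gain, illustrating why optimizing ACR can ignore robustness on hard samples.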
Related papers
- Controllable RANSAC-based Anomaly Detection via Hypothesis Testing [7.10052009802944]
We propose a novel statistical method for testing the anomaly detection results obtained by RANSAC (controllable RANSAC).
The key strength of the proposed method lies in its ability to control the probability of misidentifying anomalies below a pre-specified level.
Experiments conducted on synthetic and real-world datasets robustly support our theoretical results.
arXiv Detail & Related papers (2024-10-19T15:15:41Z) - The Vital Role of Gradient Clipping in Byzantine-Resilient Distributed Learning [8.268485501864939]
Byzantine-resilient distributed machine learning seeks to achieve robust learning performance in the presence of misbehaving or adversarial workers.
While state-of-the-art (SOTA) robust distributed gradient descent (DGD) methods were proven theoretically optimal, their empirical success has often relied on pre-aggregation gradient clipping.
We propose a principled adaptive clipping strategy, termed Adaptive Robust Clipping (ARC), since static clipping improves robustness against some attacks while being ineffective or detrimental against others.
arXiv Detail & Related papers (2024-05-23T11:00:31Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization [89.92932924515324]
Robust generalization aims to tackle the most challenging data distributions which are rare in the training set and contain severe noises.
Common solutions such as distributionally robust optimization (DRO) focus on the worst-case empirical risk to ensure low training error.
We propose SharpDRO by penalizing the sharpness of the worst-case distribution, which measures the loss changes around the neighbor of learning parameters.
We show that SharpDRO exhibits a strong generalization ability against severe corruptions and exceeds well-known baseline methods with large performance gains.
arXiv Detail & Related papers (2023-03-23T07:58:48Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Input-Specific Robustness Certification for Randomized Smoothing [76.76115360719837]
We propose Input-Specific Sampling (ISS) acceleration to achieve the cost-effectiveness for robustness certification.
ISS can speed up the certification by more than three times at a limited cost of 0.05 certified radius.
arXiv Detail & Related papers (2021-12-21T12:16:03Z) - Generalized Real-World Super-Resolution through Adversarial Robustness [107.02188934602802]
We present Robust Super-Resolution, a method that leverages the generalization capability of adversarial attacks to tackle real-world SR.
Our novel framework poses a paradigm shift in the development of real-world SR methods.
By using a single robust model, we outperform state-of-the-art specialized methods on real-world benchmarks.
arXiv Detail & Related papers (2021-08-25T22:43:20Z) - Boosting Randomized Smoothing with Variance Reduced Classifiers [4.110108749051657]
We motivate why ensembles are a particularly suitable choice as base models for Randomized Smoothing (RS).
We empirically confirm this choice, obtaining state-of-the-art results in multiple settings.
arXiv Detail & Related papers (2021-06-13T08:40:27Z) - Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.