Average Certified Radius is a Poor Metric for Randomized Smoothing
- URL: http://arxiv.org/abs/2410.06895v2
- Date: Fri, 31 Jan 2025 13:07:32 GMT
- Title: Average Certified Radius is a Poor Metric for Randomized Smoothing
- Authors: Chenhao Sun, Yuhao Mao, Mark Niklas Müller, Martin Vechev
- Abstract summary: We show that the average certified radius (ACR) is a poor metric for evaluating robustness guarantees provided by randomized smoothing.
We propose strategies, including explicitly discarding hard samples, reweighting the dataset with approximate certified radius, and extreme optimization for easy samples, that achieve state-of-the-art ACR without training for robustness on the full data distribution.
- Score: 7.960121888896864
- Abstract: Randomized smoothing is a popular approach for providing certified robustness guarantees against adversarial attacks, and has become an active area of research. Over the past years, the average certified radius (ACR) has emerged as the most important metric for comparing methods and tracking progress in the field. However, in this work, for the first time we show that ACR is a poor metric for evaluating robustness guarantees provided by randomized smoothing. We theoretically prove not only that a trivial classifier can have arbitrarily large ACR, but also that ACR is much more sensitive to improvements on easy samples than on hard ones. Empirically, we confirm that existing training strategies, though improving ACR with different approaches, reduce the model's robustness on hard samples consistently. To strengthen our conclusion, we propose strategies, including explicitly discarding hard samples, reweighting the dataset with approximate certified radius, and extreme optimization for easy samples, to achieve state-of-the-art ACR, without training for robustness on the full data distribution. Overall, our results suggest that ACR has introduced a strong undesired bias to the field, and its application should be discontinued when evaluating randomized smoothing.
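The bias the abstract describes is easiest to see from how ACR is computed: the mean, over the test set, of each sample's certified radius, with a radius of 0 for misclassified or abstained samples. Below is a minimal sketch, assuming the standard Cohen et al. radius R = sigma * Phi^{-1}(p_A) and purely illustrative numbers rather than results from the paper, showing how a single easy sample can dominate the metric.

```python
# Illustrative sketch only: ACR = mean certified radius over a test set,
# with radius 0 for samples that are misclassified or abstained on.
# Radii use the standard Cohen et al. formula R = sigma * Phi^{-1}(p_A);
# the p_A values below are made up to illustrate the easy-sample bias.
from statistics import NormalDist

def certified_radius(p_a: float, sigma: float) -> float:
    """L2 certified radius given a lower bound p_a on the top-class probability."""
    if p_a <= 0.5:
        return 0.0  # abstain / misclassified: contributes nothing to ACR
    return sigma * NormalDist().inv_cdf(p_a)

sigma = 0.5
# Nine "hard" samples barely certified, one "easy" sample certified with p_A close to 1.
p_lower = [0.55] * 9 + [0.9999]
radii = [certified_radius(p, sigma) for p in p_lower]

acr = sum(radii) / len(radii)
print(f"ACR = {acr:.3f}")
print(f"fraction of ACR from the single easy sample: {radii[-1] / sum(radii):.0%}")
```

With these toy numbers the single easy sample contributes roughly three quarters of the ACR, so the metric barely registers whether the nine hard samples are certified at all; this is the sensitivity to easy samples that the paper formalizes.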
Related papers
- Controllable RANSAC-based Anomaly Detection via Hypothesis Testing [7.10052009802944]
We propose a novel statistical method for testing the anomaly detection results obtained by RANSAC (controllable RANSAC).
The key strength of the proposed method lies in its ability to control the probability of misidentifying anomalies below a pre-specified level.
Experiments conducted on synthetic and real-world datasets robustly support our theoretical results.
arXiv Detail & Related papers (2024-10-19T15:15:41Z) - The Vital Role of Gradient Clipping in Byzantine-Resilient Distributed Learning [8.268485501864939]
Byzantine-resilient distributed machine learning seeks to achieve robust learning performance in the presence of misbehaving or adversarial workers.
While state-of-the-art (SOTA) robust distributed gradient descent (DGD) methods were proven theoretically optimal, their empirical success has often relied on pre-aggregation gradient clipping.
We show that static clipping improves robustness against some attacks while being ineffective or even detrimental against others, and propose a principled adaptive clipping strategy, termed Adaptive Robust Clipping (ARC), to address this.
arXiv Detail & Related papers (2024-05-23T11:00:31Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unstable predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z) - Input-Specific Robustness Certification for Randomized Smoothing [76.76115360719837]
We propose Input-Specific Sampling (ISS) acceleration to improve the cost-effectiveness of robustness certification (the underlying certification procedure is sketched after this list).
ISS can speed up certification by more than three times at a limited cost of 0.05 in certified radius.
arXiv Detail & Related papers (2021-12-21T12:16:03Z) - Generalized Real-World Super-Resolution through Adversarial Robustness [107.02188934602802]
We present Robust Super-Resolution, a method that leverages the generalization capability of adversarial attacks to tackle real-world SR.
Our novel framework poses a paradigm shift in the development of real-world SR methods.
By using a single robust model, we outperform state-of-the-art specialized methods on real-world benchmarks.
arXiv Detail & Related papers (2021-08-25T22:43:20Z) - Boosting Randomized Smoothing with Variance Reduced Classifiers [4.110108749051657]
We motivate why ensembles are a particularly suitable choice as base models for Randomized Smoothing (RS).
We empirically confirm this choice, obtaining state-of-the-art results in multiple settings.
arXiv Detail & Related papers (2021-06-13T08:40:27Z) - Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z) - Data Dependent Randomized Smoothing [127.34833801660233]
We show that our data dependent framework can be seamlessly incorporated into 3 randomized smoothing approaches.
We get 9% and 6% improvements over the certified accuracy of the strongest baseline at a radius of 0.5 on CIFAR10 and ImageNet, respectively.
arXiv Detail & Related papers (2020-12-08T10:53:11Z)
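Several of the randomized-smoothing entries above (Lipschitz-Variance-Margin, Input-Specific Robustness Certification, Data Dependent Randomized Smoothing) build on the same Monte Carlo certification procedure as the main paper. The following is a simplified, single-stage sketch under assumed defaults, with `base_classifier` as a hypothetical callable; the standard procedure additionally uses a separate selection sample to avoid selection bias, so treat this as an outline rather than any paper's implementation.

```python
# Simplified single-stage sketch of randomized-smoothing certification
# (assumptions for illustration, not code from any of the papers above).
# `base_classifier(x) -> int` is a hypothetical callable returning a class id.
import numpy as np
from statistics import NormalDist
from scipy.stats import binomtest

def certify(base_classifier, x, sigma=0.5, n=1000, alpha=0.001, num_classes=10):
    """Return (predicted class, certified L2 radius) or (None, 0.0) on abstain."""
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n):
        noisy = x + sigma * np.random.randn(*x.shape)   # Gaussian smoothing noise
        counts[base_classifier(noisy)] += 1

    top = int(counts.argmax())
    # Conservative Clopper-Pearson lower bound on the top-class probability.
    p_lower = binomtest(int(counts[top]), n).proportion_ci(
        confidence_level=1 - alpha, method="exact"
    ).low
    if p_lower <= 0.5:
        return None, 0.0                                 # abstain: no certificate
    return top, sigma * NormalDist().inv_cdf(p_lower)    # R = sigma * Phi^{-1}(p_A)
```

Roughly, input-specific sampling adapts `n` per input to cut certification cost, while data-dependent smoothing adapts `sigma` per input; both can be read as variations on this template.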
This list is automatically generated from the titles and abstracts of the papers in this site.