Enhancing Certified Robustness via Smoothed Weighted Ensembling
- URL: http://arxiv.org/abs/2005.09363v3
- Date: Tue, 23 Feb 2021 14:03:58 GMT
- Title: Enhancing Certified Robustness via Smoothed Weighted Ensembling
- Authors: Chizhou Liu, Yunzhen Feng, Ranran Wang, Bin Dong
- Abstract summary: We employ a Smoothed WEighted ENsembling (SWEEN) scheme to improve the performance of randomized smoothed classifiers.
We show a general result on ensembling: SWEEN can help achieve optimal certified robustness.
We also develop an adaptive prediction algorithm to reduce the prediction and certification cost of SWEEN models.
- Score: 7.217295098686032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomized smoothing has achieved state-of-the-art certified robustness
against $l_2$-norm adversarial attacks. However, how to find the optimal base
classifier for randomized smoothing remains an open question. In this work,
we employ a Smoothed WEighted ENsembling (SWEEN) scheme to improve the
performance of randomized smoothed classifiers. We show a general result on
ensembling: SWEEN can help achieve optimal certified robustness.
Furthermore, theoretical analysis proves that the optimal SWEEN model can be
obtained from training under mild assumptions. We also develop an adaptive
prediction algorithm to reduce the prediction and certification cost of SWEEN
models. Extensive experiments show that SWEEN models outperform the upper
envelope of their corresponding candidate models by a large margin. Moreover,
SWEEN models constructed using a few small models can achieve comparable
performance to a single large model with a notable reduction in training time.
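As a rough illustration of the scheme described in the abstract, the sketch below wraps a convex combination of candidate classifiers as the base classifier and evaluates it under Gaussian noise, in the style of standard randomized smoothing. This is a minimal sketch with assumed names (`WeightedEnsemble`, `smoothed_predict`, `sigma`, `n_samples`); it omits the paper's training objective, the adaptive prediction algorithm, and the statistical certification bounds, showing only the prediction path.

```python
# Minimal sketch (not the authors' code): a weighted ensemble as the base
# classifier for randomized smoothing. Names and hyperparameters are illustrative.
import torch
import torch.nn.functional as F


class WeightedEnsemble(torch.nn.Module):
    """Convex combination of candidate classifiers' softmax outputs."""

    def __init__(self, models, num_classes):
        super().__init__()
        self.models = torch.nn.ModuleList(models)
        self.num_classes = num_classes
        # Learnable ensemble weights, kept on the probability simplex via softmax.
        self.w_logits = torch.nn.Parameter(torch.zeros(len(models)))

    def forward(self, x):
        w = F.softmax(self.w_logits, dim=0)
        probs = torch.stack([F.softmax(m(x), dim=1) for m in self.models])  # (K, B, C)
        return torch.einsum("k,kbc->bc", w, probs)


@torch.no_grad()
def smoothed_predict(ensemble, x, sigma=0.25, n_samples=100):
    """Monte Carlo estimate of the smoothed classifier's prediction at x."""
    counts = torch.zeros(x.shape[0], ensemble.num_classes,
                         dtype=torch.long, device=x.device)
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)
        preds = ensemble(noisy).argmax(dim=1)
        counts += F.one_hot(preds, ensemble.num_classes)
    return counts.argmax(dim=1)  # class with the highest vote count under noise
```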
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate model to enhance the robustness of LLMs, our method offers significantly better efficiency and flexibility.
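A hedged sketch of the denoise-then-predict idea follows; the `query_llm` callable, the masking scheme, and the prompts are illustrative assumptions, not the paper's exact setup.

```python
# Illustrative sketch (assumed prompts and helper names): perturb the input, let the
# model itself reconstruct it, then classify the reconstruction and take a majority vote.
import random
from collections import Counter
from typing import Callable, List


def mask_words(text: str, rate: float = 0.3) -> str:
    """Randomly mask a fraction of the words to simulate the smoothing noise."""
    return " ".join("[MASK]" if random.random() < rate else w for w in text.split())


def self_denoised_predict(text: str, labels: List[str],
                          query_llm: Callable[[str], str], n_votes: int = 5) -> str:
    votes = Counter()
    for _ in range(n_votes):
        noisy = mask_words(text)
        denoised = query_llm(f"Fill in the masked words:\n{noisy}")
        answer = query_llm(f"Answer with one of {labels} only.\nText: {denoised}")
        votes[answer.strip()] += 1
    return votes.most_common(1)[0][0]  # majority vote over denoised copies
```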
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
- Distributionally Robust Post-hoc Classifiers under Prior Shifts [31.237674771958165]
We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors.
We present an extremely lightweight post-hoc approach that performs scaling adjustments to predictions from a pre-trained model.
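The basic scaling idea can be sketched in a few lines. The function below only shows Bayes-rule reweighting under known target priors, with hypothetical names; the paper's method instead optimizes the adjustment for robustness to a family of prior shifts.

```python
# Minimal sketch (not the paper's method): post-hoc rescaling of a pre-trained
# model's probabilities when class priors shift from training to deployment.
import numpy as np


def adjust_for_prior_shift(probs: np.ndarray,
                           train_priors: np.ndarray,
                           target_priors: np.ndarray) -> np.ndarray:
    """probs: (n, C) softmax outputs; priors: length-C class frequencies."""
    scaled = probs * (target_priors / train_priors)    # Bayes-rule reweighting
    return scaled / scaled.sum(axis=1, keepdims=True)  # renormalize each row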
arXiv Detail & Related papers (2023-09-16T00:54:57Z)
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time [69.7693300927423]
We show that averaging the weights of multiple models fine-tuned with different hyperparameter configurations improves accuracy and robustness.
We show that the model soup approach extends to multiple image classification and natural language processing tasks.
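The weight-averaging step itself is simple enough to sketch. The helper below builds a uniform average of checkpoints that share an architecture; names are illustrative, and the paper's greedy variant, which adds a checkpoint only when held-out accuracy improves, is not shown.

```python
# Minimal sketch (illustrative names): average the parameters of several fine-tuned
# checkpoints of the same architecture into a single "soup" state dict.
import torch


def uniform_soup(state_dicts):
    soup = {}
    for key, ref in state_dicts[0].items():
        if ref.is_floating_point():
            soup[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
        else:
            soup[key] = ref.clone()  # e.g. integer buffers such as BatchNorm counters
    return soup


# Usage: model.load_state_dict(uniform_soup([torch.load(p) for p in checkpoint_paths]))
```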
arXiv Detail & Related papers (2022-03-10T17:03:49Z)
- Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E$3$), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
arXiv Detail & Related papers (2021-10-07T11:58:35Z)
- On the Certified Robustness for Ensemble Models and Beyond [22.43134152931209]
Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted to mislead them.
We analyze and establish certified robustness guarantees for ensemble ML models.
Inspired by the theoretical findings, we propose the lightweight Diversity Regularized Training (DRT) to train certifiably robust ensemble ML models.
arXiv Detail & Related papers (2021-07-22T18:10:41Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Insta-RS: Instance-wise Randomized Smoothing for Improved Robustness and Accuracy [9.50143683501477]
Insta-RS is a multiple-start search algorithm that assigns customized Gaussian variances to test examples.
Insta-RS Train is a novel two-stage training algorithm that adaptively adjusts and customizes the noise level of each training example.
We show that our method significantly enhances the average certified radius (ACR) as well as the clean data accuracy.
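The per-example variance idea can be roughly illustrated with a simple grid search that keeps the noise level giving the largest estimated certified radius. This stand-in uses assumed names and a fixed grid instead of the paper's multiple-start search, so it only conveys the flavor of instance-wise smoothing.

```python
# Rough stand-in (not Insta-RS itself): pick, per test example, the sigma from a small
# grid that maximizes a crude certified-radius estimate sigma * Phi^{-1}(p_top).
import torch
from scipy.stats import norm


@torch.no_grad()
def pick_sigma(model, x, sigmas=(0.12, 0.25, 0.5, 1.0), n_samples=64):
    best_sigma, best_radius = sigmas[0], float("-inf")
    for sigma in sigmas:
        noisy = x.unsqueeze(0) + sigma * torch.randn(n_samples, *x.shape)
        top_freq = model(noisy).argmax(dim=1).bincount().max().item() / n_samples
        radius = sigma * norm.ppf(min(top_freq, 1 - 1e-6))  # crude radius estimate
        if radius > best_radius:
            best_sigma, best_radius = sigma, radius
    return best_sigma
```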
arXiv Detail & Related papers (2021-03-07T19:46:07Z)
- Dynamic Model Pruning with Feedback [64.019079257231]
We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
arXiv Detail & Related papers (2020-06-12T15:07:08Z)
- Joint Stochastic Approximation and Its Application to Learning Discrete Latent Variable Models [19.07718284287928]
We show that the difficulty of obtaining reliable gradients for the inference model and the drawback of indirectly optimizing the target log-likelihood can be gracefully addressed.
We propose to directly maximize the target log-likelihood and simultaneously minimize the inclusive divergence between the posterior and the inference model.
The resulting learning algorithm is called joint SA (JSA).
arXiv Detail & Related papers (2020-05-28T13:50:08Z)