Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
- URL: http://arxiv.org/abs/2211.12624v1
- Date: Tue, 22 Nov 2022 23:12:00 GMT
- Title: Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
- Authors: Zifan Wang, Nan Ding, Tomer Levinboim, Xi Chen, Radu Soricut
- Abstract summary: Recent research has shown an overfitting-like phenomenon in which models trained against adversarial attacks exhibit higher robustness on the training set compared to the test set.
In this paper we consider a different form of the robust PAC-Bayesian bound and directly minimize it with respect to the model posterior.
We evaluate our TrH regularization approach over CIFAR-10/100 and ImageNet using Vision Transformers (ViT) and compare against baseline adversarial robustness algorithms.
- Score: 27.31806334022094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research in robust optimization has shown an overfitting-like
phenomenon in which models trained against adversarial attacks exhibit higher
robustness on the training set compared to the test set. Although previous work
provided theoretical explanations for this phenomenon using a robust
PAC-Bayesian bound over the adversarial test error, related algorithmic
derivations are at best only loosely connected to this bound, which implies
that there is still a gap between their empirical success and our understanding
of adversarial robustness theory. To close this gap, in this paper we consider
a different form of the robust PAC-Bayesian bound and directly minimize it with
respect to the model posterior. The derivation of the optimal solution connects
PAC-Bayesian learning to the geometry of the robust loss surface through a
Trace of Hessian (TrH) regularizer that measures the surface flatness. In
practice, we restrict the TrH regularizer to the top layer only, which results
in an analytical solution to the bound whose computational cost does not depend
on the network depth. Finally, we evaluate our TrH regularization approach over
CIFAR-10/100 and ImageNet using Vision Transformers (ViT) and compare against
baseline adversarial robustness algorithms. Experimental results show that TrH
regularization leads to improved ViT robustness that either matches or
surpasses previous state-of-the-art approaches while at the same time requiring
less memory and computational cost.
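As an illustration of the top-layer TrH idea described above, the sketch below adds a closed-form trace-of-Hessian penalty for a softmax classification head to an adversarial training loss. For cross-entropy with logits z = W h and p = softmax(z), the trace of the Hessian with respect to the top-layer weights W is ||h||^2 (1 - ||p||^2) per example. This is a minimal sketch only: `model.features`, `model.head`, and `trh_weight` are hypothetical placeholders, and the exact regularizer and its weighting derived from the PAC-Bayesian bound in the paper may differ.

```python
import torch.nn.functional as F

def top_layer_trh_penalty(features, logits):
    """Closed-form trace of the Hessian of the cross-entropy loss with
    respect to the final linear layer's weights (bias ignored).

    For logits z = W h and p = softmax(z), the per-example trace is
    ||h||^2 * (1 - ||p||^2); the batch mean is returned.
    """
    p = F.softmax(logits, dim=-1)
    trh = features.pow(2).sum(dim=-1) * (1.0 - p.pow(2).sum(dim=-1))
    return trh.mean()

def robust_trh_loss(model, x_adv, y, trh_weight=0.1):
    """Adversarial cross-entropy plus a top-layer TrH penalty.

    `model.features` (penultimate representation) and `model.head`
    (final linear layer) are a hypothetical interface; `x_adv` are
    adversarial examples produced by any attack (e.g. PGD).
    """
    h = model.features(x_adv)      # penultimate features
    logits = model.head(h)         # top-layer logits
    ce = F.cross_entropy(logits, y)
    return ce + trh_weight * top_layer_trh_penalty(h, logits)
```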
Related papers
- Conformal Risk Minimization with Variance Reduction [37.74931189657469]
Conformal prediction (CP) is a distribution-free framework for achieving probabilistic guarantees on black-box models.
Recent research efforts have focused on optimizing CP efficiency during training.
We formalize this concept as the problem of conformal risk minimization (a minimal split-conformal sketch follows this entry).
arXiv Detail & Related papers (2024-11-03T21:48:15Z)
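The entry above and the one that follows both concern conformal prediction (CP). As background, here is a minimal split-conformal sketch for classification; it illustrates plain CP only, not the variance-reduced training objective or the adversarial analysis of these papers, and the score choice and function interface are illustrative assumptions.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Minimal split conformal prediction for classification.

    cal_probs:  (n, K) predicted class probabilities on a held-out
                calibration set; cal_labels: (n,) true labels.
    test_probs: (m, K) probabilities for new inputs.
    Returns a boolean (m, K) matrix of prediction sets that cover the
    true label with probability >= 1 - alpha (marginally, under
    exchangeability).
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Conformal quantile with the finite-sample correction.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    qhat = np.quantile(scores, min(q_level, 1.0), method="higher")
    # Include every class whose score does not exceed the threshold.
    return (1.0 - test_probs) <= qhat
```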
- The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks [90.52808174102157]
In safety-critical applications such as medical imaging and autonomous driving, it is imperative to maintain both high adversarial robustness against potential attacks and reliable uncertainty quantification.
A notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models.
This study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) in the context of standard adversarial attacks.
arXiv Detail & Related papers (2024-05-14T18:05:19Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing a KL-divergence-regularized loss function (a generic sketch of such reweighting follows this entry).
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
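The instance-reweighting entry above obtains importance weights from a KL-divergence-regularized objective. One generic way such weights can arise, not necessarily that paper's exact formulation, is by maximizing the weighted per-example loss minus a KL penalty to the uniform distribution over the batch, which has a softmax closed form; `tau` below is a hypothetical temperature, not a value from the paper.

```python
import torch

def kl_regularized_weights(per_example_loss, tau=1.0):
    """Instance weights from a KL-regularized reweighting objective.

    Maximizing  sum_i w_i * loss_i - tau * KL(w || uniform)  over the
    probability simplex has the closed-form solution
        w_i proportional to exp(loss_i / tau),
    which up-weights examples with large (e.g. adversarial) loss while
    the KL term keeps the weights close to uniform.
    """
    w = torch.softmax(per_example_loss.detach() / tau, dim=0)
    return w * per_example_loss.numel()  # rescale so weights average to 1

# Hypothetical usage with a batch of adversarial examples (x_adv, y):
#   losses = F.cross_entropy(model(x_adv), y, reduction="none")
#   loss = (kl_regularized_weights(losses) * losses).mean()
```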
- High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the EtC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We introduce an adaptive explore-then-commit (AEtC) algorithm that finds the optimal balance adaptively (a basic EtC sketch follows this entry).
arXiv Detail & Related papers (2023-06-19T15:29:32Z)
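The bandit entry above builds on explore-then-commit. The sketch below shows only the basic EtC template for a K-armed bandit, split into a uniform exploration phase and a commit phase; the high-dimensional contextual setting and the adaptive balancing of AEtC are beyond this sketch, and `pull` and `m` are illustrative placeholders.

```python
import numpy as np

def explore_then_commit(pull, n_arms, horizon, m):
    """Basic explore-then-commit (EtC) for a K-armed bandit.

    `pull(arm)` returns a stochastic reward. Each arm is pulled `m`
    times uniformly, then the empirically best arm is played for the
    remaining rounds; the exploration length `m` is what trades off
    exploration against exploitation over the horizon T.
    """
    rewards = np.zeros(n_arms)
    counts = np.zeros(n_arms)
    t = 0
    # Exploration phase: uniform pulls.
    for _ in range(m):
        for arm in range(n_arms):
            if t >= horizon:
                break
            rewards[arm] += pull(arm)
            counts[arm] += 1
            t += 1
    best = int(np.argmax(rewards / np.maximum(counts, 1)))
    # Commit phase: play the empirically best arm for the rest.
    total = rewards.sum() + sum(pull(best) for _ in range(horizon - t))
    return best, total
```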
- Improving Generalization of Complex Models under Unbounded Loss Using PAC-Bayes Bounds [10.94126149188336]
PAC-Bayes learning theory has focused extensively on establishing tight upper bounds for test errors.
A recently proposed training procedure, called PAC-Bayes training, updates the model toward minimizing these bounds.
Although this approach is theoretically sound, in practice it has not achieved test errors as low as those obtained by empirical risk minimization (ERM).
We introduce a new PAC-Bayes training algorithm with improved performance and reduced reliance on prior tuning (a standard form of a PAC-Bayes bound is given after this entry).
arXiv Detail & Related papers (2023-05-30T17:31:25Z)
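For reference alongside the PAC-Bayes training entry above (and the bound minimization in the main paper), one standard PAC-Bayesian bound for a loss bounded in [0,1] is the McAllester/Maurer form below; the robust bound used in the main paper replaces the empirical term with an adversarial loss and differs in its exact constants.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all posteriors Q, given a data-independent prior P:
\mathbb{E}_{h \sim Q}\!\left[L(h)\right]
  \;\le\;
  \mathbb{E}_{h \sim Q}\!\left[\widehat{L}_n(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```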
- Expressive Losses for Verified Robustness via Convex Combinations [67.54357965665676]
We study the relationship between the over-approximation coefficient and performance profiles across different expressive losses.
We show that, while expressivity is essential, better approximations of the worst-case loss are not necessarily linked to superior robustness-accuracy trade-offs.
arXiv Detail & Related papers (2023-05-23T12:20:29Z)
- On the Minimal Adversarial Perturbation for Deep Neural Networks with Provable Estimation Error [65.51757376525798]
The existence of adversarial perturbations has opened an interesting research line on provable robustness.
However, no provable results have been presented to estimate and bound the error committed.
This paper proposes two lightweight strategies to find the minimal adversarial perturbation.
The obtained results show that the proposed strategies approximate the theoretical distance and robustness for samples close to the classification boundary, leading to provable guarantees against any adversarial attack.
arXiv Detail & Related papers (2022-01-04T16:40:03Z)
- Robust Regularization with Adversarial Labelling of Perturbed Samples [22.37046166576859]
We propose Adversarial Labelling of Perturbed Samples (ALPS) as a regularization scheme.
ALPS trains neural networks with synthetic samples formed by perturbing each authentic input sample towards another one along with an adversarially assigned label.
Experiments on the SVHN, CIFAR-10, CIFAR-100 and Tiny-ImageNet datasets show that ALPS achieves state-of-the-art regularization performance.
arXiv Detail & Related papers (2021-05-28T11:26:49Z)
- Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models (a generic PGD-based adversarial training step is sketched after this entry).
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
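The last entry describes adversarial training (AT), which augments training data with adversarial examples from a fixed attack. For concreteness, below is a minimal Madry-style AT step using an L-infinity PGD attack; this is standard AT rather than the ADT framework itself, and the radius, step size, and iteration count are illustrative defaults, not values from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: iteratively ascend the loss and project back
    into the eps-ball around the clean input (inputs assumed in [0, 1])."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One step of standard adversarial training: craft adversarial
    examples with a fixed attack (here PGD), then train on them."""
    model.eval()
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```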
This list is automatically generated from the titles and abstracts of the papers on this site.