Adversarially Robust Deep Learning with Optimal-Transport-Regularized
Divergences
- URL: http://arxiv.org/abs/2309.03791v1
- Date: Thu, 7 Sep 2023 15:41:45 GMT
- Title: Adversarially Robust Deep Learning with Optimal-Transport-Regularized
Divergences
- Authors: Jeremiah Birrell, Mohammadreza Ebrahimi
- Abstract summary: We introduce the $ARMOR_D$ methods as novel approaches to enhancing the adversarial robustness of deep learning models.
We demonstrate the effectiveness of our method on malware detection and image recognition applications.
- Score: 12.1942837946862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the $ARMOR_D$ methods as novel approaches to enhancing the
adversarial robustness of deep learning models. These methods are based on a
new class of optimal-transport-regularized divergences, constructed via an
infimal convolution between an information divergence and an optimal-transport
(OT) cost. We use these as tools to enhance adversarial robustness by
maximizing the expected loss over a neighborhood of distributions, a technique
known as distributionally robust optimization. Viewed as a tool for
constructing adversarial samples, our method allows samples to be both
transported, according to the OT cost, and re-weighted, according to the
information divergence. We demonstrate the effectiveness of our method on
malware detection and image recognition applications and find that, to our
knowledge, it outperforms existing methods in enhancing robustness against
adversarial attacks. $ARMOR_D$ yields robustified accuracies of $98.29\%$
against $FGSM$ and $98.18\%$ against $PGD^{40}$ on the MNIST dataset, reducing
the error rate by more than $19.7\%$ and $37.2\%$, respectively, compared to
prior methods. Similarly, in malware detection, a discrete (binary) data
domain, $ARMOR_D$ improves robustified accuracy under the $rFGSM^{50}$ attack
by $37.0\%$ compared to the previous best-performing adversarial training
methods, while lowering the false negative and false positive rates by
$51.1\%$ and $57.53\%$, respectively.
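To make the construction above concrete, an OT-regularized divergence built by
infimal convolution has (up to the paper's exact conventions, which are an
assumption here) the form
$$D^c(\mu \,\|\, \nu) \;=\; \inf_{\eta} \big\{ D(\eta \,\|\, \nu) + C(\mu, \eta) \big\},$$
where $D$ is an information divergence (e.g., KL) and $C$ is an OT cost; the
robust objective then maximizes the expected loss over a neighborhood
$\{\mu : D^c(\mu \,\|\, \widehat{P}_n) \le \epsilon\}$ of the empirical
distribution. The sketch below is a loose illustration of the resulting
adversarial-sample construction, alternating a transport step (gradient ascent
with a squared-$\ell_2$ surrogate for the OT cost) and a re-weighting step (a
softmax over penalized per-sample losses, standing in for a KL-type
divergence). It is not the authors' implementation; all function names and
hyperparameters are assumptions.

```python
# Hedged sketch of an ARMOR_D-style inner maximization: NOT the authors' code.
# Samples are (i) transported by gradient ascent with a squared-L2 transport
# penalty standing in for the OT cost, and (ii) re-weighted via a softmax of
# the penalized per-sample losses, standing in for the information-divergence
# (e.g. KL) component. All names and hyperparameters below are assumptions.
import torch

def armor_like_inner_max(model, loss_fn, x, y, steps=10, step_size=0.1,
                         transport_coef=1.0, reweight_temp=1.0):
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        per_sample_loss = loss_fn(model(x_adv), y)              # shape (batch,)
        transport_cost = ((x_adv - x) ** 2).flatten(1).sum(1)   # OT surrogate
        objective = (per_sample_loss - transport_coef * transport_cost).sum()
        grad, = torch.autograd.grad(objective, x_adv)
        x_adv = (x_adv + step_size * grad).detach().requires_grad_(True)
    with torch.no_grad():
        penalized = (loss_fn(model(x_adv), y)
                     - transport_coef * ((x_adv - x) ** 2).flatten(1).sum(1))
        # Up-weight samples with large penalized loss; scale so weights average 1.
        weights = torch.softmax(penalized / reweight_temp, dim=0) * x.shape[0]
    return x_adv.detach(), weights
```

A training loop would then take a gradient step on the weighted loss, e.g.
(weights * loss_fn(model(x_adv), y)).mean(), with loss_fn returning per-sample
losses (reduction='none').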
Related papers
- Regret Minimization and Statistical Inference in Online Decision Making with High-dimensional Covariates [7.21848268647674]
We integrate the $\varepsilon$-greedy bandit algorithm for decision-making with a hard thresholding algorithm for estimating sparse bandit parameters.
Under a margin condition, our method achieves either $O(T^{1/2})$ regret or classical $O(T^{1/2})$-consistent inference.
arXiv Detail & Related papers (2024-11-10T01:47:11Z)
- Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption [60.958746600254884]
This study tackles the challenges of adversarial corruption in model-based reinforcement learning (RL).
We introduce an algorithm called corruption-robust optimistic MLE (CR-OMLE), which leverages total-variation (TV)-based information ratios as uncertainty weights for MLE.
We extend our weighting technique to the offline setting, and propose an algorithm named corruption-robust pessimistic MLE (CR-PMLE).
arXiv Detail & Related papers (2024-02-14T07:27:30Z)
- Reducing Adversarial Training Cost with Gradient Approximation [0.3916094706589679]
We propose a new and efficient adversarial training method, adversarial training with gradient approximation (GAAT) to reduce the cost of building up robust models.
Our proposed method saves up to 60% of the training time while achieving comparable model test accuracy.
arXiv Detail & Related papers (2023-09-18T03:55:41Z)
- Enhancing Adversarial Training via Reweighting Optimization Trajectory [72.75558017802788]
A number of approaches, such as extra regularization, adversarial weight perturbation, and training with more data, have been proposed to address robust overfitting.
We propose a new method named Weighted Optimization Trajectories (WOT) that leverages the optimization trajectories of adversarial training over time.
Our results show that WOT integrates seamlessly with the existing adversarial training methods and consistently overcomes the robust overfitting issue.
arXiv Detail & Related papers (2023-06-25T15:53:31Z)
- Robust Classification via a Single Diffusion Model [37.46217654590878]
Robust Diffusion Classifier (RDC) is a generative classifier constructed from a pre-trained diffusion model to be adversarially robust.
RDC achieves $75.67\%$ robust accuracy against various $\ell_\infty$ norm-bounded adaptive attacks with $\epsilon_\infty = 8/255$ on CIFAR-10.
arXiv Detail & Related papers (2023-05-24T15:25:19Z)
- WAT: Improve the Worst-class Robustness in Adversarial Training [11.872656386839436]
Adversarial training is a popular strategy to defend against adversarial attacks.
Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples.
This paper proposes a novel framework of worst-class adversarial training.
arXiv Detail & Related papers (2023-02-08T12:54:19Z)
- Robust Few-shot Learning Without Using any Adversarial Samples [19.34427461937382]
A few efforts have been made to combine the few-shot problem with the robustness objective using sophisticated Meta-Learning techniques.
We propose a simple but effective alternative that does not require any adversarial samples.
Inspired by the cognitive decision-making process in humans, we enforce high-level feature matching between the base class data and their corresponding low-frequency samples.
arXiv Detail & Related papers (2022-11-03T05:58:26Z)
- Training $\beta$-VAE by Aggregating a Learned Gaussian Posterior with a Decoupled Decoder [0.553073476964056]
Current practices in VAE training often result in a trade-off between the reconstruction fidelity and the continuity/disentanglement of the latent space.
We present intuitions and a careful analysis of the antagonistic mechanism of the two losses, and propose a simple yet effective two-stage method for training a VAE.
We evaluate the method using a medical dataset intended for 3D skull reconstruction and shape completion, and the results indicate promising generative capabilities of the VAE trained using the proposed method.
arXiv Detail & Related papers (2022-09-29T13:49:57Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance with solely the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$.
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack [96.50202709922698]
A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., fewer iterations) and reliable.
We propose a parameter-free Adaptive Auto Attack (A$^3$) evaluation method which addresses the efficiency and reliability in a test-time-training fashion.
arXiv Detail & Related papers (2022-03-10T04:53:54Z)
- Sparsity Winning Twice: Better Robust Generalization from More Efficient Training [94.92954973680914]
We introduce two alternatives for sparse adversarial training: (i) static sparsity and (ii) dynamic sparsity.
We find that both methods yield a win-win: substantially shrinking the robust generalization gap and alleviating robust overfitting.
Our approaches can be combined with existing regularizers, establishing new state-of-the-art results in adversarial training.
arXiv Detail & Related papers (2022-02-20T15:52:08Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
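As a rough illustration of the thresholded-confidence idea just described (not the authors' reference code; the use of max-softmax confidence and the quantile-based threshold choice are assumptions):

```python
# Hedged sketch of an ATC-style accuracy predictor: NOT the reference code.
# Pick a confidence threshold on labeled source data so that the fraction of
# source points above it matches source accuracy, then predict target accuracy
# as the fraction of unlabeled target points above that same threshold.
import numpy as np

def atc_like_accuracy_estimate(source_conf, source_correct, target_conf):
    source_acc = source_correct.mean()
    # Threshold at the (1 - accuracy) quantile, so P_source(conf > t) ~ accuracy.
    t = np.quantile(source_conf, 1.0 - source_acc)
    return float((target_conf > t).mean())
```

Here source_conf and target_conf would be per-example confidences (e.g., max softmax probabilities) and source_correct a 0/1 array of source validation correctness; all three names are placeholders.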
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Mean-Shifted Contrastive Loss for Anomaly Detection [34.97652735163338]
We propose a new loss function which can overcome failure modes of both center-loss and contrastive-loss methods.
Our improvements yield a new anomaly detection approach based on the Mean-Shifted Contrastive Loss.
Our method achieves state-of-the-art anomaly detection performance on multiple benchmarks including $97.5\%$ ROC-AUC.
arXiv Detail & Related papers (2021-06-07T17:58:03Z)
- Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separate risks ($R_{stand}$ and $R_{rob}$).
arXiv Detail & Related papers (2020-03-16T02:14:08Z)
- Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
- Discriminator optimal transport [6.624726878647543]
We show that the discriminator optimization process increases a lower bound of the dual cost function for the Wasserstein distance between the target distribution $p$ and the generator distribution $p_G$.
This implies that the trained discriminator can approximate optimal transport (OT) from $p_G$ to $p$.
We show that it improves the inception score and FID for unconditional GANs trained on CIFAR-10 and STL-10, as well as for a publicly available pre-trained conditional GAN trained on ImageNet.
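As a loose, hedged illustration of this transport idea (not the paper's exact procedure; the plain gradient-ascent update, step size, and function names are assumptions), generated samples can be nudged along the gradient of a trained discriminator so that they move toward the data distribution:

```python
# Hedged sketch of discriminator-guided transport of generated samples:
# NOT the paper's exact algorithm. Samples drawn from the generator are moved
# by gradient ascent on the trained discriminator's score, approximating
# transport from the generator distribution p_G toward the data distribution p.
import torch

def discriminator_transport(discriminator, x_gen, steps=20, step_size=0.01):
    x = x_gen.clone().detach().requires_grad_(True)
    for _ in range(steps):
        score = discriminator(x).sum()   # higher score = judged "more real"
        grad, = torch.autograd.grad(score, x)
        x = (x + step_size * grad).detach().requires_grad_(True)
    return x.detach()
```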
arXiv Detail & Related papers (2019-10-15T14:47:37Z)