Adversarially Robust Classification based on GLRT
- URL: http://arxiv.org/abs/2011.07835v1
- Date: Mon, 16 Nov 2020 10:16:05 GMT
- Title: Adversarially Robust Classification based on GLRT
- Authors: Bhagyashree Puranik, Upamanyu Madhow, Ramtin Pedarsani
- Abstract summary: We explore a defense strategy based on the generalized likelihood ratio test (GLRT), which jointly estimates the class of interest and the adversarial perturbation.
We show that the GLRT approach yields performance competitive with that of the minimax approach under the worst-case attack.
We also observe that the GLRT defense generalizes naturally to more complex models for which optimal minimax classifiers are not known.
- Score: 26.44693169694826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are vulnerable to adversarial attacks that can often
cause misclassification by introducing small but well designed perturbations.
In this paper, we explore, in the setting of classical composite hypothesis
testing, a defense strategy based on the generalized likelihood ratio test
(GLRT), which jointly estimates the class of interest and the adversarial
perturbation. We evaluate the GLRT approach for the special case of binary
hypothesis testing in white Gaussian noise under $\ell_{\infty}$ norm-bounded
adversarial perturbations, a setting for which a minimax strategy optimizing
for the worst-case attack is known. We show that the GLRT approach yields
performance competitive with that of the minimax approach under the worst-case
attack, and observe that it yields a better robustness-accuracy trade-off under
weaker attacks, depending on the values of signal components relative to the
attack budget. We also observe that the GLRT defense generalizes naturally to
more complex models for which optimal minimax classifiers are not known.
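For concreteness, under the abstract's binary Gaussian model (observation $y = c\,s + e + n$ with class $c \in \{+1, -1\}$, known signal $s$, perturbation $\|e\|_{\infty} \le \epsilon$, and white Gaussian noise $n$), the GLRT decision reduces to comparing per-coordinate soft-thresholded residuals across the two hypotheses. The sketch below is a minimal illustration under those assumptions; the function and variable names and the toy usage are ours, not taken from the paper.

```python
import numpy as np

def glrt_decide(y, s, eps):
    """GLRT decision for y = c*s + e + n, c in {+1, -1}, ||e||_inf <= eps.

    For Gaussian noise, maximizing the likelihood over the perturbation e
    is equivalent to minimizing ||y - c*s - e||^2 over ||e||_inf <= eps,
    which decomposes per coordinate into soft-thresholding the residual.
    """
    def worst_case_residual(c):
        r = np.abs(y - c * s)                          # raw per-coordinate residual
        return np.sum(np.maximum(r - eps, 0.0) ** 2)   # shrink each entry by up to eps
    return 1 if worst_case_residual(+1) <= worst_case_residual(-1) else -1

# Toy usage (values are illustrative only):
rng = np.random.default_rng(0)
s = np.array([1.0, 0.5, 2.0])
eps, sigma = 0.3, 0.5
y = s + rng.uniform(-eps, eps, size=s.shape) + sigma * rng.standard_normal(s.shape)
print(glrt_decide(y, s, eps))   # prints 1 when the evidence favors class +1
```

One way to read the abstract's remark about signal components relative to the attack budget: coordinates of $s$ whose magnitude is well below $\epsilon$ contribute little after thresholding, since the hypothesized perturbation can largely absorb them under either class.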
Related papers
- Minimax rates of convergence for nonparametric regression under adversarial attacks [3.244945627960733]
We theoretically analyse the limits of robustness against adversarial attacks in a nonparametric regression setting.
Our work reveals that the minimax rate under adversarial attacks on the input is the sum of two terms.
arXiv Detail & Related papers (2024-10-12T07:11:38Z)
- Familiarity-Based Open-Set Recognition Under Adversarial Attacks [9.934489379453812]
We study gradient-based adversarial attacks on familiarity scores for both attack types, False Familiarity and False Novelty.
We formulate the adversarial reaction score as an alternative OSR scoring rule, which shows a high correlation with the MLS familiarity score.
arXiv Detail & Related papers (2023-11-08T20:17:35Z)
- Advancing Adversarial Robustness Through Adversarial Logit Update [10.041289551532804]
Adversarial training and adversarial purification are among the most widely recognized defense strategies.
We propose a new principle, namely Adversarial Logit Update (ALU), to infer adversarial samples' labels.
Our solution achieves superior performance compared to state-of-the-art methods against a wide range of adversarial attacks.
arXiv Detail & Related papers (2023-08-29T07:13:31Z)
- Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent (a generic single-component PGD sketch appears after this list).
We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_p$-ball to composite semantic perturbations.
arXiv Detail & Related papers (2022-02-09T02:41:56Z)
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called the Joint Space Threat Model (JSTM).
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Generalized Likelihood Ratio Test for Adversarially Robust Hypothesis Testing [22.93223530210401]
We consider a classical hypothesis testing problem in order to develop insight into defending against such adversarial perturbations.
We propose a defense based on applying the generalized likelihood ratio test (GLRT) to the resulting composite hypothesis testing problem.
We show via simulations that the GLRT defense is competitive with the minimax approach under the worst-case attack, while yielding a better robustness-accuracy tradeoff under weaker attacks.
arXiv Detail & Related papers (2021-12-04T01:11:54Z)
- CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z)
- Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
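Several of the attacks referenced above (the component-wise PGD used for composite adversarial examples, and the gradient-based attacks on familiarity scores) build on projected gradient descent under an $\ell_{\infty}$ constraint. The sketch below is a generic, minimal PGD loop for a single $\ell_{\infty}$ component; the `grad_fn` interface, step size, and iteration count are illustrative assumptions, not details taken from any of the listed papers.

```python
import numpy as np

def pgd_linf(x, grad_fn, eps, step=0.01, n_steps=40):
    """Generic projected gradient ascent for an l_inf-bounded attack.

    grad_fn(x_adv) is assumed to return the gradient of the attack
    objective (e.g., the classification loss) with respect to the input.
    """
    x_adv = x.copy()
    for _ in range(n_steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + step * np.sign(g)         # steepest ascent direction under l_inf
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the eps-ball around x
    return x_adv
```

In the composite setting described above, such a loop would be run per perturbation component, with the attack composition optimized separately; this sketch covers only a single $\ell_{\infty}$ component.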