Optimal Transport as a Defense Against Adversarial Attacks
- URL: http://arxiv.org/abs/2102.03156v1
- Date: Fri, 5 Feb 2021 13:24:36 GMT
- Title: Optimal Transport as a Defense Against Adversarial Attacks
- Authors: Quentin Bouniot, Romaric Audigier, Angélique Loesch
- Abstract summary: Adversarial attacks can find a human-imperceptible perturbation for a given image that will mislead a trained model.
Previous work aimed to align original and adversarial image representations in the same way as domain adaptation to improve robustness.
We propose to use a loss between distributions that faithfully reflects the ground distance.
This leads to SAT (Sinkhorn Adversarial Training), a more robust defense against adversarial attacks.
- Score: 4.6193503399184275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning classifiers are now known to have flaws in the
representations of their classes. Adversarial attacks can find a
human-imperceptible perturbation for a given image that will mislead a trained
model. The most effective methods to defend against such attacks train on
generated adversarial examples to learn their distribution. Previous work aimed
to align original and adversarial image representations, in the same way as
domain adaptation, to improve robustness. Yet, these approaches only partially
align the representations and do not reflect the geometry of the space or of
the distributions. In addition, it is difficult to accurately compare
robustness between defended models. Until now, they have been evaluated at a
fixed perturbation size, although defended models may react differently as this
perturbation size varies. In this paper, the analogy with domain adaptation is
taken a step further by exploiting optimal transport theory. We propose to use
a loss between distributions that faithfully reflects the ground distance. This
leads to SAT (Sinkhorn Adversarial Training), a more robust defense against
adversarial attacks. We then propose to quantify more precisely the robustness
of a model to adversarial attacks over a wide range of perturbation sizes using
a different metric, the Area Under the Accuracy Curve (AUAC). Extensive
experiments on both the CIFAR-10 and CIFAR-100 datasets show that our defense
is globally more robust than the state of the art.
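The abstract names two ingredients: a Sinkhorn (entropic optimal transport) loss between representation distributions, and the AUAC metric evaluated over a range of perturbation sizes. The following is a minimal NumPy sketch of both; the plain Sinkhorn matrix-scaling iteration, the hyperparameters, and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sinkhorn_loss(X, Y, eps=0.1, n_iters=200):
    """Entropic-regularized OT cost between two batches of feature
    vectors with uniform weights, via Sinkhorn matrix scaling."""
    n, m = X.shape[0], Y.shape[0]
    # Ground cost: squared Euclidean distance between representations.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-C / eps)                      # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u = np.ones(n)
    for _ in range(n_iters):                  # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]           # approximate transport plan
    return float((P * C).sum())

def auac(eps_grid, accuracies):
    """Area Under the Accuracy Curve over a range of perturbation
    sizes, normalized by the width of the evaluated range."""
    e = np.asarray(eps_grid, dtype=float)
    acc = np.asarray(accuracies, dtype=float)
    area = 0.5 * np.sum((acc[1:] + acc[:-1]) * np.diff(e))  # trapezoid rule
    return float(area / (e[-1] - e[0]))
```

In a SAT-style setup, such a loss would be minimized between batches of clean and adversarial representations during training; AUAC then compares defenses by integrating accuracy over many perturbation sizes rather than reporting it at a single fixed one.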
Related papers
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness defends against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariance and sensitivity defenses.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
An abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ($q$ omniscient adversaries with full knowledge of the defense protocol, which can change from iteration to iteration) to weak ($q$ randomly chosen adversaries with limited collusion abilities).
arXiv Detail & Related papers (2022-08-17T05:49:52Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Adaptive Clustering of Robust Semantic Representations for Adversarial
Image Purification [0.9203366434753543]
We propose a robust defense against adversarial attacks, which is model agnostic and generalizable to unseen adversaries.
In this paper, we extract the latent representations for each class and adaptively cluster the latent representations that share a semantic similarity.
We adversarially train a new model constraining the latent space representation to minimize the distance between the adversarial latent representation and the true cluster distribution.
arXiv Detail & Related papers (2021-04-05T21:07:04Z)
- Lagrangian Objective Function Leads to Improved Unforeseen Attack
Generalization in Adversarial Training [0.0]
Adversarial training (AT) has been shown to be effective at producing models robust to the attack used during training.
We propose a simple modification to AT that improves generalization to unforeseen attacks.
We show that our attack is faster than other attack schemes that are designed for unseen attack generalization.
arXiv Detail & Related papers (2021-03-29T07:23:46Z)
- "What's in the box?!": Deflecting Adversarial Attacks by Randomly
Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z)
- Robustness May Be at Odds with Fairness: An Empirical Study on
Class-wise Accuracy [85.20742045853738]
CNNs are widely known to be vulnerable to adversarial attacks.
We propose an empirical study on the class-wise accuracy and robustness of adversarially trained models.
We find that an inter-class discrepancy in both accuracy and robustness exists even when the training dataset has an equal number of samples per class.
arXiv Detail & Related papers (2020-10-26T06:32:32Z)
- Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.