Related papers: A Study on Adversarial Robustness of Discriminative Prototypical Learning

A Study on Adversarial Robustness of Discriminative Prototypical Learning

URL: http://arxiv.org/abs/2504.03782v1
Date: Thu, 03 Apr 2025 15:42:58 GMT
Title: A Study on Adversarial Robustness of Discriminative Prototypical Learning
Authors: Ramin Zarei Sabzevar, Hamed Mohammadzadeh, Tahmineh Tavakoli, Ahad Harati,
Abstract summary: We propose a novel adversarial training framework named Adversarial Deep Positive-Negative Prototypes (Adv-DPNP)<n>Adv-DPNP integrates disriminative prototype-based learning with adversarial training.<n>Our approach utilizes a composite loss function combining positive prototype alignment, negative prototype repulsion, and consistency regularization.
Score: 0.24999074238880484
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks demonstrate significant vulnerability to adversarial perturbations, posing risks for critical applications. Current adversarial training methods predominantly focus on robustness against attacks without explicitly leveraging geometric structures in the latent space, usually resulting in reduced accuracy on the original clean data. To address these issues, we propose a novel adversarial training framework named Adversarial Deep Positive-Negative Prototypes (Adv-DPNP), which integrates disriminative prototype-based learning with adversarial training. Adv-DPNP uses unified class prototypes serving dual roles as classifier weights and robust anchors, enhancing both intra-class compactness and inter-class separation in the latent space. Moreover, a novel dual-branch training mechanism maintains stable prototypes by updating them exclusively with clean data; while the feature extractor layers are learned using both clean and adversarial data to remain invariant against adversarial perturbations. In addition, our approach utilizes a composite loss function combining positive prototype alignment, negative prototype repulsion, and consistency regularization to further enhance discrimination, adversarial robustness, and clean accuracy. Extensive experiments conducted on standard benchmark datasets confirm the effectiveness of Adv-DPNP compared to state-of-the-art methods, achieving higher clean accuracy and competitive robustness under adversarial perturbations and common corruptions. Our code is available at https://github.com/fum-rpl/adv-dpnp

Related papers

MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.<n>We show that strong feature representation learning during training can significantly enhance the original model's robustness.<n>We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework. Our importance weights are obtained by optimizing the KL-divergence regularized loss function. Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks. We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework. TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a sample. We use metric learning to frame adversarial regularization as an optimal transport problem. Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify. We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model. We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification. To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to mitigate robust training. We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
Robustness through Cognitive Dissociation Mitigation in Contrastive Adversarial Training [2.538209532048867]
We introduce a novel neural network training framework that increases model's adversarial robustness to adversarial attacks. We propose to improve model robustness to adversarial attacks by learning feature representations consistent under both data augmentations and adversarial perturbations. We validate our method on the CIFAR-10 dataset on which it outperforms both robust accuracy and clean accuracy over alternative supervised and self-supervised adversarial learning methods.
arXiv Detail & Related papers (2022-03-16T21:41:27Z)
Mitigating the Impact of Adversarial Attacks in Very Deep Networks [10.555822166916705]
Deep Neural Network (DNN) models have vulnerabilities related to security concerns. Data poisoning-enabled perturbation attacks are complex adversarial ones that inject false data into models. We propose an attack-agnostic-based defense method for mitigating their influence.
arXiv Detail & Related papers (2020-12-08T21:25:44Z)
Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
FADER: Fast Adversarial Example Rejection [19.305796826768425]
Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from legitimate training samples at different layer representations. We introduce FADER, a novel technique for speeding up detection-based methods. Our experiments outline up to 73x prototypes reduction compared to analyzed detectors for MNIST dataset and up to 50x for CIFAR10 respectively.
arXiv Detail & Related papers (2020-10-18T22:00:11Z)
Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions. We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples. We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.