Supervised Contrastive Prototype Learning: Augmentation Free Robust
Neural Network
- URL: http://arxiv.org/abs/2211.14424v1
- Date: Sat, 26 Nov 2022 01:17:15 GMT
- Title: Supervised Contrastive Prototype Learning: Augmentation Free Robust
Neural Network
- Authors: Iordanis Fostiropoulos, Laurent Itti
- Abstract summary: Transformations in the input space of Deep Neural Networks (DNN) lead to unintended changes in the feature space.
We propose a training framework, $\textbf{Supervised Contrastive Prototype Learning}$ (SCPL).
We use N-pair contrastive loss with prototypes of the same and opposite classes and replace a categorical classification head with a $\textbf{Prototype Classification Head}$ (PCH).
Our approach is $\textit{sample efficient}$, does not require $\textit{sample mining}$, can be implemented on any existing DNN without modification to their architecture, and can be combined with other training augmentation techniques.
- Score: 17.10753224600936
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Transformations in the input space of Deep Neural Networks (DNN) lead to
unintended changes in the feature space. Almost perceptually identical inputs,
such as adversarial examples, can have significantly distant feature
representations. On the contrary, Out-of-Distribution (OOD) samples can have
highly similar feature representations to training set samples. Our theoretical
analysis for DNNs trained with a categorical classification head suggests that
the inflexible logit space restricted by the classification problem size is one
of the root causes for the lack of $\textit{robustness}$. Our second
observation is that DNNs over-fit to the training augmentation technique and do
not learn $\textit{nuance invariant}$ representations. Inspired by the recent
success of prototypical and contrastive learning frameworks for both improving
robustness and learning nuance invariant representations, we propose a training
framework, $\textbf{Supervised Contrastive Prototype Learning}$ (SCPL). We use
N-pair contrastive loss with prototypes of the same and opposite classes and
replace a categorical classification head with a $\textbf{Prototype
Classification Head}$ (PCH). Our approach is $\textit{sample efficient}$, does
not require $\textit{sample mining}$, can be implemented on any existing DNN
without modification to their architecture, and combined with other training
augmentation techniques. We empirically evaluate the $\textbf{clean}$
robustness of our method on out-of-distribution and adversarial samples. Our
framework outperforms other state-of-the-art contrastive and prototype learning
approaches in $\textit{robustness}$.
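The abstract describes two components concretely enough to sketch: a Prototype Classification Head that replaces the categorical softmax head with learnable class prototypes, and an N-pair contrastive loss computed against the prototypes of the same and opposite classes instead of mined sample pairs. The paper's exact formulation is not reproduced here; the snippet below is only a minimal PyTorch-style sketch under assumed choices (one learnable prototype per class, cosine similarity, and a temperature hyperparameter).

```python
# Minimal sketch, not the authors' reference implementation. Assumptions:
# one learnable prototype per class, cosine similarity, temperature = 0.1.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrototypeClassificationHead(nn.Module):
    """Replaces a categorical (softmax) head with learnable class prototypes."""

    def __init__(self, feature_dim, num_classes):
        super().__init__()
        # One prototype vector per class, trained jointly with the encoder.
        self.prototypes = nn.Parameter(torch.randn(num_classes, feature_dim))

    def forward(self, features):
        # Cosine similarity between each feature and every class prototype;
        # the most similar prototype gives the predicted class.
        f = F.normalize(features, dim=-1)
        p = F.normalize(self.prototypes, dim=-1)
        return f @ p.t()  # (batch, num_classes) similarity scores


def prototype_npair_loss(features, labels, prototypes, temperature=0.1):
    """N-pair-style contrastive loss against prototypes: each sample is pulled
    toward its own class prototype (positive) and pushed away from all other
    class prototypes (negatives); no sample mining or input augmentation."""
    f = F.normalize(features, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    logits = (f @ p.t()) / temperature       # (batch, num_classes)
    return F.cross_entropy(logits, labels)   # softmax over prototypes


# Usage with any existing encoder, architecture unchanged (hypothetical names):
# head = PrototypeClassificationHead(feature_dim=512, num_classes=10)
# feats = encoder(x)                       # any backbone's penultimate features
# loss = prototype_npair_loss(feats, y, head.prototypes)
```

In this sketch the prototype similarities play the role of logits at test time, so existing inference code needs no changes.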
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Towards Understanding Clean Generalization and Robust Overfitting in Adversarial Training [38.44734564565478]
We study the $\textit{Clean Generalization and Robust Overfitting}$ phenomenon in adversarial training.
We show that a three-stage phase transition occurs during the learning process and that the network converges to a robust memorization regime.
We also empirically verify our theoretical analysis with experiments on real image recognition.
arXiv Detail & Related papers (2023-06-02T05:07:42Z)
- Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning [120.53458753007851]
Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions.
We address this feature-classifier misalignment dilemma in FSCIL, inspired by the recently discovered phenomenon named neural collapse.
We propose a neural collapse inspired framework for FSCIL. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our framework outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T18:39:40Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance using only the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$.
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- Training Overparametrized Neural Networks in Sublinear Time [14.918404733024332]
Deep learning comes at a tremendous computational and energy cost.
We present a new and alternative view of a subclass of deep neural networks as a collection of binary search trees, in which each training iteration touches only a small subset of nodes.
We believe this view can have further applications in the analysis of deep networks.
arXiv Detail & Related papers (2022-08-09T02:29:42Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
Experimental results on texture and histopathologic image datasets show that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence than equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Probabilistic Robustness Analysis for DNNs based on PAC Learning [14.558877524991752]
We view a DNN as a function $\boldsymbol{f}$ from inputs to outputs, and consider the local robustness property for a given input.
We learn the score difference function $f_i - f_\ell$ with respect to the target label $\ell$ and attacking label $i$.
Our framework can handle very large neural networks like ResNet152 with $6.5$M neurons, and often generates adversarial examples.
arXiv Detail & Related papers (2021-01-25T14:10:52Z)
- Improving Robustness and Generality of NLP Models Using Disentangled Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)
- Towards Understanding Hierarchical Learning: Benefits of Neural Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representations can achieve improved sample complexity compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z)
- Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble [163.3333439344695]
Dirichlet Neighborhood Ensemble (DNE) is a randomized smoothing method for training a robust model to defend against substitution-based attacks.
DNE forms virtual sentences by sampling embedding vectors for each word in an input sentence from the convex hull spanned by the word and its synonyms, and augments the training data with them (a minimal sketch of this sampling step appears after this entry).
We demonstrate through extensive experimentation that our method consistently outperforms recently proposed defense methods by a significant margin across different network architectures and multiple data sets.
arXiv Detail & Related papers (2020-06-20T18:01:16Z)
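The DNE entry above describes its core sampling step in enough detail to illustrate: a virtual embedding for each word is drawn from the convex hull spanned by the word and its synonyms. The snippet below is an illustrative sketch rather than the authors' code; the Dirichlet concentration value and the synonym vectors are assumed inputs.

```python
# Illustrative sketch of the convex-hull sampling described in the DNE entry
# above (not the authors' code). The alpha value and synonym vectors are
# placeholders supplied by the caller.
import numpy as np


def sample_virtual_embedding(word_vec, synonym_vecs, alpha=1.0, rng=None):
    """Draw convex-combination weights from a Dirichlet distribution over the
    word and its synonyms, and return the weighted-average embedding. The
    result always lies inside the convex hull spanned by those vectors."""
    rng = rng or np.random.default_rng()
    vertices = np.stack([word_vec] + list(synonym_vecs))      # (k+1, dim)
    weights = rng.dirichlet(alpha * np.ones(len(vertices)))   # >= 0, sums to 1
    return weights @ vertices                                 # (dim,)


# A "virtual sentence" is built by applying this to every word of the input;
# the resulting embedding sequences augment the training data.
```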