On Multi-head Ensemble of Smoothed Classifiers for Certified Robustness
- URL: http://arxiv.org/abs/2211.10882v1
- Date: Sun, 20 Nov 2022 06:31:53 GMT
- Title: On Multi-head Ensemble of Smoothed Classifiers for Certified Robustness
- Authors: Kun Fang, Qinghua Tao, Yingwen Wu, Tao Li, Xiaolin Huang and Jie Yang
- Abstract summary: Randomized Smoothing (RS) is a promising technique for certified robustness.
In this work, we augment the network with multiple heads, each of which hosts a classifier for the ensemble.
A novel training strategy, namely Self-PAced Circular-TEaching (SPACTE), is proposed.
- Score: 25.021346715099863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomized Smoothing (RS) is a promising technique for certified robustness, and ensembles of multiple deep neural networks (DNNs) have recently achieved state-of-the-art performance under RS. However, such an ensemble imposes heavy computational burdens in both training and certification, and it under-exploits the individual DNNs and their mutual effects, since the communication between these classifiers is commonly ignored during optimization. In this work, starting from a single DNN, we augment the network with multiple heads, each of which hosts a classifier for the ensemble. A novel training strategy, namely Self-PAced Circular-TEaching (SPACTE), is proposed accordingly. SPACTE enables a circular communication flow among the augmented heads: each head teaches its neighbor via self-paced learning on smoothed losses designed specifically for certified robustness. The multi-head structure and the circular-teaching scheme of SPACTE jointly diversify and strengthen the classifiers in the augmented heads, yielding even stronger certified robustness than ensembling multiple DNNs (effectiveness) at a much lower computational cost (efficiency), as verified by extensive experiments and discussions.
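To make the idea above concrete, here is a minimal PyTorch-style sketch of a shared backbone augmented with several lightweight heads, trained on Gaussian-noised inputs (as in RS) with each head additionally pulled toward its ring neighbor. The names (`MultiHeadNet`, `circular_teaching_loss`) and the neighbor-matching term are illustrative assumptions, not the authors' released SPACTE code, which further weights the smoothed losses in a self-paced manner.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadNet(nn.Module):
    """Illustrative multi-head classifier: one shared backbone, K small heads."""
    def __init__(self, backbone, feat_dim, num_classes, num_heads=3):
        super().__init__()
        self.backbone = backbone  # shared feature extractor
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_heads)]
        )

    def forward(self, x):
        feats = self.backbone(x)
        return [head(feats) for head in self.heads]  # one logit tensor per head

def circular_teaching_loss(logits_list, targets, alpha=0.5):
    """Each head fits the labels and is additionally pulled toward the soft
    predictions of its ring neighbor (the 'circular' communication)."""
    num_heads = len(logits_list)
    loss = 0.0
    for k, logits in enumerate(logits_list):
        neighbor = logits_list[(k + 1) % num_heads].detach()  # next head in the ring
        loss = loss + F.cross_entropy(logits, targets)
        loss = loss + alpha * F.kl_div(
            F.log_softmax(logits, dim=1),
            F.softmax(neighbor, dim=1),
            reduction="batchmean",
        )
    return loss / num_heads

# Usage sketch for RS-style training: perturb the inputs with Gaussian noise first.
# noisy = x + 0.25 * torch.randn_like(x)
# loss = circular_teaching_loss(model(noisy), y)
```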
Related papers
- Divergent Ensemble Networks: Enhancing Uncertainty Estimation with Shared Representations and Independent Branching [0.9963916732353794]
Divergent Ensemble Network (DEN) is a novel architecture that combines shared representation learning with independent branching.
DEN employs a shared input layer to capture common features across all branches, followed by divergent, independently trainable layers that form an ensemble.
This shared-to-branching structure reduces parameter redundancy while maintaining ensemble diversity, enabling efficient and scalable learning.
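A minimal sketch of this shared-to-branching layout is given below, assuming a simple MLP trunk; the class name, layer sizes, and the variance-based uncertainty readout are illustrative choices, not the DEN paper's implementation.

```python
import torch
import torch.nn as nn

class DivergentEnsembleNet(nn.Module):
    """Illustrative shared trunk followed by independently trainable branches."""
    def __init__(self, in_dim=784, hidden=256, num_classes=10, num_branches=5):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                          nn.Linear(hidden, num_classes))
            for _ in range(num_branches)
        ])

    def forward(self, x):
        h = self.shared(x)                                    # features shared by every branch
        logits = torch.stack([b(h) for b in self.branches])   # (branches, batch, classes)
        mean = logits.mean(dim=0)                             # ensemble prediction
        disagreement = logits.var(dim=0)                      # branch spread as an uncertainty proxy
        return mean, disagreement
```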
arXiv Detail & Related papers (2024-12-02T06:52:45Z)
- Informed deep hierarchical classification: a non-standard analysis inspired approach [0.0]
It consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer.
The design of this architecture, called lexicographic hybrid deep neural network (LH-DNN), was made possible by combining tools from different and quite distant research fields.
To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks.
arXiv Detail & Related papers (2024-09-25T14:12:50Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
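The summary is terse, so the sketch below only illustrates the general mechanism of a dense and a gated sub-network sharing one set of weights via soft per-channel gates; the block design and the agreement objective are assumptions made for illustration, not the paper's method.

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """One set of conv weights serving both a dense path and a gated (cheaper) path."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.gate_logits = nn.Parameter(torch.zeros(channels))  # learned per-channel gates

    def forward(self, x, gated=False):
        h = torch.relu(self.conv(x))
        if gated:
            gates = torch.sigmoid(self.gate_logits).view(1, -1, 1, 1)
            h = h * gates            # soft gating; a hard gate would actually skip channels
        return h

# One way to couple the two paths during self-supervised pre-training:
# dense_feat = block(x, gated=False)
# gated_feat = block(x, gated=True)
# consistency = (dense_feat - gated_feat).pow(2).mean()
```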
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC), which yields a deep classification ensemble in which the individual estimators are both accurate and negatively correlated.
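The classic negative-correlation-learning penalty gives a concrete sense of "accurate and negatively correlated"; the sketch below applies it to softmax outputs. This is the textbook formulation, offered as a reference point rather than the exact DNCC objective.

```python
import torch
import torch.nn.functional as F

def negative_correlation_loss(logits_list, targets, lam=0.5):
    """Each member fits the labels; a penalty rewards spreading its probabilities
    away from the ensemble mean, de-correlating the members' errors."""
    probs = [F.softmax(logits, dim=1) for logits in logits_list]
    ensemble = torch.stack(probs).mean(dim=0)
    total = 0.0
    for logits, p in zip(logits_list, probs):
        diversity = -((p - ensemble) ** 2).sum(dim=1).mean()  # NCL penalty around the mean
        total = total + F.cross_entropy(logits, targets) + lam * diversity
    return total / len(probs)
```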
arXiv Detail & Related papers (2022-12-14T07:35:20Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
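For reference, standard interval bound propagation through an explicit affine layer and a ReLU (the baseline the abstract compares against) looks like the sketch below; the paper's interval reachability analysis for implicit layers is not reproduced here.

```python
import torch

def interval_affine(lower, upper, W, b):
    """Propagate an elementwise box [lower, upper] through x -> x @ W.T + b."""
    center = (upper + lower) / 2
    radius = (upper - lower) / 2
    new_center = center @ W.T + b
    new_radius = radius @ W.abs().T            # |W| maps the half-width of the box
    return new_center - new_radius, new_center + new_radius

def interval_relu(lower, upper):
    """ReLU is monotone, so it acts on the interval endpoints directly."""
    return lower.clamp(min=0), upper.clamp(min=0)

# Usage: bound a two-layer network's logits under an l_inf perturbation of size eps.
# l, u = x - eps, x + eps
# l, u = interval_relu(*interval_affine(l, u, W1, b1))
# logits_low, logits_high = interval_affine(l, u, W2, b2)
```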
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
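The abstract does not spell out how the teacher is maintained; the common self-ensembling choice, assumed here purely for illustration, is an exponential moving average of the student plus a consistency loss on unlabeled target-domain images.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Teacher weights track an exponential moving average of the student's."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

def consistency_loss(student_logits, teacher_logits):
    """Student segmentation maps are pushed toward the (frozen) teacher's."""
    return F.mse_loss(
        student_logits.softmax(dim=1),
        teacher_logits.detach().softmax(dim=1),
    )
```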
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Distributed Learning for Time-varying Networks: A Scalable Design [13.657740129012804]
We propose a distributed learning framework based on a scalable deep neural network (DNN) design.
By exploiting the permutation equivalence and invariance properties of the learning tasks, DNNs of different scales can be built for different clients.
Model aggregation can also be conducted over the corresponding sub-matrices to improve learning convergence and performance.
arXiv Detail & Related papers (2021-07-31T12:44:28Z)
- A New Clustering-Based Technique for the Acceleration of Deep Convolutional Networks [2.7393821783237184]
Model Compression and Acceleration (MCA) techniques are used to transform large pre-trained networks into smaller models.
We propose a clustering-based approach that is able to increase the number of employed centroids/representatives.
This is achieved by imposing a special structure to the employed representatives, which is enabled by the particularities of the problem at hand.
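A generic clustering-based weight-sharing step, the kind of baseline this abstract builds on, can be sketched with scikit-learn's KMeans as below; the paper's structured representatives (which allow many more centroids) are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_layer_weights(weights, n_centroids=64):
    """Quantize one layer's weights to a small codebook of centroids plus indices."""
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_centroids, n_init=4, random_state=0).fit(flat)
    codebook = km.cluster_centers_.ravel()                  # n_centroids shared values
    indices = km.labels_.astype(np.uint16)                  # one small index per weight
    reconstructed = codebook[indices].reshape(weights.shape)
    return codebook, indices, reconstructed
```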
arXiv Detail & Related papers (2021-07-19T18:22:07Z)
- Multi-task Over-the-Air Federated Learning: A Non-Orthogonal Transmission Approach [52.85647632037537]
We propose a multi-task over-the-air federated learning (MOAFL) framework, where multiple learning tasks share edge devices for data collection and learning models under the coordination of an edge server (ES).
Both the convergence analysis and numerical results demonstrate that the MOAFL framework can significantly reduce the uplink bandwidth consumption of multiple tasks without causing substantial learning performance degradation.
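The core of over-the-air aggregation is that simultaneous analog transmissions superpose on the channel, so the server receives the sum of the clients' updates in one shot; the toy simulation below illustrates only that principle, not MOAFL's non-orthogonal multi-task transmission design.

```python
import numpy as np

def over_the_air_aggregate(client_updates, noise_std=0.01, seed=0):
    """Simulate analog superposition: all clients transmit at once, the server
    receives their noisy sum and recovers the average (FedAvg-style) update."""
    rng = np.random.default_rng(seed)
    superposed = np.sum(client_updates, axis=0)              # waveform superposition
    received = superposed + rng.normal(0.0, noise_std, size=superposed.shape)
    return received / len(client_updates)

# updates = [np.random.randn(1000) * 0.1 for _ in range(8)]  # eight clients' model deltas
# global_update = over_the_air_aggregate(updates)
```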
arXiv Detail & Related papers (2021-06-27T13:09:32Z)
- Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) using a Conditional Variational Autoencoder (CVAE) for both seen and unseen classes.
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
arXiv Detail & Related papers (2020-04-01T19:05:28Z)