Adversarially Robust Neural Architectures
- URL: http://arxiv.org/abs/2009.00902v1
- Date: Wed, 2 Sep 2020 08:52:15 GMT
- Title: Adversarially Robust Neural Architectures
- Authors: Minjing Dong, Yanxi Li, Yunhe Wang and Chang Xu
- Abstract summary: This paper aims to improve the adversarial robustness of the network from the architecture perspective within a NAS framework.
We explore the relationship among adversarial robustness, Lipschitz constant, and architecture parameters.
Our algorithm empirically achieves the best performance among all the models under various attacks on different datasets.
- Score: 43.74185132684662
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are vulnerable to adversarial attacks.
Existing methods are devoted to developing various robust training strategies
or regularizations that update the weights of the neural network. But beyond
the weights, the overall structure and information flow in the network are
explicitly determined by the neural architecture, which remains unexplored.
This paper thus aims to improve the adversarial robustness of the network from
the architecture perspective within a NAS framework. We explore the
relationship among adversarial robustness, the Lipschitz constant, and the
architecture parameters, and show that an appropriate constraint on the
architecture parameters can reduce the Lipschitz constant and thereby further
improve robustness. In standard NAS frameworks, all architecture parameters
are treated equally when the discrete architecture is sampled from the
supernet. However, the importance of architecture parameters can vary from
operation to operation or from connection to connection, which has not been
explored and might reduce the confidence of robust architecture sampling. We
therefore propose to sample architecture parameters from trainable
multivariate log-normal distributions, with which the Lipschitz constant of
the entire network can be approximated by a univariate log-normal distribution
whose mean and variance are related to the architecture parameters. Compared
with adversarially trained neural architectures searched by various NAS
algorithms, as well as efficient human-designed models, our algorithm
empirically achieves the best performance among all the models under various
attacks on different datasets.
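The log-normal machinery behind the abstract can be illustrated with a short sketch. Under a simplifying assumption (independent log-normal architecture parameters scaling a serial composition of operations, so the network's Lipschitz bound is a product of per-operation terms), the log of the product is a sum of Gaussians, which is why the overall bound is again log-normal with mean and variance that are simple sums. This is a minimal illustration of that closure property, not the paper's actual algorithm; the function names (`sample_lipschitz_bound`, etc.) are hypothetical.

```python
import math
import random


def lognormal_product_params(mus, sigmas):
    """If each alpha_i ~ LogNormal(mu_i, sigma_i^2) independently, then
    prod_i alpha_i ~ LogNormal(sum_i mu_i, sum_i sigma_i^2), because the
    log of a product of log-normals is a sum of Gaussians."""
    return sum(mus), math.sqrt(sum(s * s for s in sigmas))


def sample_lipschitz_bound(op_lipschitz, mus, sigmas, rng=random):
    """Sample architecture parameters alpha_i ~ LogNormal(mu_i, sigma_i^2)
    and return the resulting Lipschitz upper bound for a serial composition
    of operations with per-operation constants L_i: L <= prod_i alpha_i * L_i."""
    bound = 1.0
    for L_i, mu, sigma in zip(op_lipschitz, mus, sigmas):
        alpha = rng.lognormvariate(mu, sigma)
        bound *= alpha * L_i
    return bound


def expected_bound(op_lipschitz, mus, sigmas):
    """Closed-form mean of the sampled bound, using
    E[LogNormal(mu, s^2)] = exp(mu + s^2 / 2)."""
    mu, sigma = lognormal_product_params(mus, sigmas)
    log_L = sum(math.log(L) for L in op_lipschitz)
    return math.exp(mu + log_L + sigma ** 2 / 2)
```

Because the approximate bound has this closed form, a mean-plus-variance penalty on it can be added to the training loss and differentiated with respect to the distribution parameters, which is the kind of constraint on architecture parameters the abstract alludes to.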
Related papers
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can easily change when the networks are trained with better hyperparameters.
(arXiv, 2024-02-27)
- Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
(arXiv, 2023-05-26)
- Learning Interpretable Models Through Multi-Objective Neural Architecture Search [0.9990687944474739]
We propose a framework to optimize for both task performance and "introspectability," a surrogate metric for aspects of interpretability.
We demonstrate that jointly optimizing for task error and introspectability leads to more disentangled and debuggable architectures that perform on par (within error) with architectures optimized for task performance alone.
(arXiv, 2021-12-16)
- Rethinking Architecture Selection in Differentiable NAS [74.61723678821049]
Differentiable Neural Architecture Search is one of the most popular NAS methods for its search efficiency and simplicity.
We propose an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet.
We find that several failure modes of DARTS can be greatly alleviated with the proposed selection method.
(arXiv, 2021-08-10)
- Reframing Neural Networks: Deep Structure in Overcomplete Representations [41.84502123663809]
We introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames.
We quantify structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability.
This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design.
(arXiv, 2021-03-10)
- Inter-layer Transition in Neural Architecture Search [89.00449751022771]
The dependency between the architecture weights of connected edges is explicitly modeled in this paper.
Experiments on five benchmarks confirm the value of modeling inter-layer dependency and demonstrate the proposed method outperforms state-of-the-art methods.
(arXiv, 2020-11-30)
- Disentangling Neural Architectures and Weights: A Case Study in Supervised Classification [8.976788958300766]
This work investigates the problem of disentangling the role of the neural structure and its edge weights.
We show that well-trained architectures may not need any link-specific fine-tuning of the weights.
We use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.
(arXiv, 2020-09-11)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
(arXiv, 2020-05-14)
- Dataless Model Selection with the Deep Frame Potential [45.16941644841897]
We quantify networks by their intrinsic capacity for unique and robust representations.
We propose the deep frame potential: a measure of coherence that is approximately related to representation stability but has minimizers that depend only on network structure.
We validate its use as a criterion for model selection and demonstrate correlation with generalization error on a variety of common residual and densely connected network architectures.
(arXiv, 2020-03-30)
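The perturbation-based architecture selection mentioned in the "Rethinking Architecture Selection in Differentiable NAS" entry can be sketched as follows: instead of ranking candidate operations on a supernet edge by their learned architecture weights, each operation is removed in turn and the drop in supernet validation accuracy is measured; the operation whose removal hurts most is kept. This is a minimal, hypothetical sketch, not that paper's code; the `evaluate` callback and all names are assumptions.

```python
def select_operation(edge_ops, evaluate):
    """Perturbation-based selection for one supernet edge.

    edge_ops:  list of candidate operation names on the edge.
    evaluate:  callback mapping the set of *active* ops on this edge to the
               supernet's validation accuracy (hypothetical interface).

    Rather than trusting the learned architecture weights, remove each
    operation in turn and keep the one whose removal degrades accuracy most.
    """
    baseline = evaluate(set(edge_ops))
    drops = {}
    for op in edge_ops:
        kept = set(edge_ops) - {op}
        drops[op] = baseline - evaluate(kept)
    return max(drops, key=drops.get)


# Toy usage: a stand-in accuracy model where 'conv3x3' matters most,
# so removing it causes the largest drop and it gets selected.
weights = {"conv3x3": 0.30, "skip": 0.05, "zero": 0.0}
acc = lambda active: 0.6 + sum(weights[op] for op in active)
best = select_operation(["conv3x3", "skip", "zero"], acc)
```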
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.