Inter-layer Transition in Neural Architecture Search
- URL: http://arxiv.org/abs/2011.14525v1
- Date: Mon, 30 Nov 2020 03:33:52 GMT
- Title: Inter-layer Transition in Neural Architecture Search
- Authors: Benteng Ma, Jing Zhang, Yong Xia, Dacheng Tao
- Abstract summary: The dependency between the architecture weights of connected edges is explicitly modeled in this paper.
Experiments on five benchmarks confirm the value of modeling inter-layer dependency and demonstrate that the proposed method outperforms state-of-the-art methods.
- Score: 89.00449751022771
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differentiable Neural Architecture Search (NAS) methods represent the network
architecture as a repetitive proxy directed acyclic graph (DAG) and optimize
the network weights and architecture weights alternately in a differentiable
manner. However, existing methods model the architecture weights on each edge
(i.e., a layer in the network) as statistically independent variables, ignoring
the dependency between edges in the DAG induced by their directed topological
connections. In this paper, we make the first attempt to investigate such
dependency by proposing a novel Inter-layer Transition NAS method. It casts the
architecture optimization into a sequential decision process where the
dependency between the architecture weights of connected edges is explicitly
modeled. Specifically, edges are divided into inner and outer groups according
to whether or not their predecessor edges are in the same cell. While the
architecture weights of outer edges are optimized independently, those of inner
edges are derived sequentially based on the architecture weights of their
predecessor edges and the learnable transition matrices in an attentive
probability transition manner. Experiments on five benchmarks confirm the value
of modeling inter-layer dependency and demonstrate that the proposed method
outperforms state-of-the-art methods.
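The sequential derivation described above lends itself to a short illustration. Below is a minimal sketch, assuming softmax-normalized architecture weights, one learnable row-stochastic transition matrix per inner edge, and a scalar attention gate for mixing; all names and the exact blending rule are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterLayerTransition(nn.Module):
    """Sketch: derive an inner edge's architecture weights from its predecessor
    edge's weights via a learnable probability-transition matrix, blended with
    an independent term through an attention-style gate. Illustrative only."""

    def __init__(self, num_ops: int):
        super().__init__()
        self.transition = nn.Parameter(torch.eye(num_ops))    # learnable op-to-op transition logits
        self.own_logits = nn.Parameter(torch.zeros(num_ops))  # edge's independent preference
        self.attn_logit = nn.Parameter(torch.zeros(1))        # how much to trust the predecessor

    def forward(self, pred_weights: torch.Tensor) -> torch.Tensor:
        # pred_weights: softmax-normalized architecture weights of the predecessor edge
        P = F.softmax(self.transition, dim=-1)        # row-stochastic transition matrix
        transited = pred_weights @ P                  # probability transition from the predecessor
        own = F.softmax(self.own_logits, dim=-1)
        gate = torch.sigmoid(self.attn_logit)         # attentive mixing coefficient
        return gate * transited + (1.0 - gate) * own  # still a distribution over operations

# Outer edges (predecessors in a different cell) keep independently optimized weights;
# inner edges derive theirs sequentially from them:
num_ops = 8
outer = F.softmax(torch.randn(num_ops), dim=-1)
inner = InterLayerTransition(num_ops)(outer)
print(inner.sum())  # ~1.0
```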
Related papers
- Detecting and Approximating Redundant Computational Blocks in Neural Networks [25.436785396394804]
Intra-network similarities present new opportunities for designing more efficient neural networks.
We introduce a simple metric, Block Redundancy, to detect redundant blocks, and propose Redundant Blocks Approximation (RBA) to approximate redundant blocks.
RBA reduces model parameters and time complexity while maintaining good performance.
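As a rough illustration of what a block-redundancy test could look like, the sketch below scores a block by the cosine similarity between its input and output features; the actual Block Redundancy metric and the RBA approximation step may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def block_redundancy(block_in: torch.Tensor, block_out: torch.Tensor) -> float:
    """Hypothetical redundancy score: cosine similarity between a block's input and
    output features. A value near 1 means the block barely transforms its input
    and may be a candidate for approximation or removal."""
    return F.cosine_similarity(block_in.flatten(1), block_out.flatten(1), dim=1).mean().item()

x = torch.randn(16, 256)
block = nn.Sequential(nn.Linear(256, 256), nn.ReLU())
print(block_redundancy(x, x + 0.05 * block(x)))  # near-identity residual block -> score close to 1
```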
arXiv Detail & Related papers (2024-10-07T11:35:24Z)
- Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation [54.50526986788175]
Recent advances in efficient sequence modeling have led to attention-free layers, such as Mamba, RWKV, and various gated RNNs.
We present a unified view of these models, formulating such layers as implicit causal self-attention layers.
Our framework compares the underlying mechanisms on similar grounds for different layers and provides a direct means for applying explainability methods.
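A scalar toy version of this view: a gated linear recurrence can be unrolled into a lower-triangular, attention-like matrix that maps inputs to outputs. The sketch below materializes that matrix for a one-dimensional recurrence; it is only meant to convey the idea, not the paper's exact formulation for Mamba or RWKV.

```python
import torch

def implicit_attention(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """For a scalar gated linear recurrence h_t = a_t * h_{t-1} + b_t * x_t, the
    output satisfies h_t = sum_{j<=t} A[t, j] * x_j with
    A[t, j] = b_j * prod_{k=j+1..t} a_k. Materializing A exposes a causal,
    attention-like matrix that standard explainability tools can inspect."""
    T = a.shape[0]
    A = torch.zeros(T, T)
    for t in range(T):
        for j in range(t + 1):
            A[t, j] = b[j] * torch.prod(a[j + 1 : t + 1])  # empty product = 1 when j == t
    return A

a = torch.sigmoid(torch.randn(6))  # forget-style gates in (0, 1)
b = torch.sigmoid(torch.randn(6))  # input gates
print(implicit_attention(a, b))    # lower-triangular "attention" matrix
```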
arXiv Detail & Related papers (2024-05-26T09:57:45Z)
- DepGraph: Towards Any Structural Pruning [68.40343338847664]
We study general structural pruning of arbitrary architectures like CNNs, RNNs, GNNs and Transformers.
We propose a general and fully automatic method, Dependency Graph (DepGraph), to explicitly model the dependency between layers and comprehensively group parameters for pruning.
In this work, we extensively evaluate our method on several architectures and tasks, including ResNe(X)t, DenseNet, MobileNet and Vision transformer for images, GAT for graph, DGCNN for 3D point cloud, alongside LSTM for language, and demonstrate its effectiveness.
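To make the notion of inter-layer dependency concrete, the toy sketch below groups a Conv2d with the BatchNorm2d it feeds, since their channels must be pruned together; DepGraph constructs such groups automatically for arbitrary graphs, which this fragment does not attempt.

```python
import torch.nn as nn

def coupled_pruning_groups(model: nn.Sequential):
    """Toy dependency grouping: when a Conv2d feeds a BatchNorm2d, removing an
    output channel of the conv forces removing the matching BN channel, so their
    parameters must be pruned as one group. Handles only the sequential Conv->BN
    case; real dependency graphs cover arbitrary layer couplings."""
    layers = list(model)
    return [(prev, nxt) for prev, nxt in zip(layers, layers[1:])
            if isinstance(prev, nn.Conv2d) and isinstance(nxt, nn.BatchNorm2d)]

net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU(),
                    nn.Conv2d(16, 32, 3), nn.BatchNorm2d(32))
print(len(coupled_pruning_groups(net)))  # 2 coupled (conv, bn) groups
```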
arXiv Detail & Related papers (2023-01-30T14:02:33Z)
- Rethinking Architecture Selection in Differentiable NAS [74.61723678821049]
Differentiable Neural Architecture Search is one of the most popular NAS methods for its search efficiency and simplicity.
We propose an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet.
We find that several failure modes of DARTS can be greatly alleviated with the proposed selection method.
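A hedged sketch of the idea: instead of reading off the largest architecture weight, drop each candidate operation in turn and keep the one whose removal hurts validation accuracy the most. The `eval_with_mask` callable below is an assumed interface, not the paper's code.

```python
import torch

def select_operation(num_ops: int, eval_with_mask) -> int:
    """Perturbation-style selection sketch: drop each candidate operation on an
    edge (by zeroing its mask entry), re-evaluate the supernet, and return the
    operation whose removal degrades validation accuracy the most."""
    base = eval_with_mask(torch.ones(num_ops))
    drops = []
    for k in range(num_ops):
        mask = torch.ones(num_ops)
        mask[k] = 0.0                              # perturbation: disable operation k
        drops.append(base - eval_with_mask(mask))  # accuracy drop caused by removing it
    return int(torch.tensor(drops).argmax())

# Toy evaluator that "needs" operation 2: removing it costs 0.2 accuracy.
fake_eval = lambda m: 0.9 - 0.2 * (1.0 - m[2]).item()
print(select_operation(4, fake_eval))  # -> 2
```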
arXiv Detail & Related papers (2021-08-10T00:53:39Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
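A toy version of an implicit-function-theorem hypergradient, with the inverse Hessian-vector product approximated by a short Neumann series, is sketched below; step sizes, series length, and the scalar setup are illustrative assumptions rather than the iDARTS algorithm as published.

```python
import torch

def implicit_hypergrad(val_loss, train_loss, w, alpha, K=3, gamma=0.01):
    """Implicit-function-theorem hypergradient sketch:
    dL_val/dalpha ~= (direct term) - d2L_train/(dalpha dw) . H^-1 . dL_val/dw,
    where the inverse Hessian-vector product H^-1 v is approximated by a K-term
    Neumann series with step size gamma. Scalar toy, not a drop-in training routine."""
    v = torch.autograd.grad(val_loss, w, retain_graph=True)[0]       # dL_val/dw
    gw = torch.autograd.grad(train_loss, w, create_graph=True)[0]    # dL_train/dw (keeps graph)
    p, acc = v.clone(), v.clone()
    for _ in range(K):                                               # Neumann series terms
        hvp = torch.autograd.grad(gw, w, grad_outputs=p, retain_graph=True)[0]
        p = p - gamma * hvp
        acc = acc + p
    mixed = torch.autograd.grad(gw, alpha, grad_outputs=acc)[0]      # second-order mixed term
    direct = torch.autograd.grad(val_loss, alpha, allow_unused=True)[0]
    direct = torch.zeros_like(alpha) if direct is None else direct
    return direct - gamma * mixed

w = torch.tensor(2.0, requires_grad=True)       # "network weight"
alpha = torch.tensor(0.5, requires_grad=True)   # "architecture parameter"
print(implicit_hypergrad((w - 1.0) ** 2, (w - alpha) ** 2, w, alpha))
```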
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
- Adversarially Robust Neural Architectures [43.74185132684662]
This paper aims to improve the adversarial robustness of the network from the architecture perspective within a NAS framework.
We explore the relationship among adversarial robustness, Lipschitz constant, and architecture parameters.
Our algorithm empirically achieves the best performance among all the models under various attacks on different datasets.
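One common handle on robustness at the layer level is the Lipschitz constant; for a linear layer (in the L2 norm) it equals the largest singular value of the weight matrix, which power iteration estimates cheaply, as sketched below. The paper's actual constraint ties such constants to the architecture parameters, which this fragment does not model.

```python
import torch

def spectral_norm_estimate(W: torch.Tensor, iters: int = 100) -> float:
    """Power-iteration estimate of the largest singular value of W, which is the
    Lipschitz constant (w.r.t. the L2 norm) of the linear map x -> W x."""
    v = torch.randn(W.shape[1])
    for _ in range(iters):
        u = W @ v
        u = u / (u.norm() + 1e-12)
        v = W.t() @ u
        v = v / (v.norm() + 1e-12)
    return (u @ W @ v).item()

W = torch.randn(64, 128)
print(spectral_norm_estimate(W), torch.linalg.matrix_norm(W, ord=2).item())  # should agree closely
```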
arXiv Detail & Related papers (2020-09-02T08:52:15Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
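The divide-and-conquer control flow, stripped of everything specific to DC-NAS, can be sketched as: split the candidate pool, score each part by one sampled representative, and recurse into the most promising part. The evaluator, partitioning scheme, and recursion depth below are placeholders, not the paper's procedure.

```python
import random

def dc_search(candidates, evaluate, num_parts=4, depth=2):
    """Generic divide-and-conquer search: split the candidate pool into parts,
    score each part by one randomly sampled representative, and recurse into the
    most promising part. Purely a control-flow illustration."""
    if depth == 0 or len(candidates) <= num_parts:
        return max(candidates, key=evaluate)           # exhaustive base case
    size = (len(candidates) + num_parts - 1) // num_parts
    parts = [candidates[i:i + size] for i in range(0, len(candidates), size)]
    best_part = max(parts, key=lambda p: evaluate(random.choice(p)))
    return dc_search(best_part, evaluate, num_parts, depth - 1)

# Toy usage: "architectures" are integers scored by closeness to 50.
print(dc_search(list(range(100)), evaluate=lambda c: -abs(c - 50)))
```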
arXiv Detail & Related papers (2020-05-29T09:02:16Z)