Random Search as a Baseline for Sparse Neural Network Architecture Search
- URL: http://arxiv.org/abs/2403.08265v2
- Date: Thu, 14 Mar 2024 05:18:08 GMT
- Title: Random Search as a Baseline for Sparse Neural Network Architecture Search
- Authors: Rezsa Farahani
- Abstract summary: Sparse neural networks have shown similar or better performance than their dense counterparts while having higher parameter efficiency.
This has motivated a number of works to learn or search for high-performing sparse networks.
We propose Random Search as a baseline algorithm for finding good sparse configurations and study its performance.
We observe that for this sparse architecture search task, sparse networks found by Random Search neither perform better nor converge more efficiently than their random counterparts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse neural networks have shown similar or better generalization performance than their dense counterparts while having higher parameter efficiency. This has motivated a number of works to learn or search for high-performing sparse networks. While reports of task performance or efficiency gains are impressive, standard baselines are lacking, leading to poor comparability and unreliable reproducibility across methods. In this work, we propose Random Search as a baseline algorithm for finding good sparse configurations and study its performance. We apply Random Search on the node space of an overparameterized network with the goal of finding better initialized sparse sub-networks that are positioned more advantageously in the loss landscape. We record the post-training performance of the found sparse networks at various levels of sparsity and compare against both their fully connected parent networks and random sparse configurations at the same sparsity levels. First, we demonstrate performance at different levels of sparsity and highlight that a significant level of performance can still be preserved even when the network is highly sparse. Second, we observe that for this sparse architecture search task, initialized sparse networks found by Random Search neither perform better nor converge more efficiently than their random counterparts. We therefore conclude that Random Search may be viewed as a reasonable neutral baseline for sparsity search methods.
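To make the search protocol above concrete, the sketch below is a minimal, assumption-laden illustration of node-level Random Search over an over-parameterized network, not the authors' implementation: the proxy used to rank candidates (loss of the masked network at initialization on a probe batch), the layer sizes, the search budget, and helper names such as MaskedMLP and random_node_mask are all illustrative assumptions, since the abstract does not specify these details.

```python
# Minimal sketch, not the authors' implementation. The candidate-scoring rule
# (initial loss on a probe batch) is an assumption made for illustration.
import torch
import torch.nn as nn

torch.manual_seed(0)

IN_DIM, HIDDEN, OUT_DIM = 784, 512, 10   # illustrative sizes
SPARSITY = 0.9                           # fraction of hidden nodes masked out
N_CANDIDATES = 100                       # Random Search budget

class MaskedMLP(nn.Module):
    """Over-parameterized parent network with a binary mask over hidden nodes."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(IN_DIM, HIDDEN)
        self.fc2 = nn.Linear(HIDDEN, OUT_DIM)
        self.register_buffer("mask", torch.ones(HIDDEN))

    def forward(self, x):
        h = torch.relu(self.fc1(x)) * self.mask   # zero out masked nodes
        return self.fc2(h)

def random_node_mask(sparsity: float) -> torch.Tensor:
    """Sample a random node-level sparse configuration at the given sparsity."""
    keep = int(HIDDEN * (1.0 - sparsity))
    mask = torch.zeros(HIDDEN)
    mask[torch.randperm(HIDDEN)[:keep]] = 1.0
    return mask

parent = MaskedMLP()                       # shared initialization for all candidates
x_probe = torch.randn(256, IN_DIM)         # stand-in for a real data batch
y_probe = torch.randint(0, OUT_DIM, (256,))
criterion = nn.CrossEntropyLoss()

def proxy_score(mask: torch.Tensor) -> float:
    """Assumed proxy: loss of the masked sub-network at initialization."""
    with torch.no_grad():
        parent.mask.copy_(mask)
        return criterion(parent(x_probe), y_probe).item()

# Random Search: sample node masks and keep the best-scoring sub-network.
candidates = [random_node_mask(SPARSITY) for _ in range(N_CANDIDATES)]
searched_mask = min(candidates, key=proxy_score)

# Control: a single random sparse configuration at the same sparsity level.
control_mask = random_node_mask(SPARSITY)

# In the paper's protocol, the searched and random sub-networks (and the dense
# parent) are then trained and their post-training performance is compared.
print(proxy_score(searched_mask), proxy_score(control_mask))
```

The control mask at the end mirrors the paper's comparison: the searched sub-network is judged against a random sparse configuration at the same sparsity level, as well as against the dense parent, after training.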
Related papers
- Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together! [100.19080749267316]
"Sparsity May Cry" Benchmark (SMC-Bench) is a collection of carefully-curated 4 diverse tasks with 10 datasets.
SMC-Bench is designed to favor and encourage the development of more scalable and generalizable sparse algorithms.
arXiv Detail & Related papers (2023-03-03T18:47:21Z)
- Why Random Pruning Is All We Need to Start Sparse [7.648170881733381]
Random masks define surprisingly effective sparse neural network models.
We show that sparser networks can compete with dense architectures and state-of-the-art lottery ticket pruning algorithms.
arXiv Detail & Related papers (2022-10-05T17:34:04Z)
- The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training [111.15069968583042]
Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive compared with either post-training pruning or sparse training.
We empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent; a minimal sketch of this random weight masking follows this entry.
Our results strongly suggest there is larger-than-expected room for sparse training at scale, and that the benefits of sparsity may extend beyond carefully designed pruning.
arXiv Detail & Related papers (2022-02-05T21:19:41Z)
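As a companion to the two random-pruning entries above, the following sketch shows the basic mechanics they rely on: randomly masking weights at a chosen sparsity level and then training the pruned network from scratch. It is a minimal, hedged illustration with assumed layer sizes and the hypothetical helper apply_random_weight_masks, not the code of either paper.

```python
# Minimal sketch of random weight-level pruning before sparse training from
# scratch; layer sizes and helper names are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
SPARSITY = 0.8   # fraction of weights removed per layer

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))

def apply_random_weight_masks(net: nn.Module, sparsity: float):
    """Randomly zero a fraction of each Linear layer's weights and keep them
    zero during training by masking the gradients as well."""
    for module in net.modules():
        if isinstance(module, nn.Linear):
            mask = (torch.rand_like(module.weight) > sparsity).float()
            with torch.no_grad():
                module.weight.mul_(mask)
            # Gradient hook: pruned connections never receive updates.
            module.weight.register_hook(lambda grad, m=mask: grad * m)

apply_random_weight_masks(model, SPARSITY)

# One illustrative optimization step on random data; a real run would train
# the randomly pruned network from scratch on the full dataset.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
opt.step()
```

The gradient hook keeps pruned connections at zero throughout training, which is what sparse training from scratch amounts to at the weight level.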
- Community detection using low-dimensional network embedding algorithms [1.052782170493037]
We rigorously characterize the performance of two major algorithms, DeepWalk and node2vec, in recovering communities for canonical network models.
We prove that, given some fixed co-occurrence window, node2vec using random walks with a low non-backtracking probability can succeed for much sparser networks.
arXiv Detail & Related papers (2021-11-04T14:57:43Z)
- A Sparse Structure Learning Algorithm for Bayesian Network Identification from Discrete High-Dimensional Data [0.40611352512781856]
This paper addresses the problem of learning a sparse structure Bayesian network from high-dimensional discrete data.
We propose a score function that satisfies the sparsity and the DAG property simultaneously.
Specifically, we use a variance-reduction method in our optimization algorithm so that it works efficiently on high-dimensional data.
arXiv Detail & Related papers (2021-08-21T12:21:01Z)
- Combined Depth Space based Architecture Search For Person Re-identification [70.86236888223569]
We aim to design a lightweight network suited to person re-identification (ReID).
We propose a novel search space called Combined Depth Space (CDS), based on which we search for an efficient network architecture, which we call CDNet.
We then propose a low-cost search strategy, named the Top-k Sample Search strategy, to make full use of the search space and avoid becoming trapped in local optima.
arXiv Detail & Related papers (2021-04-09T02:40:01Z)
- Neural Architecture Search as Sparse Supernet [78.09905626281046]
This paper aims to extend the Neural Architecture Search (NAS) problem from Single-Path and Multi-Path Search to automated Mixed-Path Search.
We model the NAS problem as a sparse supernet using a new continuous architecture representation with a mixture of sparsity constraints.
The sparse supernet enables us to automatically achieve sparsely-mixed paths upon a compact set of nodes.
arXiv Detail & Related papers (2020-07-31T14:51:52Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank-correlation coefficient on the selected set of candidates; a minimal predictor sketch follows this entry.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
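To make the last entry above concrete, here is a minimal sketch of the general idea of a graph-based performance predictor: a small graph-convolution model (the hypothetical TinyGCN) regresses the accuracies of sampled sub-network encodings, and predictions are evaluated with a rank-correlation coefficient (Kendall's tau). The encodings and accuracy targets below are synthetic placeholders; this is not the paper's encoding, architecture, or training setup.

```python
# Minimal sketch of fitting sampled sub-network performance with a small
# graph-convolution model; architecture encodings and accuracy targets are
# synthetic placeholders, not the paper's data.
import torch
import torch.nn as nn
from scipy.stats import kendalltau

torch.manual_seed(0)
N_ARCH, N_NODES, N_OPS = 200, 7, 5   # sampled sub-networks, graph nodes, op types

# Toy dataset: random 0/1 adjacency matrices plus one-hot op features per node,
# with synthetic "accuracies" standing in for measured sub-network performance.
adj = torch.rand(N_ARCH, N_NODES, N_NODES).round()
ops = torch.eye(N_OPS)[torch.randint(0, N_OPS, (N_ARCH, N_NODES))]
acc = torch.rand(N_ARCH)

class TinyGCN(nn.Module):
    """Two graph-convolution layers, mean readout, and a scalar regressor."""
    def __init__(self, in_dim=N_OPS, hidden=32):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, a, x):
        a_hat = a + torch.eye(N_NODES)               # add self-loops
        deg = a_hat.sum(-1, keepdim=True).clamp(min=1.0)
        a_norm = a_hat / deg                         # row-normalized adjacency
        h = torch.relu(a_norm @ self.lin1(x))        # message passing, layer 1
        h = torch.relu(a_norm @ self.lin2(h))        # message passing, layer 2
        return self.out(h.mean(dim=1)).squeeze(-1)   # graph-level prediction

predictor = TinyGCN()
opt = torch.optim.Adam(predictor.parameters(), lr=1e-2)
train, test = slice(0, 150), slice(150, N_ARCH)

for _ in range(200):                                 # fit the predictor
    opt.zero_grad()
    loss = nn.functional.mse_loss(predictor(adj[train], ops[train]), acc[train])
    loss.backward()
    opt.step()

# Rank correlation between predicted and "true" accuracies on held-out samples.
with torch.no_grad():
    pred = predictor(adj[test], ops[test])
tau, _ = kendalltau(pred.numpy(), acc[test].numpy())
print("Kendall tau:", tau)
```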