Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
- URL: http://arxiv.org/abs/2303.02141v1
- Date: Fri, 3 Mar 2023 18:47:21 GMT
- Title: Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
- Authors: Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang,
Ajay Jaiswal, Zhangyang Wang
- Abstract summary: The "Sparsity May Cry" Benchmark (SMC-Bench) is a collection of 4 carefully curated, diverse tasks with 10 datasets.
SMC-Bench is designed to favor and encourage the development of more scalable and generalizable sparse algorithms.
- Score: 100.19080749267316
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sparse Neural Networks (SNNs) have received voluminous attention,
predominantly due to the growing computational and memory footprints of
ever-larger parameter counts in large-scale models. Similar to their dense
counterparts, recent SNNs generalize just as well and come with numerous
favorable benefits (e.g., low complexity, high scalability, and robustness),
sometimes even surpassing the original dense networks. While research effort
is focused on developing increasingly sophisticated sparse algorithms, it is
startling that a comprehensive benchmark for evaluating the effectiveness of
these algorithms has been largely overlooked. In the absence of a carefully
crafted evaluation benchmark, most, if not all, sparse algorithms are
evaluated against fairly simple and naive tasks (e.g., CIFAR, ImageNet, GLUE),
which can camouflage many advantages as well as unexpected predicaments of
SNNs. In pursuit of a more general evaluation that unveils the true potential
of sparse algorithms, we introduce the "Sparsity May Cry" Benchmark
(SMC-Bench), a collection of 4 carefully curated, diverse tasks with 10
datasets, chosen to capture a wide range of domain-specific and sophisticated
knowledge. Our systematic evaluation of the most representative sparse
algorithms reveals an important but previously obscured observation:
state-of-the-art magnitude- and/or gradient-based sparse algorithms seemingly
fail on SMC-Bench when applied out of the box, sometimes at sparsity levels
as low as 5%. By incorporating these well-thought-out and diverse tasks,
SMC-Bench is designed to favor and encourage the development of more scalable
and generalizable sparse algorithms.
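To ground the terminology above: "magnitude-based sparse algorithms" prune the weights with the smallest absolute values. Below is a minimal NumPy sketch of one-shot global magnitude pruning; the layer shapes and the 5% sparsity level are illustrative only and are not taken from SMC-Bench.

```python
import numpy as np

def global_magnitude_prune(weights, sparsity):
    """One-shot global magnitude pruning: zero out the `sparsity` fraction of
    weights with the smallest absolute value, pooled across all layers."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    k = int(sparsity * all_mags.size)                 # number of weights to remove
    if k == 0:
        return [np.ones_like(w) for w in weights]
    threshold = np.partition(all_mags, k - 1)[k - 1]  # k-th smallest magnitude
    return [(np.abs(w) > threshold).astype(w.dtype) for w in weights]

# Illustrative layer shapes and sparsity level (not SMC-Bench settings).
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 128)), rng.standard_normal((128, 10))]
masks = global_magnitude_prune(layers, sparsity=0.05)
pruned = [w * m for w, m in zip(layers, masks)]
print([round(1 - float(m.mean()), 3) for m in masks])  # achieved per-layer sparsity
```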
Related papers
- Tight Verification of Probabilistic Robustness in Bayesian Neural
Networks [17.499817915644467]
We introduce two algorithms for computing tight guarantees on the probabilistic robustness of Bayesian Neural Networks (BNNs)
Our algorithms efficiently search the parameters' space for safe weights by using iterative expansion and the network's gradient.
In addition to proving that our algorithms compute tighter bounds than the SoA, we also evaluate our algorithms against the SoA on standard benchmarks.
arXiv Detail & Related papers (2024-01-21T23:41:32Z) - Pursing the Sparse Limitation of Spiking Deep Learning Structures [42.334835610250714]
Spiking Neural Networks (SNNs) are garnering increased attention for their superior computation and energy efficiency.
We introduce an innovative algorithm capable of simultaneously identifying both weight and patch-level winning tickets.
We demonstrate that our spiking lottery ticket achieves comparable or superior performance even when the model structure is extremely sparse.
arXiv Detail & Related papers (2023-11-18T17:00:40Z) - Scalable Clustering: Large Scale Unsupervised Learning of Gaussian
Mixture Models with Outliers [5.478764356647437]
This paper introduces a provably robust clustering algorithm based on loss minimization.
It provides theoretical guarantees that the algorithm obtains high accuracy with high probability.
Experiments on real-world large-scale datasets demonstrate the effectiveness of the algorithm (a simplified trimmed-loss clustering sketch appears after this list).
arXiv Detail & Related papers (2023-02-28T14:39:18Z) - Searching Large Neighborhoods for Integer Linear Programs with
Contrastive Learning [39.40838358438744]
Integer Linear Programs (ILPs) are powerful tools for modeling and solving a large number of optimization problems.
Large Neighborhood Search (LNS), as a heuristic algorithm, can find high-quality solutions to ILPs faster than Branch and Bound.
We propose a novel approach, CL-LNS, that delivers state-of-the-art anytime performance on several ILP benchmarks across multiple metrics (a minimal LNS destroy-and-repair loop is sketched after this list).
arXiv Detail & Related papers (2023-02-03T07:15:37Z) - Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z) - A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs)
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z) - Inability of a graph neural network heuristic to outperform greedy
algorithms in solving combinatorial optimization problems like Max-Cut [0.0]
In Nature Machine Intelligence 4, 367 (2022), Schuetz et al. provide a scheme that employs graph neural networks (GNNs) to solve a variety of classical, NP-hard optimization problems.
The paper describes how the network is trained on sample instances and how the resulting GNN is evaluated using widely adopted techniques to determine its ability to succeed.
However, closer inspection shows that the reported results for this GNN are only marginally better than those of gradient descent and are outperformed by a greedy algorithm (a minimal greedy Max-Cut baseline is sketched after this list).
arXiv Detail & Related papers (2022-10-02T20:50:33Z) - Learning to Detect Critical Nodes in Sparse Graphs via Feature Importance Awareness [53.351863569314794]
The critical node problem (CNP) aims to find a set of critical nodes from a network whose deletion maximally degrades the pairwise connectivity of the residual network.
This work proposes a feature importance-aware graph attention network for node representation.
This representation is combined with a dueling double deep Q-network to create an end-to-end algorithm that solves the CNP for the first time (the pairwise-connectivity objective is sketched after this list).
arXiv Detail & Related papers (2021-12-03T14:23:05Z) - Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks (a minimal sketch of the schedule appears after this list).
arXiv Detail & Related papers (2021-06-18T01:03:13Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
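For the "Scalable Clustering" entry above: a deliberately simplified sketch of robust clustering by loss minimization, using trimmed k-means rather than the paper's full Gaussian-mixture procedure; the data, trimming fraction, and function names are invented for illustration.

```python
import numpy as np

def trimmed_kmeans(X, k, trim_frac=0.1, iters=50, seed=0):
    """Robust clustering by loss minimization: points with the largest squared
    distance to their nearest center (likely outliers) are excluded ("trimmed")
    before the centers are re-estimated."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (n, k)
        assign = d2.argmin(1)
        loss = d2[np.arange(len(X)), assign]
        keep = loss <= np.quantile(loss, 1.0 - trim_frac)  # drop the worst points
        for j in range(k):
            pts = X[keep & (assign == j)]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers, assign

# Illustrative data: two Gaussian blobs plus a few gross outliers.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)),
               rng.normal(8, 1, (100, 2)),
               rng.uniform(-50, 50, (10, 2))])   # outliers
centers, _ = trimmed_kmeans(X, k=2)
print(np.round(centers, 1))
```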
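For the CL-LNS entry above: a minimal Large Neighborhood Search loop on a toy 0/1 knapsack ILP, where each iteration destroys a random subset of variables and greedily repairs them. The random destroy heuristic stands in for the paper's learned contrastive-learning heuristic, and the instance data are made up.

```python
import random

# Toy 0/1 knapsack ILP: maximize sum(v*x) subject to sum(w*x) <= cap, x binary.
random.seed(0)
n, cap = 50, 100
values = [random.randint(1, 20) for _ in range(n)]
weights = [random.randint(1, 20) for _ in range(n)]

def repair(x, free):
    """Greedy repair: add freed items by value density while capacity allows."""
    used = sum(weights[i] for i in range(n) if x[i])
    for i in sorted(free, key=lambda i: values[i] / weights[i], reverse=True):
        if not x[i] and used + weights[i] <= cap:
            x[i], used = 1, used + weights[i]
    return x

best = repair([0] * n, range(n))                # initial greedy solution
best_val = sum(v for v, xi in zip(values, best) if xi)

for _ in range(200):                            # LNS iterations
    x = best[:]
    destroyed = random.sample(range(n), k=10)   # "destroy": unfix 10 variables
    for i in destroyed:
        x[i] = 0
    x = repair(x, destroyed)                    # "repair" within the neighborhood
    val = sum(v for v, xi in zip(values, x) if xi)
    if val > best_val:                          # keep only improving solutions
        best, best_val = x, val

print("best value:", best_val)
```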
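For the Max-Cut entry above: the kind of simple greedy baseline the critique refers to, which keeps flipping any vertex whose move increases the cut until no single flip helps. The random unit-weight graph is purely illustrative.

```python
import random

def greedy_max_cut(n, edges):
    """Greedy local search for Max-Cut: start from a random partition and flip
    vertices whose move increases the cut weight until a local optimum."""
    side = [random.randint(0, 1) for _ in range(n)]
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    improved = True
    while improved:
        improved = False
        for u in range(n):
            # gain of flipping u = (same-side edge weight) - (cross edge weight)
            gain = sum(w if side[u] == side[v] else -w for v, w in adj[u])
            if gain > 0:
                side[u] ^= 1
                improved = True
    cut = sum(w for u, v, w in edges if side[u] != side[v])
    return side, cut

# Illustrative random graph with unit edge weights.
random.seed(0)
n = 30
edges = [(u, v, 1) for u in range(n) for v in range(u + 1, n) if random.random() < 0.2]
_, cut_value = greedy_max_cut(n, edges)
print("edges:", len(edges), "cut:", cut_value)
```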
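For the critical-node entry above: a small sketch of the pairwise-connectivity objective (the number of vertex pairs still connected after deletion) together with a naive greedy selector, not the paper's learned graph-attention plus dueling double DQN policy; the example graph is made up.

```python
def pairwise_connectivity(n, edges, removed):
    """Number of vertex pairs that remain connected after deleting `removed`."""
    alive = [v for v in range(n) if v not in removed]
    adj = {v: set() for v in alive}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    seen, total = set(), 0
    for s in alive:                      # measure each connected component
        if s in seen:
            continue
        stack, comp = [s], 0
        seen.add(s)
        while stack:
            x = stack.pop()
            comp += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        total += comp * (comp - 1) // 2  # C(comp, 2) connected pairs
    return total

def greedy_cnp(n, edges, k):
    """Greedily remove the node that most reduces pairwise connectivity."""
    removed = set()
    for _ in range(k):
        best = min((v for v in range(n) if v not in removed),
                   key=lambda v: pairwise_connectivity(n, edges, removed | {v}))
        removed.add(best)
    return removed

# Illustrative graph: two triangles joined by a bridge node 6.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 6), (6, 3)]
print(greedy_cnp(7, edges, k=1))   # removing the bridge node 6 minimizes connectivity
```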
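For the grow-and-prune entry above: a minimal sketch of the schedule idea, rotating which layer is temporarily dense ("grown") and re-pruning it by magnitude after each phase, so every layer gets dense training time without ever pre-training the whole dense model. The tiny layers, dummy update rule, and phase lengths are invented for illustration.

```python
import numpy as np

def magnitude_mask(w, sparsity):
    """Binary mask keeping the largest-magnitude (1 - sparsity) fraction."""
    k = int(sparsity * w.size)
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return (np.abs(w) > thresh).astype(w.dtype)

rng = np.random.default_rng(0)
sparsity = 0.8                                   # target per-layer sparsity
layers = [rng.standard_normal((32, 32)) for _ in range(4)]
masks = [magnitude_mask(w, sparsity) for w in layers]
layers = [w * m for w, m in zip(layers, masks)]  # start from the pruned net

def dummy_train_step(layers, masks, grown):
    """Stand-in for an SGD step: the grown layer trains densely, the others
    only update their unpruned weights (real code would use task gradients)."""
    for i, (w, m) in enumerate(zip(layers, masks)):
        update = 0.01 * rng.standard_normal(w.shape)
        w += update if i == grown else update * m

# GaP schedule: grow one layer, train for a phase, re-prune it, move on.
for phase in range(8):
    grown = phase % len(layers)
    masks[grown] = np.ones_like(layers[grown])               # grow: lift the mask
    for _ in range(100):
        dummy_train_step(layers, masks, grown)
    masks[grown] = magnitude_mask(layers[grown], sparsity)   # prune again
    layers[grown] *= masks[grown]

print([round(1 - float(m.mean()), 2) for m in masks])        # ~0.8 sparsity each
```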
This list is automatically generated from the titles and abstracts of the papers in this site.