SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading
Acceleration
- URL: http://arxiv.org/abs/2310.06218v1
- Date: Tue, 10 Oct 2023 00:22:27 GMT
- Title: SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading
Acceleration
- Authors: Jingyang Xiang and Siqi Li and Jun Chen and Shipeng Bai and Yukai Ma
and Guang Dai and Yong Liu
- Abstract summary: The study of sparsity in Convolutional Neural Networks (CNNs) has become widespread to compress and accelerate models in environments with limited resources.
Recent work requires selecting and fine-tuning 1$\times$N sparse weights based on dense pre-trained weights.
This paper proposes a novel \emph{\textbf{S}oft \textbf{U}niform \textbf{B}lock \textbf{P}runing} (SUBP) approach to train a uniform 1$\times$N sparse structured network from scratch.
- Score: 16.846777341261436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of sparsity in Convolutional Neural Networks (CNNs) has become
widespread to compress and accelerate models in environments with limited
resources. By constraining N consecutive weights along the output channel to be
group-wise non-zero, the recent network with 1$\times$N sparsity has received
tremendous popularity for its three outstanding advantages: 1) A large amount
of storage space saving by a \emph{Block Sparse Row} matrix. 2) Excellent
performance at a high sparsity. 3) Significant speedups on CPUs with Advanced
Vector Extensions. Recent work requires selecting and fine-tuning 1$\times$N
sparse weights based on dense pre-trained weights, leading to problems such as
expensive training cost and memory access, sub-optimal model quality, and
unbalanced workload across threads (different sparsity across output channels).
To overcome these issues, this paper proposes a novel \emph{\textbf{S}oft
\textbf{U}niform \textbf{B}lock \textbf{P}runing} (SUBP) approach to train a
uniform 1$\times$N sparse structured network from scratch. Specifically, our
approach repeatedly allows pruned blocks to regrow into the network, based on
block angular redundancy and importance sampling, in a uniform manner
throughout the training process. This not only makes the model less dependent
on pre-training and reduces both model redundancy and the risk of permanently
pruning important blocks, but also achieves a balanced workload. Empirically,
on ImageNet, comprehensive experiments across various CNN architectures show
that our SUBP consistently outperforms existing 1$\times$N and structured
sparsity methods based on pre-trained models or training from scratch. Source
codes and models are available at \url{https://github.com/JingyangXiang/SUBP}.
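For intuition, here is a minimal NumPy sketch of one uniform 1$\times$N masking step with soft regrowth, in the spirit of the abstract above. It is not the released implementation: the function name subp_block_mask, the l1 block score, and the fixed regrow fraction are illustrative assumptions standing in for the paper's block angular redundancy and importance-sampling criteria. A 1$\times$N block here is N consecutive output-channel weights at one (input channel, spatial) position, and every group of N output channels keeps the same number of blocks, which is what balances the per-thread workload.

```python
import numpy as np

def subp_block_mask(weight, N=4, sparsity=0.75, regrow_frac=0.1, seed=0):
    """Illustrative uniform 1xN block mask with soft regrowth (not the paper's code).

    weight: conv kernel (C_out, C_in, Kh, Kw), with C_out divisible by N.
    A block is N consecutive output-channel weights sharing one
    (input channel, spatial) position; each group of N output channels keeps
    the same number of blocks, so multithreaded inference stays balanced.
    """
    rng = np.random.default_rng(seed)
    c_out = weight.shape[0]
    w2d = weight.reshape(c_out, -1)                    # (C_out, C_in*Kh*Kw)
    cols = w2d.shape[1]
    blocks = w2d.reshape(c_out // N, N, cols)          # one 1xN block per (group, column)
    scores = np.abs(blocks).sum(axis=1)                # l1 importance per block (assumption)
    keep = int(round(cols * (1.0 - sparsity)))         # identical for every group -> uniform

    mask = np.zeros_like(scores, dtype=bool)
    for g in range(scores.shape[0]):
        mask[g, np.argsort(scores[g])[-keep:]] = True  # most important blocks survive

        # Soft step: regrow a few pruned blocks, sampled in proportion to their
        # importance, so no block is discarded permanently early in training.
        pruned = np.flatnonzero(~mask[g])
        n_regrow = min(len(pruned), int(round(regrow_frac * keep)))
        if n_regrow:
            p = scores[g, pruned] + 1e-12
            mask[g, rng.choice(pruned, n_regrow, replace=False, p=p / p.sum())] = True

    # Broadcast the per-block decision back to the full weight shape.
    return np.repeat(mask[:, None, :], N, axis=1).reshape(weight.shape)

# Example: a 3x3 convolution with 64 output and 32 input channels.
w = np.random.randn(64, 32, 3, 3)
m = subp_block_mask(w, N=4, sparsity=0.75)
print(m.mean())  # fraction of weights kept, slightly above 0.25 because of regrowth
```

Because every group ends up with an identical number of surviving blocks, the nonzero blocks map directly onto the Block Sparse Row layout mentioned in the abstract, with equal work per thread.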
Related papers
- Fully $1\times1$ Convolutional Network for Lightweight Image
Super-Resolution [79.04007257606862]
Deep models have made significant progress on single image super-resolution (SISR) tasks, in particular large models with large kernels ($3\times3$ or more).
$1\times1$ convolutions bring substantial computational efficiency, but struggle with aggregating local spatial representations.
We propose a simple yet effective fully $1\times1$ convolutional network, named Shift-Conv-based Network (SCNet).
arXiv Detail & Related papers (2023-07-30T06:24:03Z) - Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints [59.39280540478479]
We propose sparse upcycling -- a simple way to reuse sunk training costs by initializing a sparsely activated Mixture-of-Experts model from a dense checkpoint.
We show that sparsely upcycled T5 Base, Large, and XL language models and Vision Transformer Base and Large models, respectively, significantly outperform their dense counterparts on SuperGLUE and ImageNet.
arXiv Detail & Related papers (2022-12-09T18:57:37Z) - Sparse Random Networks for Communication-Efficient Federated Learning [23.614934319624826]
One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round.
We propose a radically different approach that does not update the weights at all.
Instead, our method freezes the weights at their initial \emph{random} values and learns how to sparsify the random network for the best performance. A minimal sketch of this frozen-weights-plus-learned-mask idea appears after this list.
arXiv Detail & Related papers (2022-09-30T09:11:09Z) - Not All Models Are Equal: Predicting Model Transferability in a
Self-challenging Fisher Space [51.62131362670815]
This paper addresses the problem of ranking the pre-trained deep neural networks and screening the most transferable ones for downstream tasks.
It proposes a new transferability metric called \textbf{S}elf-challenging \textbf{F}isher \textbf{D}iscriminant \textbf{A}nalysis (\textbf{SFDA}).
arXiv Detail & Related papers (2022-07-07T01:33:25Z) - Superposing Many Tickets into One: A Performance Booster for Sparse
Neural Network Training [32.30355584300427]
We present a novel sparse training approach, termed \textbf{Sup-tickets}, which can satisfy two desiderata concurrently in a single sparse-to-sparse training process.
Across various modern architectures on CIFAR-10/100 and ImageNet, we show that Sup-tickets integrates seamlessly with the existing sparse training methods.
arXiv Detail & Related papers (2022-05-30T16:01:32Z) - FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training
with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z) - 1$\times$N Block Pattern for Network Sparsity [90.43191747596491]
We propose a novel concept of $1\times N$ block sparsity pattern (block pruning) to break this limitation.
Our pattern obtains about 3.0% improvements over filter pruning in the top-1 accuracy of MobileNet-V2.
It also obtains 56.04ms inference savings on Cortex-A7 CPU over weight pruning.
arXiv Detail & Related papers (2021-05-31T05:50:33Z) - Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments.
In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch. A minimal N:M masking sketch appears after this list.
arXiv Detail & Related papers (2021-02-08T05:55:47Z) - Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network
Training [0.5219568203653523]
We develop a sparse DNN training accelerator that produces pruned models with the same accuracy as dense models, without first training, then pruning, and finally retraining a dense model.
Compared to training the equivalent unpruned models using a state-of-the-art DNN accelerator without sparse training support, Procrustes consumes up to 3.26$\times$ less energy and offers up to 4$\times$ speedup across a range of models, while pruning weights by an order of magnitude and maintaining unpruned accuracy.
arXiv Detail & Related papers (2020-09-23T07:39:55Z)
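For contrast with the 1$\times$N pattern above, the following NumPy sketch builds the kind of mask described in the "Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch" entry: within every group of M consecutive weights along the flattened input dimension, only the N largest-magnitude weights are kept (2:4 in the example). The magnitude criterion and the layer shape are illustrative assumptions, not code from that paper.

```python
import numpy as np

def n_m_mask(weight, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m consecutive
    weights along the flattened input dimension (the N:M pattern, e.g. 2:4).
    Assumes the flattened row length is divisible by m."""
    w2d = weight.reshape(weight.shape[0], -1)                 # (C_out, C_in*Kh*Kw)
    rows, cols = w2d.shape
    groups = np.abs(w2d).reshape(rows, cols // m, m)          # magnitudes per m-wide group
    order = np.argsort(groups, axis=-1)                       # ascending rank inside each group
    mask = np.zeros(groups.shape, dtype=bool)
    np.put_along_axis(mask, order[..., -n:], True, axis=-1)   # n winners per group
    return mask.reshape(weight.shape)

w = np.random.randn(64, 32, 3, 3)
print(n_m_mask(w).mean())  # exactly n/m = 0.5 of the weights are kept
```

Unlike the 1$\times$N block mask, every group of M weights here is pruned independently of the output channels, which is the fine-grained pattern targeted by sparse tensor-core hardware.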
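Finally, for the "Sparse Random Networks for Communication-Efficient Federated Learning" entry, here is a minimal sketch of the frozen-weights-plus-learned-mask idea it describes. The dense-layer shapes, the score-threshold rule, and the shared seed are assumptions for illustration; in the described method the weights are never updated, only the mask is learned, so the communication saving comes from exchanging a small binary mask instead of dense weight updates.

```python
import numpy as np

def masked_forward(x, weight, scores, sparsity=0.5):
    """Forward pass with frozen random weights and a learned binary mask.

    `weight` is never updated; only `scores` would be trained (typically with a
    straight-through estimator). With a shared seed, every client can
    regenerate `weight` locally and exchange only the binary mask.
    """
    k = int(round(scores.size * (1.0 - sparsity)))   # how many weights stay active
    thresh = np.partition(scores.ravel(), -k)[-k]    # k-th largest score
    mask = (scores >= thresh).astype(weight.dtype)
    return x @ (weight * mask).T                     # dense layer: (batch, in) -> (batch, out)

rng = np.random.default_rng(0)                       # shared seed: same random weights everywhere
weight = rng.standard_normal((16, 32))               # frozen random (out, in) matrix
scores = rng.standard_normal((16, 32))               # trainable mask scores
x = rng.standard_normal((4, 32))
print(masked_forward(x, weight, scores).shape)       # (4, 16)
```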