Structured Bayesian Compression for Deep Neural Networks Based on The
Turbo-VBI Approach
- URL: http://arxiv.org/abs/2302.10483v1
- Date: Tue, 21 Feb 2023 07:12:36 GMT
- Title: Structured Bayesian Compression for Deep Neural Networks Based on The
Turbo-VBI Approach
- Authors: Chengyu Xia, Danny H.K. Tsang, Vincent K.N. Lau
- Abstract summary: In most existing pruning methods, surviving neurons are randomly connected in the neural network without any structure.
We propose a three-layer hierarchical prior that promotes a more regular sparse structure during pruning.
We derive an efficient Turbo-variational Bayesian inference (Turbo-VBI) algorithm to solve the resulting model compression problem.
- Score: 23.729955669774977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growth of neural network size, model compression has attracted
increasing interest in recent research. As one of the most common techniques,
pruning has been studied for a long time. By exploiting the structured sparsity
of the neural network, existing methods can prune neurons instead of individual
weights. However, in most existing pruning methods, surviving neurons are
randomly connected in the neural network without any structure, and the
non-zero weights within each neuron are also randomly distributed. Such an
irregular sparse structure can cause high control overhead and irregular
memory access on hardware, and can even increase the computational complexity
of the neural network. In this paper, we propose a three-layer hierarchical
prior that promotes a more regular sparse structure during pruning. The proposed
three-layer hierarchical prior can achieve per-neuron weight-level structured
sparsity and neuron-level structured sparsity. We derive an efficient
Turbo-variational Bayesian inference (Turbo-VBI) algorithm to solve the
resulting model compression problem with the proposed prior. The proposed
Turbo-VBI algorithm has low complexity and can support more general priors than
existing model compression algorithms. Simulation results show that our
proposed algorithm can promote a more regular structure in the pruned neural
networks while achieving even better performance in terms of compression rate
and inference accuracy compared with the baselines.
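The paper itself contains no code; the snippet below is only a minimal NumPy sketch (all function names, ratios and block sizes are illustrative assumptions) contrasting the irregular sparsity left by plain magnitude pruning with the two-level regularity, neuron-level plus per-neuron block-level, that the proposed three-layer prior is designed to promote. It is not the Turbo-VBI algorithm.

```python
import numpy as np

def magnitude_prune(W, keep_ratio=0.1):
    """Unstructured baseline: keep the largest-magnitude weights anywhere in W."""
    k = max(1, int(keep_ratio * W.size))
    thresh = np.partition(np.abs(W).ravel(), -k)[-k]
    return W * (np.abs(W) >= thresh)

def structured_prune(W, neuron_keep_ratio=0.5, block=4, block_keep_ratio=0.25):
    """Toy two-level structured pruning.
    Level 1 (neuron level): drop whole rows (output neurons) with small L2 norm.
    Level 2 (weight level): inside each surviving neuron, keep only whole blocks
    of `block` consecutive weights, so the non-zeros form a regular pattern."""
    row_norms = np.linalg.norm(W, axis=1)
    kept_rows = np.argsort(row_norms)[-max(1, int(neuron_keep_ratio * W.shape[0])):]
    mask = np.zeros_like(W, dtype=bool)
    n_blocks = W.shape[1] // block
    for r in kept_rows:
        blocks = np.abs(W[r, :n_blocks * block]).reshape(n_blocks, block)
        keep = np.argsort(blocks.sum(axis=1))[-max(1, int(block_keep_ratio * n_blocks)):]
        for b in keep:
            mask[r, b * block:(b + 1) * block] = True
    return W * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
print("irregular non-zeros:", np.count_nonzero(magnitude_prune(W)))
print("block-regular non-zeros:", np.count_nonzero(structured_prune(W)))
```

In the structured variant the surviving weights line up into whole rows and contiguous blocks, which is the kind of regularity that reduces control overhead and irregular memory access on hardware.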
Related papers
- Spike-and-slab shrinkage priors for structurally sparse Bayesian neural networks [0.16385815610837165]
Sparse deep learning addresses the challenges of over-parameterized networks by recovering a sparse representation of the underlying target function.
Deep neural architectures compressed via structured sparsity provide low latency inference, higher data throughput, and reduced energy consumption.
We propose structurally sparse Bayesian neural networks which prune excessive nodes with (i) Spike-and-Slab Group Lasso (SS-GL), and (ii) Spike-and-Slab Group Horseshoe (SS-GHS) priors.
arXiv Detail & Related papers (2023-08-17T17:14:18Z)
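The SS-GL/SS-GHS entry above only names its priors; as an assumption-labeled illustration of the general idea, the NumPy toy below samples weights from a generic spike-and-slab group prior (a Bernoulli inclusion variable per node, with half-Cauchy group scales as a rough nod to the horseshoe). It is a generative sketch only, not that paper's construction or its variational training.

```python
import numpy as np

def sample_spike_and_slab_groups(n_groups, group_size, pi=0.3, slab_scale=1.0, rng=None):
    """Each group (e.g. all weights into one node) is either entirely zero (spike)
    or jointly Gaussian with its own heavy-tailed scale (slab), so sparsity acts
    on whole nodes rather than on scattered individual weights."""
    rng = np.random.default_rng() if rng is None else rng
    active = rng.random(n_groups) < pi                    # Bernoulli(pi) inclusion per group
    group_scale = np.abs(rng.standard_cauchy(n_groups))   # half-Cauchy group scales
    W = rng.normal(scale=slab_scale, size=(n_groups, group_size)) * group_scale[:, None]
    return W * active[:, None], active

W, active = sample_spike_and_slab_groups(n_groups=6, group_size=4, rng=np.random.default_rng(1))
print(int(active.sum()), "of 6 node-groups survive; zeroed rows are pruned nodes")
```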
- Can Unstructured Pruning Reduce the Depth in Deep Neural Networks? [5.869633234882029]
Pruning is a widely used technique for reducing the size of deep neural networks while maintaining their performance.
In this study, we introduce EGP, an innovative Entropy Guided Pruning algorithm aimed at reducing network size while preserving performance.
arXiv Detail & Related papers (2023-08-12T17:27:49Z)
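The EGP summary above names an entropy-guided criterion without specifying it; purely as a stand-in, the sketch below scores hidden units by a histogram estimate of their activation entropy and masks the outgoing weights of the lowest-entropy units. The criterion, binning and ratios are assumptions, not the actual EGP algorithm.

```python
import numpy as np

def activation_entropy(acts, bins=16):
    """Histogram-based entropy of each unit's activations; acts has shape (samples, units)."""
    ent = np.empty(acts.shape[1])
    for j in range(acts.shape[1]):
        counts, _ = np.histogram(acts[:, j], bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]
        ent[j] = -(p * np.log(p)).sum()
    return ent

def prune_low_entropy_units(W_next, acts, keep_ratio=0.75):
    """Zero the columns of W_next (weights leaving this layer) that belong to the
    units whose activations carry the least entropy."""
    ent = activation_entropy(acts)
    keep = np.argsort(ent)[-max(1, int(keep_ratio * ent.size)):]
    mask = np.zeros(ent.size, dtype=bool)
    mask[keep] = True
    return W_next * mask[None, :]

rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 32))      # hypothetical hidden-layer activations
W_next = rng.normal(size=(10, 32))     # hypothetical weights into the next layer
print(np.count_nonzero(prune_low_entropy_units(W_next, acts)), "outgoing weights kept")
```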
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- SA-CNN: Application to text categorization issues using simulated annealing-based convolutional neural network optimization [0.0]
Convolutional neural networks (CNNs) are a representative class of deep learning algorithms.
We introduce SA-CNN networks for text classification tasks, built on the Text-CNN architecture and tuned with simulated annealing.
arXiv Detail & Related papers (2023-03-13T14:27:34Z)
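The SA-CNN entry above says only that simulated annealing is used to tune a Text-CNN; the generic annealing loop below, run over a toy two-parameter space with a made-up score function, illustrates the search strategy itself rather than SA-CNN's actual objective or search space.

```python
import numpy as np

def simulated_annealing(score_fn, init_cfg, propose_fn, steps=200, t0=1.0, alpha=0.98, rng=None):
    """Generic simulated annealing over a discrete configuration space: worse
    configurations are accepted with probability exp(delta / T), so the search
    can escape local optima while the temperature T cools geometrically."""
    rng = np.random.default_rng() if rng is None else rng
    cfg, score, t = init_cfg, score_fn(init_cfg), t0
    best_cfg, best_score = cfg, score
    for _ in range(steps):
        cand = propose_fn(cfg, rng)
        delta = score_fn(cand) - score
        if delta > 0 or rng.random() < np.exp(delta / t):
            cfg, score = cand, score + delta
            if score > best_score:
                best_cfg, best_score = cfg, score
        t *= alpha
    return best_cfg, best_score

# Hypothetical stand-in: a real SA-CNN run would train and validate a Text-CNN
# for each candidate (num_filters, kernel_size) instead of this toy score.
filters, kernels = [32, 64, 128, 256], [2, 3, 4, 5]
toy_score = lambda cfg: -((cfg[0] - 128) ** 2 / 1e4 + (cfg[1] - 4) ** 2)
propose = lambda cfg, rng: (filters[rng.integers(len(filters))], kernels[rng.integers(len(kernels))])
print(simulated_annealing(toy_score, (32, 2), propose))
```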
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Optimal Learning Rates of Deep Convolutional Neural Networks: Additive Ridge Functions [19.762318115851617]
We consider the mean squared error analysis for deep convolutional neural networks.
We show that, for additive ridge functions, convolutional neural networks followed by one fully connected layer with ReLU activation functions can reach optimal minimax rates.
arXiv Detail & Related papers (2022-02-24T14:22:32Z)
- Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details [0.5156484100374059]
Sparse deep neural networks have proven to be efficient for predictive model building in large-scale studies.
We propose a Bayesian sparse solution using spike-and-slab Gaussian priors to allow for node selection during training.
We establish the fundamental result of variational posterior consistency together with the characterization of prior parameters.
arXiv Detail & Related papers (2021-08-25T00:48:07Z)
- Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
- Trilevel Neural Architecture Search for Efficient Single Image Super-Resolution [127.92235484598811]
This paper proposes a trilevel neural architecture search (NAS) method for efficient single image super-resolution (SR).
To model the discrete search space, we apply a new continuous relaxation that builds a hierarchical mixture over network paths, cell operations, and kernel widths.
An efficient search algorithm is proposed to perform optimization in a hierarchical supernet manner.
arXiv Detail & Related papers (2021-01-17T12:19:49Z)
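The trilevel NAS entry above mentions relaxing discrete choices into a continuous space; the NumPy toy below shows the generic idea, in the spirit of differentiable NAS, of replacing a hard pick with a softmax-weighted mixture and stacking that relaxation at two levels (kernel width inside a cell, then cell versus skip on a path). The paper's actual trilevel hierarchy and supernet are not reproduced here.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def relaxed_choice(candidates, logits):
    """Continuous relaxation of a discrete choice: instead of selecting one
    candidate output, return the softmax-weighted mixture of all of them."""
    weights = softmax(logits)
    return sum(w * c for w, c in zip(weights, candidates))

x = np.ones(8)
# Stand-ins for the outputs of convolutions with different kernel widths.
kernel_outputs = [x * k for k in (3, 5, 7)]
cell_out = relaxed_choice(kernel_outputs, np.array([0.2, 1.0, -0.5]))
# One level up: mix the cell output with a zero (skip) branch along the path.
path_out = relaxed_choice([cell_out, np.zeros(8)], np.array([0.7, 0.1]))
print(path_out[:3])
```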
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power, event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.