Exploring Neural Network Pruning with Screening Methods
- URL: http://arxiv.org/abs/2502.07189v1
- Date: Tue, 11 Feb 2025 02:31:04 GMT
- Title: Exploring Neural Network Pruning with Screening Methods
- Authors: Mingyuan Wang, Yangzi Guo, Sida Liu, Yanwen Xiao,
- Abstract summary: Modern deep learning models have tens of millions of parameters which makes the inference processes resource-intensive.
This paper proposes and evaluates a network pruning framework that eliminates non-essential parameters.
The proposed framework produces competitive lean networks compared to the original networks.
- Score: 3.443622476405787
- License:
- Abstract: Deep neural networks (DNNs) such as convolutional neural networks (CNNs) for visual tasks, recurrent neural networks (RNNs) for sequence data, and transformer models for rich linguistic or multimodal tasks, achieved unprecedented performance on a wide range of tasks. The impressive performance of modern DNNs is partially attributed to their sheer scale. The latest deep learning models have tens to hundreds of millions of parameters which makes the inference processes resource-intensive. The high computational complexity of these networks prevents their deployment on resource-limited devices such as mobile platforms, IoT devices, and edge computing systems because these devices require energy-efficient and real-time processing capabilities. This paper proposes and evaluates a network pruning framework that eliminates non-essential parameters based on a statistical analysis of network component significance across classification categories. The proposed method uses screening methods coupled with a weighted scheme to assess connection and channel contributions for unstructured and structured pruning which allows for the elimination of unnecessary network elements without significantly degrading model performance. Extensive experimental validation on real-world vision datasets for both fully connected neural networks (FNNs) and CNNs has shown that the proposed framework produces competitive lean networks compared to the original networks. Moreover, the proposed framework outperforms state-of-art network pruning methods in two out of three cases.
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
Neuromorphic computing uses spiking neural networks (SNNs) to perform inference tasks.
embedding a small payload within each spike exchanged between spiking neurons can enhance inference accuracy without increasing energy consumption.
split computing - where an SNN is partitioned across two devices - is a promising solution.
This paper presents the first comprehensive study of a neuromorphic wireless split computing architecture that employs multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - EvSegSNN: Neuromorphic Semantic Segmentation for Event Data [0.6138671548064356]
EvSegSNN is a biologically plausible encoder-decoder U-shaped architecture relying on Parametric Leaky Integrate and Fire neurons.
We introduce an end-to-end biologically inspired semantic segmentation approach by combining Spiking Neural Networks with event cameras.
Experiments conducted on DDD17 demonstrate that EvSegSNN outperforms the closest state-of-the-art model in terms of MIoU.
arXiv Detail & Related papers (2024-06-20T10:36:24Z) - A Faster Approach to Spiking Deep Convolutional Neural Networks [0.0]
Spiking neural networks (SNNs) have closer dynamics to the brain than current deep neural networks.
We propose a network structure based on previous work to improve network runtime and accuracy.
arXiv Detail & Related papers (2022-10-31T16:13:15Z) - Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for
Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z) - CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded
Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z) - Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z) - Encoding the latent posterior of Bayesian Neural Networks for
uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Exploring the Connection Between Binary and Spiking Neural Networks [1.329054857829016]
We bridge the recent algorithmic progress in training Binary Neural Networks and Spiking Neural Networks.
We show that training Spiking Neural Networks in the extreme quantization regime results in near full precision accuracies on large-scale datasets.
arXiv Detail & Related papers (2020-02-24T03:46:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.