Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep
Neural Networks
- URL: http://arxiv.org/abs/2308.10438v2
- Date: Thu, 24 Aug 2023 07:43:18 GMT
- Title: Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep
Neural Networks
- Authors: Kaixin Xu, Zhe Wang, Xue Geng, Jie Lin, Min Wu, Xiaoli Li, Weisi Lin
- Abstract summary: We propose a novel layer-adaptive weight-pruning approach for Deep Neural Networks (DNNs).
Our approach takes into account the collective influence of all layers to design a layer-adaptive pruning scheme.
Our experiments demonstrate the superiority of our approach over existing methods on the ImageNet and CIFAR-10 datasets.
- Score: 48.089501687522954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a novel layer-adaptive weight-pruning approach for
Deep Neural Networks (DNNs) that minimizes output distortion while adhering to
a target pruning-ratio constraint. Our approach takes into account the collective
influence of all layers to design a layer-adaptive pruning scheme. We identify
and exploit a key additivity property of the output distortion caused by pruning
weights across multiple layers. This property enables us to formulate pruning
as a combinatorial optimization problem and efficiently solve it through
dynamic programming. By decomposing the problem into sub-problems, we achieve
linear time complexity, making our optimization algorithm fast and feasible to
run on CPUs. Our extensive experiments demonstrate the superiority of our
approach over existing methods on the ImageNet and CIFAR-10 datasets. On
CIFAR-10, our method achieves remarkable improvements, outperforming others by
up to 1.0% for ResNet-32, 0.5% for VGG-16, and 0.7% for DenseNet-121 in terms
of top-1 accuracy. On ImageNet, we achieve up to 4.7% and 4.6% higher top-1
accuracy compared to other methods for VGG-16 and ResNet-50, respectively.
These results highlight the effectiveness and practicality of our approach for
enhancing DNN performance through layer-adaptive weight pruning. Code will be
available at https://github.com/Akimoto-Cris/RD_VIT_PRUNE.
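The abstract names the algorithmic ingredients (additive per-layer distortions, a combinatorial formulation, and a dynamic program over sub-problems) without showing them. Below is a minimal, hypothetical sketch of a knapsack-style dynamic program of that kind; the names (`Candidate`, `plan_pruning`) and the toy candidate table are illustrative assumptions, and the per-layer distortion values are assumed to be pre-measured, so this is not the authors' implementation.

```python
# Hypothetical sketch of a layer-adaptive pruning planner that relies on the
# additivity assumption: total output distortion ~= sum of per-layer distortions.
# Each layer offers a few pruning levels, each with a pre-measured distortion.
# A knapsack-style dynamic program picks one level per layer so that the total
# number of pruned weights meets a global budget at minimal total distortion.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Candidate:
    pruned: int        # weights removed if this level is chosen for the layer
    distortion: float  # output distortion this choice induces (assumed additive)

def plan_pruning(layers: List[List[Candidate]],
                 budget: int) -> Optional[Tuple[float, List[int]]]:
    """Return (total distortion, chosen level per layer) pruning >= `budget` weights."""
    INF = float("inf")
    # best[b]: minimal distortion so far with b pruned weights (b capped at budget,
    # since exceeding the budget never needs to be tracked more precisely).
    best = [0.0] + [INF] * budget
    choice = [[-1] * (budget + 1) for _ in layers]   # chosen candidate index per state
    parent = [[-1] * (budget + 1) for _ in layers]   # predecessor budget state

    for li, cands in enumerate(layers):
        nxt = [INF] * (budget + 1)
        for b, d0 in enumerate(best):
            if d0 == INF:
                continue
            for ci, c in enumerate(cands):
                nb = min(budget, b + c.pruned)
                d = d0 + c.distortion
                if d < nxt[nb]:
                    nxt[nb], choice[li][nb], parent[li][nb] = d, ci, b
        best = nxt

    if best[budget] == INF:
        return None  # budget not reachable with the given candidates
    # Backtrack the chosen pruning level for each layer.
    picks, b = [0] * len(layers), budget
    for li in range(len(layers) - 1, -1, -1):
        picks[li] = choice[li][b]
        b = parent[li][b]
    return best[budget], picks

# Toy usage: two layers, prune at least 60 weights in total.
layer_options = [
    [Candidate(0, 0.0), Candidate(30, 0.2), Candidate(50, 0.9)],
    [Candidate(0, 0.0), Candidate(40, 0.1), Candidate(60, 0.5)],
]
print(plan_pruning(layer_options, budget=60))  # -> (0.3, [1, 1]): 30 + 40 >= 60
```

The runtime of this sketch grows linearly with the number of layers (times the budget discretization and the candidates per layer), which is the spirit of the linear-time claim in the abstract.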
Related papers
- Joint Pruning and Channel-wise Mixed-Precision Quantization for Efficient Deep Neural Networks [10.229120811024162]
Deep neural networks (DNNs) pose significant challenges for deployment on edge devices.
Common approaches to address this issue are pruning and mixed-precision quantization.
We propose a novel methodology to apply them jointly via a lightweight gradient-based search.
arXiv Detail & Related papers (2024-07-01T08:07:02Z) - Pruning Very Deep Neural Network Channels for Efficient Inference [6.497816402045099]
Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer.
VGG-16 achieves state-of-the-art results with a 5x speed-up and only a 0.3% increase in error.
Our method can also accelerate modern networks such as ResNet and Xception, with only 1.4% and 1.0% accuracy loss, respectively, under a 2x speed-up.
arXiv Detail & Related papers (2022-11-14T06:48:33Z) - Pruning-as-Search: Efficient Neural Architecture Search via Channel
Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method that automatically and efficiently searches for the desired sub-network.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
The deep equilibrium model is a class of models that forgoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training, and gradient-based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time
Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z) - Toward Compact Deep Neural Networks via Energy-Aware Pruning [2.578242050187029]
We propose a novel energy-aware pruning method that quantifies the importance of each filter in the network using the nuclear norm (NN).
We achieve competitive results with ResNet-56/110 on CIFAR-10: 40.4%/49.8% FLOPs and 45.9%/52.9% parameter reduction, at 94.13%/94.61% top-1 accuracy.
arXiv Detail & Related papers (2021-03-19T15:33:16Z) - Non-Parametric Adaptive Network Pruning [125.4414216272874]
We introduce non-parametric modeling to simplify the algorithm design.
Inspired by the face recognition community, we use a message passing algorithm to obtain an adaptive number of exemplars.
EPruner breaks the dependency on the training data in determining the "important" filters.
arXiv Detail & Related papers (2021-01-20T06:18:38Z) - Holistic Filter Pruning for Efficient Deep Neural Networks [25.328005340524825]
"Holistic Filter Pruning" (HFP) is a novel approach for common DNN training that is easy to implement and enables to specify accurate pruning rates.
In various experiments, we give insights into the training and achieve state-of-the-art performance on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-09-17T09:23:36Z) - EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning [82.54669314604097]
EagleEye is a simple yet efficient evaluation component based on adaptive batch normalization.
It unveils a strong correlation between different pruned structures and their final settled accuracy.
This module is also general enough to be plugged into and improve some existing pruning algorithms; a minimal sketch of the batch-norm recalibration idea follows this list.
arXiv Detail & Related papers (2020-07-06T01:32:31Z)
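EagleEye's adaptive-batch-normalization evaluation is only described at a high level above. Below is a minimal sketch, assuming a PyTorch model and a small calibration data loader, of the general batch-norm-recalibration idea: re-estimating BN running statistics on a few batches before scoring a pruned sub-net. The function name `recalibrate_bn` and the batch count are illustrative assumptions, not EagleEye's actual code.

```python
# Sketch (not EagleEye's implementation) of BN-statistics recalibration:
# after pruning, the stored BatchNorm running mean/var no longer match the
# sub-net's activations, so we re-estimate them on a few calibration batches
# before measuring accuracy.
import torch
import torch.nn as nn

@torch.no_grad()
def recalibrate_bn(model: nn.Module, loader, num_batches: int = 50, device: str = "cpu"):
    # Reset running statistics of every BatchNorm layer and switch to a
    # cumulative moving average over the calibration batches (momentum=None).
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None
    model.train()  # BN updates its running stats only in train mode
    for i, (images, _) in enumerate(loader):
        if i >= num_batches:
            break
        model(images.to(device))  # assumes the model already lives on `device`
    model.eval()
    return model
```

In an evaluation loop of this kind, each candidate sub-net would be recalibrated this way and then scored on a validation split; the best-ranked candidate is the one passed on to fine-tuning.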
This list is automatically generated from the titles and abstracts of the papers on this site.