A Unified DNN Weight Compression Framework Using Reweighted Optimization
Methods
- URL: http://arxiv.org/abs/2004.05531v1
- Date: Sun, 12 Apr 2020 02:59:06 GMT
- Title: A Unified DNN Weight Compression Framework Using Reweighted Optimization
Methods
- Authors: Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin,
Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad and Yanzhi Wang
- Abstract summary: We propose a unified DNN weight pruning framework with dynamically updated regularization terms bounded by the designated constraint.
We also extend our method to an integrated framework for the combination of different DNN compression tasks.
- Score: 31.869228048294445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To address the large model size and intensive computation requirement of deep
neural networks (DNNs), weight pruning techniques have been proposed and
generally fall into two categories, i.e., static regularization-based pruning
and dynamic regularization-based pruning. However, the former method currently
suffers from either complex workloads or accuracy degradation, while the latter one
takes a long time to tune the parameters to achieve the desired pruning rate
without accuracy loss. In this paper, we propose a unified DNN weight pruning
framework with dynamically updated regularization terms bounded by the
designated constraint, which can generate both non-structured sparsity and
different kinds of structured sparsity. We also extend our method to an
integrated framework for the combination of different DNN compression tasks.
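As a rough illustration of the reweighted idea, the sketch below shows a PyTorch training loop with a group-wise (per-filter) penalty whose coefficients are periodically reset to be inversely proportional to the current group norms, so that already-small groups are pushed harder toward zero, followed by a hard-threshold step that realizes a target structured pruning rate. This is a minimal sketch under those assumptions; the function names, the exact penalty form, the constraint handling, and the update schedule are illustrative and not the paper's exact formulation.

```python
# Minimal sketch of reweighted-regularization pruning (illustrative only; the
# paper's exact penalty, constraint handling, and schedule may differ).
import torch
import torch.nn as nn

def group_norms(model):
    """Per-filter L2 norms for every Conv2d layer (one 'group' = one output filter)."""
    norms = {}
    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):
            # weight: [out_channels, in_channels, kH, kW] -> one norm per output filter
            norms[name] = m.weight.detach().flatten(1).norm(dim=1)
    return norms

def update_coeffs(model, eps=1e-3):
    """Reweighting step: coefficients inversely proportional to current group norms."""
    return {name: 1.0 / (n + eps) for name, n in group_norms(model).items()}

def reweighted_penalty(model, coeffs):
    """Sum of per-group norms, each scaled by its (dynamically updated) coefficient."""
    penalty = 0.0
    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):
            penalty = penalty + (coeffs[name] * m.weight.flatten(1).norm(dim=1)).sum()
    return penalty

def train_with_reweighting(model, loader, epochs=30, lam=1e-4, reweight_every=5):
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    coeffs = update_coeffs(model)
    for epoch in range(epochs):
        if epoch % reweight_every == 0:
            coeffs = update_coeffs(model)   # dynamically updated regularization terms
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y) + lam * reweighted_penalty(model, coeffs)
            loss.backward()
            opt.step()

def hard_prune_filters(model, keep_ratio=0.5):
    """Zero out the smallest filters to meet the designated (structured) pruning rate."""
    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):
            norms = m.weight.detach().flatten(1).norm(dim=1)
            k = max(1, int(keep_ratio * norms.numel()))
            keep = torch.zeros_like(norms, dtype=torch.bool)
            keep[norms.topk(k).indices] = True
            m.weight.data[~keep] = 0.0      # structured (filter) sparsity
```

Non-structured (element-wise) sparsity follows the same pattern with one coefficient per weight instead of per filter, and column or block sparsity with one coefficient per column or block, which is how a single reweighted scheme can cover different kinds of sparsity.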
Related papers
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time
Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z) - Only Train Once: A One-Shot Neural Network Training And Pruning
Framework [31.959625731943675]
Structured pruning is a commonly used technique in deploying deep neural networks (DNNs) onto resource-constrained devices.
We propose Only-Train-Once (OTO), a framework that compresses DNNs into slimmer architectures with competitive performance and significant FLOPs reduction.
OTO contains two keys: (i) we partition the parameters of DNNs into zero-invariant groups, enabling us to prune zero groups without affecting the output; and (ii) to promote zero groups, we formulate a structured-sparsity optimization algorithm, Half-Space Stochastic Projected Gradient (HSPG).
To demonstrate the effectiveness of OTO, we train and compress full models simultaneously from scratch without fine-tuning.
arXiv Detail & Related papers (2021-07-15T17:15:20Z) - Better Training using Weight-Constrained Stochastic Dynamics [0.0]
We employ constraints to control the parameter space of deep neural networks throughout training.
The use of customized, appropriately designed constraints can reduce the vanishing/exploding gradient problem.
We provide a general approach to efficiently incorporate constraints into a gradient Langevin framework.
arXiv Detail & Related papers (2021-06-20T14:41:06Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural
Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially for resource limited devices.
Previous unstructured or structured weight pruning methods can hardly deliver real inference acceleration.
We propose a generalized weight unification framework at a hardware compatible micro-structured level to achieve high amount of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Dynamic Probabilistic Pruning: A general framework for
hardware-constrained pruning at different granularities [80.06422693778141]
We propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps)
We refer to this algorithm as Dynamic Probabilistic Pruning (DPP)
We show that DPP achieves competitive compression rates and classification accuracy when pruning common deep learning models trained on different benchmark datasets for image classification.
arXiv Detail & Related papers (2021-05-26T17:01:52Z) - Improve Generalization and Robustness of Neural Networks via Weight
Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network (a small numerical illustration of this scale-shifting invariance is sketched after this list).
arXiv Detail & Related papers (2020-08-07T02:55:28Z) - SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of
DNNs with Ultra-High Efficiency [42.63352504047665]
We propose a single-shot, automatic structured weight pruning framework to mitigate the limitations of prior structured pruning methods.
Experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that the proposed framework can achieve ultra-high pruning rates while maintaining accuracy.
arXiv Detail & Related papers (2020-01-23T22:45:02Z) - BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted
Regularization Method [69.49386965992464]
We propose a new block-based pruning framework that comprises a general and flexible structured pruning dimension as well as a powerful and efficient reweighted regularization method.
Our framework is universal and can be applied to both CNNs and RNNs, implying complete support for the two major kinds of computation-intensive layers.
It is the first time that the weight pruning framework achieves universal coverage for both CNNs and RNNs with real-time mobile acceleration and no accuracy compromise.
arXiv Detail & Related papers (2020-01-23T03:30:56Z)
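On the claim in "Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations" that weight decay does not pin down the intrinsic norm of networks with homogeneous activations, the short PyTorch check below (illustrative only, not the paper's regularizer) rescales two consecutive linear layers by a and 1/a: the network function is unchanged because ReLU is positively homogeneous, while the standard weight-decay penalty changes with a.

```python
# Numerical check (illustrative): for positively homogeneous activations such as
# ReLU, rescaling consecutive layers by a and 1/a leaves the network function
# unchanged, while the standard weight-decay penalty changes with a.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
x = torch.randn(32, 8)

def l2_penalty(model):
    # sum of squared weights, as in standard weight decay
    return sum((p ** 2).sum() for p in model.parameters())

with torch.no_grad():
    y_ref = net(x)
    print("weight decay before rescaling:", l2_penalty(net).item())

    a = 10.0
    net[0].weight.mul_(a)   # scale first layer up ...
    net[0].bias.mul_(a)
    net[2].weight.div_(a)   # ... and second layer down

    y_new = net(x)
    print("max output difference:", (y_ref - y_new).abs().max().item())  # ~0
    print("weight decay after rescaling:", l2_penalty(net).item())       # different
```

The output difference is zero up to floating-point error while the penalty value changes by orders of magnitude, which is the sense in which weight decay fails to constrain the intrinsic norm of such networks.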