Generalizable Mixed-Precision Quantization via Attribution Rank
Preservation
- URL: http://arxiv.org/abs/2108.02720v1
- Date: Thu, 5 Aug 2021 16:41:57 GMT
- Title: Generalizable Mixed-Precision Quantization via Attribution Rank
Preservation
- Authors: Ziwei Wang, Han Xiao, Jiwen Lu, Jie Zhou
- Abstract summary: We propose a generalizable mixed-precision quantization (GMPQ) method for efficient inference.
Our method obtains a competitive accuracy-complexity trade-off compared with state-of-the-art mixed-precision networks.
- Score: 90.26603048354575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a generalizable mixed-precision quantization (GMPQ)
method for efficient inference. Conventional methods require the same dataset
for bitwidth search and model deployment to guarantee policy optimality,
leading to heavy search cost on challenging large-scale datasets in realistic
applications. In contrast, our GMPQ searches for a mixed-quantization policy
that generalizes to large-scale datasets from only a small amount of data, so
that the search cost is significantly reduced without performance degradation.
Specifically, we observe that correctly locating network attribution is a
general ability for accurate visual analysis across different data
distributions. Therefore, besides pursuing higher model accuracy and lower
complexity, we preserve attribution rank consistency between the quantized
models and their full-precision counterparts via efficient capacity-aware
attribution imitation for generalizable mixed-precision quantization strategy
search. Extensive experiments show that our method obtains a competitive
accuracy-complexity trade-off compared with state-of-the-art mixed-precision
networks at significantly reduced search cost.
The code is available at https://github.com/ZiweiWangTHU/GMPQ.git.
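The core idea, preserving how image regions are ranked by attribution rather than the raw attribution values, can be made concrete with a small sketch. The gradient-times-input saliency and the Spearman correlation below are illustrative assumptions, not the authors' capacity-aware imitation loss:

```python
# Sketch: measuring attribution rank consistency between a full-precision
# model and its quantized counterpart. Gradient * input is a generic
# stand-in for the paper's attribution maps, and Spearman correlation is a
# measurement, not the differentiable loss used during search.
import torch
from scipy.stats import spearmanr

def saliency(model, x, target):
    """Per-pixel attribution via gradient * input (illustrative choice)."""
    x = x.detach().clone().requires_grad_(True)
    score = model(x).gather(1, target.unsqueeze(1)).sum()
    score.backward()
    return (x.grad * x).abs().sum(dim=1)  # collapse channels -> (B, H, W)

def attribution_rank_consistency(fp_model, q_model, x, target):
    """Spearman rank correlation in [-1, 1]; values near 1 mean the
    quantized model ranks image regions like its full-precision parent."""
    a_fp = saliency(fp_model, x, target).detach().flatten().numpy()
    a_q = saliency(q_model, x, target).detach().flatten().numpy()
    rho, _ = spearmanr(a_fp, a_q)
    return rho
```

During bitwidth search, a consistency term of this kind is optimized alongside accuracy and complexity objectives; the exact formulation is in the paper and repository.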
Related papers
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
The second strategy leverages a novel re-ranking technique, which has a lower upper bound on time complexity and reduces the memory complexity from O(n²) to O(kn) with k ≪ n.
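The memory claim is easy to see in code: the sketch below stores only each sample's k nearest neighbors instead of the full n × n distance matrix. It illustrates the O(kn) storage argument only and is not the paper's re-ranking algorithm; `block` is an illustrative chunk size:

```python
# Sketch of the O(n²) -> O(kn) memory idea: never materialize the full
# n x n distance matrix; keep only each row's k nearest neighbors.
import numpy as np

def topk_neighbors(feats: np.ndarray, k: int, block: int = 1024):
    """Return (indices, sq_distances) of the k nearest neighbors per row,
    computed block by block so peak memory stays far below n x n."""
    n = feats.shape[0]
    sq_norms = (feats ** 2).sum(1)
    idx = np.empty((n, k), dtype=np.int64)
    dist = np.empty((n, k), dtype=feats.dtype)
    for s in range(0, n, block):
        q = feats[s:s + block]
        d = sq_norms[s:s + block, None] + sq_norms[None, :] - 2.0 * q @ feats.T
        nn = np.argpartition(d, k, axis=1)[:, :k]  # unsorted k smallest
        idx[s:s + block] = nn
        dist[s:s + block] = np.take_along_axis(d, nn, axis=1)
    return idx, dist  # stored output is O(kn), not O(n²)
```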
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
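Distribution matching can be stated in a few lines. The sketch below pulls the mean feature embedding of a learnable synthetic set toward that of real data under an encoder; the plain mean-embedding loss is a simplifying assumption, and the paper's improved matching adds further components:

```python
# Sketch of a distribution-matching condensation step: match feature
# statistics of the small synthetic set to those of real data under a
# (typically randomly sampled) embedding network. Illustrative only.
import torch

def dm_loss(encoder, real_batch, syn_images):
    """Mean-embedding discrepancy between real and synthetic batches."""
    with torch.no_grad():
        f_real = encoder(real_batch).mean(dim=0)   # real stats: no grad needed
    f_syn = encoder(syn_images).mean(dim=0)        # grads flow to syn_images
    return ((f_real - f_syn) ** 2).sum()

# Usage: syn_images is a learnable tensor updated by an optimizer, e.g.
#   syn_images = torch.randn(ipc * n_class, 3, 32, 32, requires_grad=True)
#   loss = dm_loss(encoder, real_batch, syn_images); loss.backward()
```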
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Learning Accurate Performance Predictors for Ultrafast Automated Model
Compression [86.22294249097203]
We propose an ultrafast automated model compression framework called SeerNet for flexible network deployment.
Our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost.
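The predictor idea is to learn a cheap surrogate that maps a compression policy to estimated accuracy, then search against the surrogate instead of retraining candidates. The MLP below is a generic stand-in with an assumed input encoding (a per-layer bitwidth vector), not SeerNet's architecture:

```python
# Sketch: a performance predictor that scores a compression policy (here a
# per-layer bitwidth vector) without running the compressed network.
import torch
import torch.nn as nn

class AccuracyPredictor(nn.Module):
    def __init__(self, n_layers: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_layers, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, bitwidths: torch.Tensor) -> torch.Tensor:
        # bitwidths: (batch, n_layers), e.g. values in {2, 4, 8}
        return self.net(bitwidths.float()).squeeze(-1)  # predicted accuracy

# Once trained on (policy, measured-accuracy) pairs, policy search becomes
# cheap: thousands of candidates are scored with forward passes alone.
```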
arXiv Detail & Related papers (2023-04-13T10:52:49Z) - SEAM: Searching Transferable Mixed-Precision Quantization Policy through
Large Margin Regularization [50.04951511146338]
Mixed-precision quantization (MPQ) suffers from the time-consuming process of searching the optimal bit-width allocation for each layer.
This paper proposes a novel method for efficiently searching for effective MPQ policies using a small proxy dataset.
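As a rough illustration of the large-margin mechanism named in the title (SEAM's exact regularizer is in the paper), the sketch below penalizes samples whose correct-class logit does not beat the best wrong class by a margin:

```python
# Sketch of a generic large-margin penalty: encourage the true-class logit
# to exceed the strongest competitor by at least `margin`. Not SEAM's
# exact formulation, only an illustration of the mechanism.
import torch

def margin_penalty(logits, target, margin=1.0):
    true = logits.gather(1, target.unsqueeze(1)).squeeze(1)          # (B,)
    masked = logits.scatter(1, target.unsqueeze(1), float('-inf'))   # hide true class
    other = masked.max(dim=1).values                                 # best rival
    return torch.clamp(margin - (true - other), min=0).mean()
```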
arXiv Detail & Related papers (2023-02-14T05:47:45Z) - Pretrained Cost Model for Distributed Constraint Optimization Problems [37.79733538931925]
Distributed Constraint Optimization Problems (DCOPs) are an important subclass of optimization problems.
We propose a novel directed acyclic graph schema representation for DCOPs and leverage the Graph Attention Networks (GATs) to embed graph representations.
Our model, GAT-PCM, is then pretrained with optimally labelled data in an offline manner, so as to boost a broad range of DCOP algorithms.
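A minimal sketch of the pretrained-cost-model idea, assuming PyTorch Geometric: embed a graph-encoded problem instance with GAT layers, pool, and regress a cost. The feature sizes and readout are illustrative assumptions, not the paper's DAG schema:

```python
# Sketch of a GAT-based cost model over a graph-encoded problem instance,
# in the spirit of GAT-PCM. Dimensions and pooling are illustrative.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv, global_mean_pool

class CostModel(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 64, heads: int = 4):
        super().__init__()
        self.gat1 = GATConv(in_dim, hidden, heads=heads)
        self.gat2 = GATConv(hidden * heads, hidden, heads=1)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.gat1(x, edge_index))
        h = torch.relu(self.gat2(h, edge_index))
        return self.head(global_mean_pool(h, batch)).squeeze(-1)  # cost estimate
```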
arXiv Detail & Related papers (2021-12-08T09:24:10Z) - Effective and Fast: A Novel Sequential Single Path Search for
Mixed-Precision Quantization [45.22093693422085]
Mixed-precision quantization assigns different bit-precisions to different layers according to their sensitivity, which can achieve strong performance.
Quickly determining the bit-precision of each layer in a deep neural network under given constraints, however, is a difficult problem.
We propose a novel sequential single path search (SSPS) method for mixed-precision quantization.
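One way to make the sequential idea concrete is a greedy layer-at-a-time sketch: keep the cheapest bitwidth whose validation accuracy drop stays within a budget. Here `evaluate`, `candidate_bits`, and `max_drop` are illustrative assumptions; SSPS's actual single-path search is more sophisticated:

```python
# Sketch of a sequential, layer-by-layer bit-precision search. This only
# illustrates the "sequential" idea, not the SSPS algorithm itself.
def sequential_bit_search(layers, candidate_bits, evaluate, max_drop=0.5):
    """`evaluate(policy)` returns validation accuracy for a bitwidth policy;
    `candidate_bits` is e.g. [2, 4, 6, 8]."""
    policy = {name: max(candidate_bits) for name in layers}  # start safe
    baseline = evaluate(policy)
    for name in layers:                      # decide one layer at a time
        for bits in sorted(candidate_bits):  # try cheapest first
            trial = dict(policy, **{name: bits})
            if baseline - evaluate(trial) <= max_drop:
                policy[name] = bits          # keep cheapest choice within budget
                break
    return policy
```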
arXiv Detail & Related papers (2021-03-04T09:15:08Z) - A Partial Regularization Method for Network Compression [0.0]
We propose partial regularization, which penalizes only a subset of parameters rather than all of them (full regularization), to conduct model compression at higher speed.
Experimental results show that, as expected, computational cost is reduced: running time decreases in almost all situations.
Surprisingly, it also improves metrics such as regression fit and classification accuracy in both the training and test phases on multiple datasets.
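Partial regularization is simple to sketch: apply the penalty to a chosen subset of parameters only. The selection rule below (convolution weights matched by name) is an illustrative assumption, not the paper's criterion:

```python
# Sketch of partial regularization: penalize only a subset of parameters
# (here, those whose name contains "conv") instead of all of them.
import torch

def partial_l2(model, keyword="conv", weight=1e-4):
    """L2 penalty restricted to parameters matching `keyword`."""
    reg = 0.0
    for name, p in model.named_parameters():
        if keyword in name and p.requires_grad:
            reg = reg + p.pow(2).sum()
    return weight * reg

# Usage: total_loss = task_loss + partial_l2(model)
# Fewer penalized terms means less regularization compute per step.
```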
arXiv Detail & Related papers (2020-09-03T00:38:27Z)