SEAM: Searching Transferable Mixed-Precision Quantization Policy through
Large Margin Regularization
- URL: http://arxiv.org/abs/2302.06845v2
- Date: Wed, 23 Aug 2023 03:56:24 GMT
- Title: SEAM: Searching Transferable Mixed-Precision Quantization Policy through
Large Margin Regularization
- Authors: Chen Tang, Kai Ouyang, Zenghao Chai, Yunpeng Bai, Yuan Meng, Zhi Wang,
Wenwu Zhu
- Abstract summary: Mixed-precision quantization (MPQ) suffers from the time-consuming process of searching the optimal bit-width allocation for each layer.
This paper proposes a novel method for efficiently searching for effective MPQ policies using a small proxy dataset.
- Score: 50.04951511146338
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Mixed-precision quantization (MPQ) suffers from the time-consuming process of
searching the optimal bit-width allocation (i.e., the policy) for each layer,
especially when using large-scale datasets such as ILSVRC-2012. This limits the
practicality of MPQ in real-world deployment scenarios. To address this issue,
this paper proposes a novel method for efficiently searching for effective MPQ
policies using a small proxy dataset instead of the large-scale dataset used
for training the model. By deviating from the established norm of employing a
consistent dataset for both the model training and MPQ policy search stages,
our approach yields a substantial enhancement in the efficiency of MPQ
exploration. Nonetheless, using discrepant datasets poses challenges in
searching for a transferable MPQ policy. Driven by the observation that the
quantization noise of a sub-optimal policy exerts a detrimental influence on
the discriminability of feature representations, manifesting as diminished
class margins and ambiguous decision boundaries, our method aims to identify
policies that uphold the discriminative nature of feature representations,
i.e., intra-class compactness and inter-class separation. Because this
property is general and dataset-independent, we can search for the MPQ policy
on a rather small-scale proxy dataset, and the resulting policy can then be
directly used to quantize a model trained on the large-scale dataset. Our
method offers several advantages, including high proxy-data utilization, no
excessive hyper-parameter tuning, and high search efficiency. We search for
high-quality MPQ policies with a proxy dataset that has only 4% of the data
scale of the large-scale target dataset, achieving the same accuracy as
searching directly on the latter and improving MPQ search efficiency by up to
300 times.
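As a concrete illustration of the search criterion the abstract describes, the following is a minimal sketch that ranks candidate bit-width policies by a Fisher-ratio-style discriminability score on the proxy set. This is an illustrative criterion, not SEAM's actual large-margin regularizer, and quantize_model and extract_features are hypothetical helpers standing in for a real quantization backend and feature extractor.

```python
import numpy as np

def discriminability_score(feats: np.ndarray, labels: np.ndarray) -> float:
    """Inter-class separation over intra-class compactness: a simple
    proxy for the class margins the abstract refers to."""
    classes = np.unique(labels)
    centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    global_mean = feats.mean(axis=0)
    # Intra-class scatter: mean squared distance of samples to their centroid.
    intra = np.mean([
        np.sum((feats[labels == c] - centroids[i]) ** 2, axis=1).mean()
        for i, c in enumerate(classes)
    ])
    # Inter-class scatter: mean squared distance of centroids to the global mean.
    inter = np.mean(np.sum((centroids - global_mean) ** 2, axis=1))
    return float(inter / (intra + 1e-12))

def search_policy(model, candidate_policies, proxy_x, proxy_y):
    """Pick the per-layer bit-width policy whose quantized features stay
    most discriminative on the small proxy dataset."""
    best_policy, best_score = None, -np.inf
    for policy in candidate_policies:
        q_model = quantize_model(model, policy)      # hypothetical helper
        feats = extract_features(q_model, proxy_x)   # hypothetical helper
        score = discriminability_score(feats, proxy_y)
        if score > best_score:
            best_policy, best_score = policy, score
    return best_policy
```

Because such a score depends only on feature geometry rather than on which dataset produced it, a policy found on the proxy set can plausibly transfer to the model trained on the full dataset, which is the intuition the abstract relies on.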
Related papers
- Data Selection via Optimal Control for Language Models [134.67665351539725]
This work investigates the selection of high-quality pre-training data from massive corpora to enhance LMs' capabilities for downstream usage.
We introduce PMP-based Data Selection (PDS), a framework that approximates optimal data selection by solving the PMP conditions.
The benefits of PDS extend to 400B models trained on 10T tokens, as evidenced by the extrapolation of the test loss curves according to the Scaling Laws.
arXiv Detail & Related papers (2024-10-09T17:06:57Z)
- Minimally Supervised Learning using Topological Projections in Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data, and a minimal number of available labeled data points are then assigned to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z)
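The BMU-based assignment this summary describes can be sketched as follows using the minisom library; the grid size, iteration count, and grid-nearest label lookup are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from minisom import MiniSom  # pip install minisom

def fit_minimally_supervised_som(x_unlabeled, x_labeled, y_labeled,
                                 grid=(10, 10), iters=5000):
    # Unsupervised stage: train the SOM on unlabeled data only.
    som = MiniSom(grid[0], grid[1], x_unlabeled.shape[1],
                  sigma=1.0, learning_rate=0.5)
    som.train_random(x_unlabeled, iters)
    # Minimal supervision: pin each labeled point's class to its BMU.
    bmu_labels = {}
    for x, y in zip(x_labeled, y_labeled):
        bmu_labels[som.winner(x)] = y
    return som, bmu_labels

def predict(som, bmu_labels, x):
    """Label a sample by the labeled BMU nearest (on the map grid) to its BMU."""
    i, j = som.winner(x)
    nearest = min(bmu_labels,
                  key=lambda u: (u[0] - i) ** 2 + (u[1] - j) ** 2)
    return bmu_labels[nearest]
```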
- MeaeQ: Mount Model Extraction Attacks with Efficient Queries [6.1106195466129485]
We study model extraction attacks in natural language processing (NLP).
We propose MeaeQ, a straightforward yet effective method to address these issues.
MeaeQ achieves higher functional similarity to the victim model than baselines while requiring fewer queries.
arXiv Detail & Related papers (2023-10-21T16:07:16Z)
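A generic version of this low-budget extraction setting is sketched below; the k-means-based query selection and logistic-regression clone are assumed placeholders, not MeaeQ's actual strategy.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def extract_model(victim, pool: np.ndarray, budget: int):
    """Train a clone of `victim` using only `budget` queries."""
    # Pick diverse queries: the pool points closest to k-means centers.
    km = KMeans(n_clusters=budget, n_init=10).fit(pool)
    idx = [int(np.argmin(((pool - c) ** 2).sum(axis=1)))
           for c in km.cluster_centers_]
    queries = pool[idx]
    pseudo = victim.predict(queries)  # victim's outputs used as labels
    return LogisticRegression(max_iter=1000).fit(queries, pseudo)
```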
- An Improved Data Augmentation Scheme for Model Predictive Control Policy Approximation [0.0]
A sensitivity-based data augmentation framework for MPC policy approximation was previously proposed.
The error due to augmenting the training data set with inexact samples was shown to increase with the size of the neighborhood.
This paper presents an improved data augmentation scheme based on predictor-corrector steps that enforces a user-defined level of accuracy.
arXiv Detail & Related papers (2023-03-09T22:16:47Z)
- SDQ: Stochastic Differentiable Quantization with Mixed Precision [46.232003346732064]
We present a novel Stochastic Differentiable Quantization (SDQ) method that can automatically learn the MPQ strategy.
After the optimal MPQ strategy is acquired, we train our network with entropy-aware bin regularization and knowledge distillation.
SDQ outperforms all state-of-the-art mixed- or single-precision quantization methods with a lower bitwidth.
arXiv Detail & Related papers (2022-06-09T12:38:18Z)
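The general mechanism behind differentiable mixed-precision search can be sketched as below; this is an illustrative softmax-mixture formulation with a straight-through estimator, not SDQ's actual stochastic formulation.

```python
import torch
import torch.nn as nn

def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform fake-quantization with a straight-through gradient."""
    scale = x.detach().abs().max() / (2 ** (bits - 1) - 1) + 1e-12
    q = torch.round(x / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return x + (q * scale - x).detach()  # forward: quantized; backward: identity

class MixedPrecisionWeight(nn.Module):
    """Learnable mixture over candidate bit-widths for one layer's weight."""
    def __init__(self, weight: torch.Tensor, bit_choices=(2, 4, 8)):
        super().__init__()
        self.weight = nn.Parameter(weight)
        self.bit_choices = bit_choices
        self.logits = nn.Parameter(torch.zeros(len(bit_choices)))

    def forward(self) -> torch.Tensor:
        probs = torch.softmax(self.logits, dim=0)
        # The mixture makes the bit-width choice differentiable; after
        # training, the argmax bit-width becomes the layer's policy.
        return sum(p * fake_quant(self.weight, b)
                   for p, b in zip(probs, self.bit_choices))
```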
- Generalizable Mixed-Precision Quantization via Attribution Rank Preservation [90.26603048354575]
We propose a generalizable mixed-precision quantization (GMPQ) method for efficient inference.
Our method obtains a competitive accuracy-complexity trade-off compared with state-of-the-art mixed-precision networks.
arXiv Detail & Related papers (2021-08-05T16:41:57Z)
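One way to read "attribution rank preservation" is as a check that quantization keeps the rank order of input attributions; the sketch below is that interpretation only, with the attribution maps assumed to be precomputed.

```python
import numpy as np
from scipy.stats import spearmanr

def attribution_rank_agreement(attr_fp: np.ndarray, attr_q: np.ndarray) -> float:
    """Mean Spearman rank correlation between per-sample attribution maps
    of the full-precision (attr_fp) and quantized (attr_q) models."""
    rhos = [spearmanr(a.ravel(), b.ravel()).correlation
            for a, b in zip(attr_fp, attr_q)]
    return float(np.mean(rhos))
```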
- Noise-Resistant Deep Metric Learning with Probabilistic Instance Filtering [59.286567680389766]
Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks.
We propose the Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML.
PRISM calculates the probability of a label being clean, and filters out potentially noisy samples.
arXiv Detail & Related papers (2021-08-03T12:15:25Z)
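The clean-probability filtering idea can be sketched as follows; the prototype-based probability is an assumed simplification, not PRISM's exact memory-based formulation.

```python
import numpy as np

def clean_probability(feat, label, prototypes):
    """P(label is clean) as the softmax similarity of an L2-normalized
    feature to its own class prototype, against all class prototypes."""
    sims = prototypes @ feat           # cosine similarities, shape (C,)
    exp = np.exp(sims - sims.max())    # numerically stable softmax
    return exp[label] / exp.sum()

def filter_batch(feats, labels, prototypes, threshold=0.5):
    """Drop samples whose label is unlikely to be clean."""
    keep = [i for i, (f, y) in enumerate(zip(feats, labels))
            if clean_probability(f, y, prototypes) >= threshold]
    return feats[keep], labels[keep]
```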
- Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient [62.24615324523435]
This paper provides a statistical analysis of high-dimensional batch Reinforcement Learning (RL) using sparse linear function approximation.
When there is a large number of candidate features, our result sheds light on the fact that sparsity-aware methods can make batch RL more sample efficient.
arXiv Detail & Related papers (2020-11-08T16:48:02Z)
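A toy version of the sparsity-aware step is shown below, assuming Lasso feature selection for a linear value function on batch data; the paper's actual estimator and analysis are more involved.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_value_features(phi: np.ndarray, returns: np.ndarray, alpha=0.1):
    """Select a sparse feature subset for a linear value function from a
    fixed batch of (state features, observed return) pairs."""
    lasso = Lasso(alpha=alpha).fit(phi, returns)
    selected = np.flatnonzero(lasso.coef_)  # indices of retained features
    return selected, lasso
```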
- DeepSampling: Selectivity Estimation with Predicted Error and Response Time [7.23389716633927]
This paper proposes DeepSampling, a deep-learning-based model that predicts the accuracy of a sample-based AQP algorithm.
DeepSampling is the first system that provides a reliable tool for existing spatial databases to control the accuracy of AQP.
arXiv Detail & Related papers (2020-08-16T03:23:01Z)
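The error-prediction idea can be sketched with an assumed interface: regress the observed AQP error on the sampling rate (plus query features), then invert the fitted model to pick the cheapest rate that meets a target accuracy. This is not DeepSampling's actual architecture.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_error_model(sample_rates, query_feats, observed_errors):
    """Learn error ~ f(sampling rate, query features) from past runs."""
    X = np.column_stack([sample_rates, query_feats])
    return MLPRegressor(hidden_layer_sizes=(32, 32),
                        max_iter=2000).fit(X, observed_errors)

def pick_rate(model, query_feat, target_error,
              rates=np.linspace(0.01, 1.0, 100)):
    """Smallest sampling rate whose predicted error meets the target."""
    X = np.column_stack([rates, np.tile(query_feat, (len(rates), 1))])
    ok = rates[model.predict(X) <= target_error]
    return float(ok.min()) if ok.size else 1.0
```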