Characterizing and Understanding the Behavior of Quantized Models for
Reliable Deployment
- URL: http://arxiv.org/abs/2204.04220v1
- Date: Fri, 8 Apr 2022 11:19:16 GMT
- Title: Characterizing and Understanding the Behavior of Quantized Models for
Reliable Deployment
- Authors: Qiang Hu, Yuejun Guo, Maxime Cordy, Xiaofei Xie, Wei Ma, Mike
Papadakis, Yves Le Traon
- Abstract summary: Quantization-aware training can produce more stable models than standard, adversarial, and Mixup training.
Disagreements often have closer top-1 and top-2 output probabilities, and $Margin$ is a better indicator than the other uncertainty metrics for distinguishing disagreements.
We open-source our code and models as a new benchmark for further study of quantized models.
- Score: 32.01355605506855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have gained considerable attention in the past
decades due to their astounding performance in different applications, such as
natural language modeling, self-driving assistance, and source code
understanding. As the field has evolved rapidly, increasingly complex DNN
architectures have been proposed, along with ever-larger pre-trained model
parameters. The common way to use such DNN models on resource-constrained
devices (e.g., mobile phones) is to perform model compression before
deployment. However, recent research has demonstrated that model compression,
e.g., model quantization, causes accuracy degradation as well as output
disagreements, i.e., inputs on which the original and compressed models
predict different labels, when tested on unseen data. Since unseen data often
exhibit distribution shifts and appear in the wild, the quality and
reliability of quantized models are not guaranteed. In this paper, we conduct
a comprehensive study to characterize and help users understand the behaviors
of quantized models. Our study considers 4 datasets spanning image and text,
8 DNN architectures including feed-forward neural networks and recurrent
neural networks, and 42 shifted sets with both synthetic and natural
distribution shifts. The results reveal that 1) data with distribution shifts
trigger more disagreements than data without; 2) quantization-aware training
can produce more stable models than standard, adversarial, and Mixup
training; 3) disagreements often have closer top-1 and top-2 output
probabilities, and $Margin$, the gap between the two, is a better indicator
than the other uncertainty metrics for distinguishing disagreements; and 4)
retraining with disagreements has limited effectiveness in removing them. We
open-source our code and models as a new benchmark for further study of
quantized models.
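The $Margin$ finding lends itself to a concrete check. Below is a minimal sketch in Python/PyTorch of how one might flag likely disagreement inputs by the gap between the quantized model's top-1 and top-2 output probabilities; the function names and the 0.1 threshold are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn.functional as F

def margin_scores(logits: torch.Tensor) -> torch.Tensor:
    """Margin = p(top-1) - p(top-2) per input; a small margin means the
    model is torn between two classes, which the paper links to
    disagreement-prone inputs."""
    probs = F.softmax(logits, dim=-1)
    top2 = probs.topk(2, dim=-1).values  # (batch, 2)
    return top2[:, 0] - top2[:, 1]       # (batch,)

@torch.no_grad()
def flag_disagreements(fp32_model, int8_model, inputs, threshold=0.1):
    """Compare predictions of the original and quantized models, and
    flag inputs whose quantized-model Margin falls below `threshold`
    (an illustrative cutoff, not a value from the paper)."""
    fp32_pred = fp32_model(inputs).argmax(dim=-1)
    int8_logits = int8_model(inputs)
    int8_pred = int8_logits.argmax(dim=-1)

    disagree = fp32_pred != int8_pred                  # actual disagreements
    suspect = margin_scores(int8_logits) < threshold   # Margin-based flags
    return disagree, suspect
```

In the study's terms, a useful uncertainty metric is one whose flagged inputs (`suspect`) overlap heavily with the actual disagreements (`disagree`); the abstract reports that Margin separates disagreements better than the other metrics considered.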
Related papers
- ScatterUQ: Interactive Uncertainty Visualizations for Multiclass Deep Learning Problems [0.0]
ScatterUQ is an interactive system that provides targeted visualizations to allow users to better understand model performance in context-driven uncertainty settings.
We demonstrate the effectiveness of ScatterUQ to explain model uncertainty for a multiclass image classification on a distance-aware neural network trained on Fashion-MNIST.
Our results indicate that the ScatterUQ system should scale to arbitrary, multiclass datasets.
arXiv Detail & Related papers (2023-08-08T21:17:03Z)
- Neural Additive Models for Location Scale and Shape: A Framework for Interpretable Neural Regression Beyond the Mean [1.0923877073891446]
Deep neural networks (DNNs) have proven to be highly effective in a variety of tasks.
Despite this success, the inner workings of DNNs are often not transparent.
This lack of interpretability has led to increased research on inherently interpretable neural networks.
arXiv Detail & Related papers (2023-01-27T17:06:13Z)
- A Tale of Two Cities: Data and Configuration Variances in Robust Deep Learning [27.498927971861068]
Deep neural networks (DNNs) are widely used in many industries such as image recognition, supply chain, medical diagnosis, and autonomous driving.
Prior work has shown that the high accuracy of a DNN model does not imply high robustness, because the input data and external environment are constantly changing.
arXiv Detail & Related papers (2022-11-18T03:32:53Z)
- Two-stage Modeling for Prediction with Confidence [0.0]
It is difficult for neural networks to generalize under distribution shift.
We propose a novel two-stage model for the potential distribution shift problem.
We show that our model offers reliable predictions for the vast majority of datasets.
arXiv Detail & Related papers (2022-09-19T08:48:07Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and classify them.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
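As a rough illustration of that architecture, here is a hypothetical PyTorch sketch of a part-based pipeline; the mask pooling and layer shapes are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class PartBasedClassifier(nn.Module):
    """Hypothetical part-based pipeline: a segmentation network predicts
    per-part masks, and a tiny linear head classifies from the pooled
    part evidence rather than raw texture."""

    def __init__(self, segmenter: nn.Module, num_parts: int, num_classes: int):
        super().__init__()
        self.segmenter = segmenter  # assumed to output (batch, num_parts, H, W) logits
        self.head = nn.Linear(num_parts, num_classes)  # the "tiny classifier"

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        part_masks = self.segmenter(images).softmax(dim=1)  # per-pixel part assignment
        part_scores = part_masks.mean(dim=(2, 3))           # pooled evidence per part
        return self.head(part_scores)                       # trained end-to-end
```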
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Investigating the Relationship Between Dropout Regularization and Model Complexity in Neural Networks [0.0]
Dropout regularization serves to reduce variance in deep learning models.
We explore the relationship between the dropout rate and model complexity by training 2,000 neural networks.
We build neural networks that predict the optimal dropout rate given the number of hidden units in each dense layer.
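For intuition, a toy sketch of such a meta-model follows; it is entirely hypothetical, and in practice it would be fit on (layer width, best dropout rate) pairs collected from experiments like the paper's 2,000 trained networks.

```python
import torch
import torch.nn as nn

# Hypothetical meta-model in the spirit of the summary: map the number of
# hidden units in a dense layer to a predicted optimal dropout rate.
rate_predictor = nn.Sequential(
    nn.Linear(1, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
    nn.Sigmoid(),  # keeps the predicted dropout rate in (0, 1)
)

widths = torch.tensor([[64.0], [256.0], [1024.0]])  # hidden units per layer
predicted_rates = rate_predictor(widths)            # untrained here; shape (3, 1)
```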
arXiv Detail & Related papers (2021-08-14T23:49:33Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models, of statistical models combined with the roofline model, and of a refined roofline model.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
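A minimal PyTorch sketch of that idea follows; the hidden sizes are illustrative, and the paper additionally proposes specialized hidden units and regularizers not shown here.

```python
import torch
import torch.nn as nn

class NeuralAdditiveModel(nn.Module):
    """Minimal NAM sketch: one small subnetwork per input feature, with
    the per-feature contributions summed plus a learned bias."""

    def __init__(self, num_features: int, hidden: int = 32):
        super().__init__()
        self.feature_nets = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(num_features)
        ])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features); each column feeds its own subnetwork,
        # so each feature's contribution can be plotted and inspected.
        terms = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        return torch.cat(terms, dim=1).sum(dim=1, keepdim=True) + self.bias
```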
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of which utterances or tokens are dull, without any feature engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
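For intuition, here is one plausible formulation of such a batch-level dullness penalty in PyTorch; this is a hedged sketch in the spirit of MinAvgOut, not the paper's exact objective.

```python
import torch

def avg_out_dullness(step_probs: torch.Tensor) -> torch.Tensor:
    """One plausible batch-level dullness penalty: tokens that dominate
    the batch-average output distribution count as dull, and each
    decoding step is penalized by its overlap with that average.

    step_probs: (batch, steps, vocab) softmax outputs of a decoder.
    """
    avg_dist = step_probs.mean(dim=(0, 1))         # batch-average distribution
    overlap = (step_probs * avg_dist).sum(dim=-1)  # per-step dot product
    return overlap.mean()                          # minimize to raise diversity
```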
arXiv Detail & Related papers (2020-01-15T18:32:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.