A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast
and Rigorous Deep Learning
- URL: http://arxiv.org/abs/2002.03869v1
- Date: Mon, 10 Feb 2020 15:33:19 GMT
- Title: A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast
and Rigorous Deep Learning
- Authors: Christoph Lauter and Anastasia Volkova
- Abstract summary: Many papers experimentally observe that Deep Neural Networks (DNNs) can successfully run at almost ridiculously low precision.
This paper sheds some theoretical light upon why a DNN's FP accuracy stays high for low FP precision.
We present a software framework for FP error analysis for the inference phase of deep learning.
- Score: 1.5863809575305419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) represent a performance-hungry application.
Floating-Point (FP) and custom floating-point-like arithmetics satisfy this
hunger. While there is a need for speed, inference in DNNs does not seem to have
any need for precision. Many papers experimentally observe that DNNs can
successfully run at almost ridiculously low precision.
The aim of this paper is two-fold: first, to shed some theoretical light upon
why a DNN's FP accuracy stays high for low FP precision. We observe that the
loss of relative accuracy in the convolutional steps is recovered by the
activation layers, which are extremely well-conditioned. We give an
interpretation for the link between precision and accuracy in DNNs.
Second, the paper presents a software framework for semi-automatic FP error
analysis for the inference phase of deep learning. Compatible with common
TensorFlow/Keras models, it leverages the frugally-deep Python/C++ library to
transform a neural network into C++ code in order to analyze the network's need
for precision. This rigorous analysis is based on Interval and Affine
arithmetics to compute absolute and relative error bounds for a DNN. We
demonstrate our tool with several examples.
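To make the flavor of such an analysis concrete, here is a minimal illustrative sketch, not the authors' framework: it propagates interval bounds through one dense layer followed by a sigmoid activation, charging each dot product with the classical gamma_n rounding-error bound. The layer sizes, the choice of sigmoid, and the toy unit roundoff u = 2^-10 are assumptions made only for this example.

    # Minimal sketch: interval propagation of rounding error through one
    # dense layer + sigmoid (illustrative only; not the paper's tool).
    import numpy as np

    def interval_matvec(W, x_lo, x_hi, u):
        # Exact interval image of W @ x for x in [x_lo, x_hi].
        lo = np.minimum(W * x_lo, W * x_hi).sum(axis=1)
        hi = np.maximum(W * x_lo, W * x_hi).sum(axis=1)
        # Classical dot-product rounding bound with unit roundoff u:
        # |computed - exact| <= gamma_n * sum_j |w_ij| * |x_j|,
        # where gamma_n = n*u / (1 - n*u).
        n = W.shape[1]
        gamma = n * u / (1.0 - n * u)
        rad = gamma * (np.abs(W) * np.maximum(np.abs(x_lo), np.abs(x_hi))).sum(axis=1)
        return lo - rad, hi + rad

    def interval_sigmoid(lo, hi):
        # Sigmoid is monotone, so interval endpoints map to endpoints
        # (rounding error of evaluating sigmoid itself is ignored here).
        s = lambda t: 1.0 / (1.0 + np.exp(-t))
        return s(lo), s(hi)

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 8))
    x = rng.standard_normal(8)
    u = 2.0 ** -10                 # toy unit roundoff for a low-precision format

    lo, hi = interval_matvec(W, x, x, u)   # point input, FP rounding error only
    print("pre-activation  width:", np.max(hi - lo))
    lo, hi = interval_sigmoid(lo, hi)
    print("post-activation width:", np.max(hi - lo))

Even in this toy setting, the interval width after the saturating activation is no larger than before it, which is the flavor of the paper's observation that well-conditioned activation layers recover the accuracy lost in the linear steps. The paper's framework additionally uses affine arithmetic, which tracks correlations between error terms and typically gives tighter bounds over deep compositions of layers.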
Related papers
- Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on the intuition that neural operator learning inherently induces an approximation error.
We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of the predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- Automated machine learning for borehole resistivity measurements [0.0]
Deep neural networks (DNNs) offer a real-time solution for the inversion of borehole resistivity measurements.
It is possible to use extremely large DNNs to approximate the operators, but this demands considerable training time.
In this work, we propose a scoring function that accounts for the accuracy and size of the DNNs.
arXiv Detail & Related papers (2022-07-20T12:27:22Z)
- OMPQ: Orthogonal Mixed Precision Quantization [64.59700856607017]
Mixed precision quantization takes advantage of hardware's multiple bit-width arithmetic operations to unleash the full potential of network quantization.
We propose to optimize a proxy metric, network orthogonality, which is highly correlated with the loss of the integer programming formulation.
This approach reduces the search time and required data amount by orders of magnitude, with little compromise on quantization accuracy.
arXiv Detail & Related papers (2021-09-16T10:59:33Z)
- Development of Quantized DNN Library for Exact Hardware Emulation [0.17188280334580192]
Quantization is used to speed up execution time and save power when running Deep Neural Networks (DNNs) on edge devices like AI chips.
We have developed PyParch, a library that executes quantized DNNs with exactly the same behavior as hardware.
arXiv Detail & Related papers (2021-06-15T17:42:40Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization (a short sketch of Bounded ReLU appears after this list).
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML [13.325670094073383]
We present the implementation of binary and ternary neural networks in the hls4ml library.
We discuss the trade-off between model accuracy and resource consumption.
The binary and ternary implementations have performance similar to the higher-precision implementation while using drastically fewer FPGA resources.
arXiv Detail & Related papers (2020-03-11T10:46:51Z)
- Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations [22.71924873981158]
Precision gating (PG) is an end-to-end trainable dynamic dual-precision quantization technique for deep neural networks.
PG achieves excellent results on CNNs, including statically compressed mobile-friendly networks such as ShuffleNet.
Compared to 8-bit uniform quantization, PG obtains a 1.2% improvement in perplexity per word with a 2.7x reduction in computational cost on an LSTM.
arXiv Detail & Related papers (2020-02-17T18:54:37Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
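For context on this last entry: "minimax optimal" refers to the classical benchmark from nonparametric regression (a standard fact, not a quote from the paper's abstract). For a β-Hölder target on [0,1]^d estimated from n samples under squared L2 loss, the best attainable worst-case risk scales, up to constants and logarithmic factors, as

\[
  \inf_{\hat f}\ \sup_{f \in \mathcal{H}^{\beta}([0,1]^d)} \mathbb{E}\,\lVert \hat f - f \rVert_{L^2}^2 \;\asymp\; n^{-\frac{2\beta}{2\beta+d}},
\]

and the cited result shows that ResNet-type CNN estimators can attain rates of this form for the Hölder class, with corresponding rates derived for the Barron class as well.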
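Returning to the integer-arithmetic-only CNN entry above: the Bounded ReLU it mentions is, in its usual form, a ReLU clipped at a fixed upper bound so that the activation range maps onto a finite integer grid. A minimal sketch follows; the bound B = 6 and the 8-bit grid are illustrative assumptions, not values taken from that paper.

    # Illustrative Bounded ReLU + uniform activation quantization
    # (B and the bit-width are assumptions, not the cited paper's values).
    import numpy as np

    def bounded_relu(x, B=6.0):
        # Clip to [0, B]; an unbounded ReLU has no finite integer range.
        return np.clip(x, 0.0, B)

    def fake_quantize_activation(x, B=6.0, bits=8):
        # Round bounded activations onto a (2**bits - 1)-level uniform grid,
        # then map back; downstream layers only ever see grid values.
        scale = (2 ** bits - 1) / B
        return np.round(bounded_relu(x) * scale) / scale

    print(fake_quantize_activation(np.array([-1.0, 0.5, 3.14, 42.0])))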