A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast
and Rigorous Deep Learning
- URL: http://arxiv.org/abs/2002.03869v1
- Date: Mon, 10 Feb 2020 15:33:19 GMT
- Title: A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast
and Rigorous Deep Learning
- Authors: Christoph Lauter and Anastasia Volkova
- Abstract summary: Many papers experimentally observe that Deep Neural Networks (DNNs) can successfully run at almost ridiculously low precision.
This paper sheds some theoretical light upon why a DNN's FP accuracy stays high for low FP precision.
We present a software framework for FP error analysis for the inference phase of deep learning.
- Score: 1.5863809575305419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) represent a performance-hungry application.
Floating-Point (FP) and custom floating-point-like arithmetics satisfy this
hunger. While there is a need for speed, inference in DNNs does not seem to have
any need for precision. Many papers experimentally observe that DNNs can
successfully run at almost ridiculously low precision.
The aim of this paper is two-fold: first, to shed some theoretical light upon
why a DNN's FP accuracy stays high for low FP precision. We observe that the
loss of relative accuracy in the convolutional steps is recovered by the
activation layers, which are extremely well-conditioned. We give an
interpretation for the link between precision and accuracy in DNNs.
Second, the paper presents a software framework for semi-automatic FP error
analysis for the inference phase of deep learning. Compatible with common
TensorFlow/Keras models, it leverages the frugally-deep Python/C++ library to
transform a neural network into C++ code in order to analyze the network's need
for precision. This rigorous analysis is based on Interval and Affine
arithmetics to compute absolute and relative error bounds for a DNN. We
demonstrate our tool with several examples.
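To make the flavor of such an analysis concrete, here is a minimal illustrative sketch, not the authors' framework: it propagates interval bounds through one dense layer followed by a sigmoid activation, charging each dot product with the classical gamma_n rounding-error bound. The layer sizes, the choice of sigmoid, and the toy unit roundoff u = 2^-10 are assumptions made only for this example.

    # Minimal sketch: interval propagation of rounding error through one
    # dense layer + sigmoid (illustrative only; not the paper's tool).
    import numpy as np

    def interval_matvec(W, x_lo, x_hi, u):
        # Exact interval image of W @ x for x in [x_lo, x_hi].
        lo = np.minimum(W * x_lo, W * x_hi).sum(axis=1)
        hi = np.maximum(W * x_lo, W * x_hi).sum(axis=1)
        # Classical dot-product rounding bound with unit roundoff u:
        # |computed - exact| <= gamma_n * sum_j |w_ij| * |x_j|,
        # where gamma_n = n*u / (1 - n*u).
        n = W.shape[1]
        gamma = n * u / (1.0 - n * u)
        rad = gamma * (np.abs(W) * np.maximum(np.abs(x_lo), np.abs(x_hi))).sum(axis=1)
        return lo - rad, hi + rad

    def interval_sigmoid(lo, hi):
        # Sigmoid is monotone, so interval endpoints map to endpoints
        # (rounding error of evaluating sigmoid itself is ignored here).
        s = lambda t: 1.0 / (1.0 + np.exp(-t))
        return s(lo), s(hi)

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 8))
    x = rng.standard_normal(8)
    u = 2.0 ** -10                 # toy unit roundoff for a low-precision format

    lo, hi = interval_matvec(W, x, x, u)   # point input, FP rounding error only
    print("pre-activation  width:", np.max(hi - lo))
    lo, hi = interval_sigmoid(lo, hi)
    print("post-activation width:", np.max(hi - lo))

Even in this toy setting, the interval width after the saturating activation is no larger than before it, which is the flavor of the paper's observation that well-conditioned activation layers recover the accuracy lost in the linear steps. The paper's framework additionally uses affine arithmetic, which tracks correlations between error terms and typically gives tighter bounds over deep compositions of layers.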
Related papers
- Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on the intuition that neural operator learning inherently induces an approximation error.
We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of the predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- Automated machine learning for borehole resistivity measurements [0.0]
Deep neural networks (DNNs) offer a real-time solution for the inversion of borehole resistivity measurements.
It is possible to use extremely large DNNs to approximate the operators, but this demands considerable training time.
In this work, we propose a scoring function that accounts for the accuracy and size of the DNNs.
arXiv Detail & Related papers (2022-07-20T12:27:22Z)
- OMPQ: Orthogonal Mixed Precision Quantization [64.59700856607017]
Mixed precision quantization takes advantage of hardware's multiple bit-width arithmetic operations to unleash the full potential of network quantization.
We propose to optimize a proxy metric, network orthogonality, which is highly correlated with the loss of the integer programming formulation.
This approach reduces the search time and required data amount by orders of magnitude, with little compromise on quantization accuracy.
arXiv Detail & Related papers (2021-09-16T10:59:33Z)
- Development of Quantized DNN Library for Exact Hardware Emulation [0.17188280334580192]
Quantization is used to speed up execution time and save power when running Deep Neural Networks (DNNs) on edge devices like AI chips.
We have developed PyParch, a library that executes quantized DNNs with exactly the same behavior as hardware.
arXiv Detail & Related papers (2021-06-15T17:42:40Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization (a short sketch of Bounded ReLU appears after this list).
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML [13.325670094073383]
We present the implementation of binary and ternary neural networks in the hls4ml library.
We discuss the trade-off between model accuracy and resource consumption.
The binary and ternary implementations have performance similar to the higher-precision implementation while using drastically fewer FPGA resources.
arXiv Detail & Related papers (2020-03-11T10:46:51Z)
- Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations [22.71924873981158]
Precision gating (PG) is an end-to-end trainable dynamic dual-precision quantization technique for deep neural networks.
PG achieves excellent results on CNNs, including statically compressed mobile-friendly networks such as ShuffleNet.
Compared to 8-bit uniform quantization, PG obtains a 1.2% improvement in perplexity per word with a 2.7x reduction in computational cost on an LSTM.
arXiv Detail & Related papers (2020-02-17T18:54:37Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
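For context on this last entry: "minimax optimal" refers to the classical benchmark from nonparametric regression (a standard fact, not a quote from the paper's abstract). For a β-Hölder target on [0,1]^d estimated from n samples under squared L2 loss, the best attainable worst-case risk scales, up to constants and logarithmic factors, as

\[
  \inf_{\hat f}\ \sup_{f \in \mathcal{H}^{\beta}([0,1]^d)} \mathbb{E}\,\lVert \hat f - f \rVert_{L^2}^2 \;\asymp\; n^{-\frac{2\beta}{2\beta+d}},
\]

and the cited result shows that ResNet-type CNN estimators can attain rates of this form for the Hölder class, with corresponding rates derived for the Barron class as well.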
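Returning to the integer-arithmetic-only CNN entry above: the Bounded ReLU it mentions is, in its usual form, a ReLU clipped at a fixed upper bound so that the activation range maps onto a finite integer grid. A minimal sketch follows; the bound B = 6 and the 8-bit grid are illustrative assumptions, not values taken from that paper.

    # Illustrative Bounded ReLU + uniform activation quantization
    # (B and the bit-width are assumptions, not the cited paper's values).
    import numpy as np

    def bounded_relu(x, B=6.0):
        # Clip to [0, B]; an unbounded ReLU has no finite integer range.
        return np.clip(x, 0.0, B)

    def fake_quantize_activation(x, B=6.0, bits=8):
        # Round bounded activations onto a (2**bits - 1)-level uniform grid,
        # then map back; downstream layers only ever see grid values.
        scale = (2 ** bits - 1) / B
        return np.round(bounded_relu(x) * scale) / scale

    print(fake_quantize_activation(np.array([-1.0, 0.5, 3.14, 42.0])))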