Efficient and Mathematically Robust Operations for Certified Neural
Networks Inference
- URL: http://arxiv.org/abs/2401.08225v1
- Date: Tue, 16 Jan 2024 09:22:38 GMT
- Title: Efficient and Mathematically Robust Operations for Certified Neural
Networks Inference
- Authors: Fabien Geyer, Johannes Freitag, Tobias Schulz, Sascha Uhrig
- Abstract summary: Concerns about certification have come up for machine learning (ML) and neural networks (NNs).
This paper delves into the inference stage and the requisite hardware, highlighting the challenges associated with IEEE 754 floating-point arithmetic.
By evaluating diverse summation and dot product algorithms, we aim to mitigate issues related to non-associativity.
- Score: 3.666326242924816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, machine learning (ML) and neural networks (NNs) have gained
widespread use and attention across various domains, particularly in
transportation for achieving autonomy, including the emergence of flying taxis
for urban air mobility (UAM). However, concerns about certification have come
up, compelling the development of standardized processes encompassing the
entire ML and NN pipeline. This paper delves into the inference stage and the
requisite hardware, highlighting the challenges associated with IEEE 754
floating-point arithmetic and proposing alternative number representations. By
evaluating diverse summation and dot product algorithms, we aim to mitigate
issues related to non-associativity. Additionally, our exploration of
fixed-point arithmetic reveals its advantages over floating-point methods,
demonstrating significant hardware efficiencies. Employing an empirical
approach, we ascertain the optimal bit-width necessary to attain an acceptable
level of accuracy, considering the inherent complexity of bit-width
optimization.
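The non-associativity issue named in the abstract can be seen directly with IEEE 754 doubles. The sketch below contrasts naive left-to-right summation with a compensated (Kahan-Babuska/Neumaier) variant, one representative of the family of summation algorithms such a study evaluates; it is an illustration, not the paper's benchmarked implementation.

```python
def naive_sum(values):
    """Left-to-right summation; the result depends on operand order."""
    total = 0.0
    for v in values:
        total += v
    return total

def neumaier_sum(values):
    """Kahan-Babuska (Neumaier) compensated summation: accumulates the
    rounding error of each addition and folds it back in at the end."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            comp += (total - t) + v  # low-order bits of v were lost
        else:
            comp += (v - t) + total  # low-order bits of total were lost
        total = t
    return total + comp

# Addition order changes the naive result:
print(naive_sum([1e16, 1.0, -1e16]))     # 0.0 -- the 1.0 is absorbed
print(naive_sum([1e16, -1e16, 1.0]))     # 1.0
# Compensated summation recovers the exact value regardless of order:
print(neumaier_sum([1e16, 1.0, -1e16]))  # 1.0
```

For certification, the point is that a reordered reduction (e.g. a parallel dot product) can change the bit-exact result; compensated algorithms shrink that dependence at the cost of extra operations.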
Related papers
- Energy-Aware FPGA Implementation of Spiking Neural Network with LIF Neurons [0.5243460995467893]
Spiking Neural Networks (SNNs) stand out as a cutting-edge solution for TinyML.
This paper presents a novel SNN architecture based on the 1st Order Leaky Integrate-and-Fire (LIF) neuron model.
A hardware-friendly LIF design is also proposed, and implemented on a Xilinx Artix-7 FPGA.
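The first-order LIF model mentioned above reduces to a simple discrete-time update: leak, integrate, fire, reset. The sketch below is a generic illustration; the leak factor, threshold, and reset-to-zero behaviour are assumptions, not the paper's FPGA design.

```python
def lif_step(v, i_in, beta=0.9, threshold=1.0):
    """One discrete-time first-order LIF update."""
    v = beta * v + i_in      # leaky integration of the input current
    spike = v >= threshold   # fire when the potential crosses threshold
    if spike:
        v = 0.0              # hard reset after a spike
    return v, spike

# Drive the neuron with a constant input and count the emitted spikes
v, spikes = 0.0, []
for _ in range(20):
    v, s = lif_step(v, 0.3)
    spikes.append(s)
print(sum(spikes))  # 5 -- the neuron fires every fourth step
```

The multiply-by-beta leak is what makes hardware-friendly variants attractive: choosing beta as a power-of-two fraction turns it into a shift.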
arXiv Detail & Related papers (2024-11-03T16:42:10Z)
- PIPE: Parallelized Inference Through Post-Training Quantization Ensembling of Residual Expansions [23.1120983784623]
PIPE is a quantization method that leverages residual error expansion, along with group sparsity and an ensemble approximation for better parallelization.
It achieves superior performance on every benchmarked application (from vision to NLP tasks), architecture (ConvNets, transformers) and bit-width.
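The residual-error expansion behind this approach can be sketched as follows: the weights are approximated by a sum of quantized terms, each encoding the rounding error left by the previous one. The uniform quantizer, fixed scales, and expansion order here are illustrative assumptions, not PIPE's exact scheme.

```python
def quantize(values, scale):
    """Symmetric uniform quantizer: snap each value to a grid of step `scale`."""
    return [round(v / scale) * scale for v in values]

def residual_expansion(weights, scale, order):
    """Return `order` quantized terms whose sum approximates `weights`."""
    terms = []
    residual = list(weights)
    for _ in range(order):
        q = quantize(residual, scale)
        terms.append(q)
        residual = [r - qi for r, qi in zip(residual, q)]
        scale /= 16.0  # finer grid for each successive residual term
    return terms

w = [0.337, -0.129, 0.854]
terms = residual_expansion(w, scale=0.25, order=2)
approx = [sum(t) for t in zip(*terms)]
err = max(abs(a - b) for a, b in zip(w, approx))
print(approx, err)
```

Because the terms are independent given the input, they can be evaluated in parallel and summed, which is the parallelization angle the summary refers to.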
arXiv Detail & Related papers (2023-11-27T13:29:34Z)
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs.
We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- FiFo: Fishbone Forwarding in Massive IoT Networks [3.7813000491275233]
Massive Internet of Things (IoT) networks have a wide range of applications, including but not limited to the rapid delivery of emergency and disaster messages.
Several benchmark algorithms have been developed for message delivery in such applications, but they pose practical challenges.
We propose a novel and effective forwarding method, fishbone forwarding (FiFo), which aims to improve the forwarding efficiency with acceptable computational complexity.
arXiv Detail & Related papers (2022-11-02T15:55:17Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- ExPAN(N)D: Exploring Posits for Efficient Artificial Neural Network Design in FPGA-based Systems [4.2612881037640085]
This paper analyzes the efficacy of the Posit number representation scheme and the efficiency of fixed-point arithmetic implementations for ANNs.
We propose a novel Posit to fixed-point converter for enabling high-performance and energy-efficient hardware implementations for ANNs.
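The fixed-point side of this comparison can be sketched with integer arithmetic: operands are scaled to integers, accumulated with exact integer multiply-accumulates, and rescaled once at the end, avoiding per-operation rounding entirely. The Q-format (8 fractional bits) below is an illustrative choice, not the paper's converter.

```python
FRAC_BITS = 8          # Q-format: 8 fractional bits
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Quantize a real value to a fixed-point integer."""
    return round(x * SCALE)

def fixed_dot(xs, ws):
    """Integer multiply-accumulate: exact and order-independent,
    so the non-associativity of floating point does not arise."""
    acc = 0                        # wide integer accumulator
    for x, w in zip(xs, ws):
        acc += to_fixed(x) * to_fixed(w)
    return acc / (SCALE * SCALE)   # single rescale back to a real value

x = [0.5, -0.25, 0.75]
w = [1.0, 0.5, -0.125]
print(fixed_dot(x, w))   # 0.28125, matching the exact dot product
```

In hardware this maps to plain integer multipliers and adders, which is where the efficiency gains over floating-point units come from; the accuracy cost is the quantization of the operands.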
arXiv Detail & Related papers (2020-10-24T11:02:25Z)
- Distributed Optimization, Averaging via ADMM, and Network Topology [0.0]
We study the connection between network topology and convergence rates for different algorithms on a real world problem of sensor localization.
We also show interesting connections between ADMM and lifted Markov chains, besides providing an explicit characterization of its convergence.
arXiv Detail & Related papers (2020-09-05T21:44:39Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.