Related papers: PLAM: a Posit Logarithm-Approximate Multiplier for Power Efficient Posit-based DNNs

URL: http://arxiv.org/abs/2102.09262v1
Date: Thu, 18 Feb 2021 10:43:07 GMT
Title: PLAM: a Posit Logarithm-Approximate Multiplier for Power Efficient Posit-based DNNs
Authors: Raul Murillo, Alberto A. Del Barrio, Guillermo Botella, Min Soo Kim, HyunJin Kim and Nader Bagherzadeh
Abstract summary: The Posit Number System was introduced in 2017 as a replacement for floating-point numbers. This paper proposes a Posit Logarithm-Approximate multiplication scheme to significantly reduce the complexity of posit multipliers. Experiments show that the proposed technique reduces the area, power, and delay of hardware multipliers up to 72.86%, 81.79%, and 17.01%, respectively, without accuracy degradation.
Score: 8.623938357911467
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Posit Number System was introduced in 2017 as a replacement for floating-point numbers. Since then, the community has explored its application in Neural Network related tasks and produced some unit designs which are still far from being competitive with their floating-point counterparts. This paper proposes a Posit Logarithm-Approximate Multiplication (PLAM) scheme to significantly reduce the complexity of posit multipliers, the most power-hungry units within Deep Neural Network architectures. When comparing with state-of-the-art posit multipliers, experiments show that the proposed technique reduces the area, power, and delay of hardware multipliers up to 72.86%, 81.79%, and 17.01%, respectively, without accuracy degradation.
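The abstract does not spell out the approximation itself. As a rough illustration of the general idea behind logarithm-approximate multiplication, the sketch below uses Mitchell's classical approximation, log2(1 + f) ≈ f for f in [0, 1), so that a multiplication becomes an addition of approximate logarithms. This is an assumption made for illustration: it works on ordinary Python floats rather than posit encodings, the function name `mitchell_multiply` is hypothetical, and it should not be read as the PLAM hardware design.

```python
import math


def mitchell_multiply(a: float, b: float) -> float:
    """Approximate a * b for positive operands via Mitchell's log approximation.

    Illustrative sketch only: real hardware would operate on the decoded
    fields of the number format, not on Python floats.
    """
    if a <= 0 or b <= 0:
        raise ValueError("this sketch handles positive operands only")

    # math.frexp returns (m, e) with a == m * 2**e and m in [0.5, 1).
    # Rescale so that a == (1 + f) * 2**k with f in [0, 1).
    ma, ea = math.frexp(a)
    mb, eb = math.frexp(b)
    fa, ka = 2.0 * ma - 1.0, ea - 1
    fb, kb = 2.0 * mb - 1.0, eb - 1

    # Mitchell: log2(a) is approximated by k + f, so the product's
    # approximate logarithm is a plain sum.
    log_sum = (ka + fa) + (kb + fb)

    # Approximate antilogarithm: split into integer and fractional parts
    # and apply the same linear approximation in reverse.
    k = math.floor(log_sum)
    f = log_sum - k
    return math.ldexp(1.0 + f, k)


if __name__ == "__main__":
    for x, y in [(2.0, 4.0), (3.0, 3.0), (1.25, 6.5)]:
        approx = mitchell_multiply(x, y)
        print(f"{x} * {y}: exact={x * y}, approx={approx}")
```

The approximation never overestimates the product, and its relative error is bounded by 1/9 (about 11.1%), reached when both fractional parts equal 0.5, as with 3 × 3. Power-of-two operands are multiplied exactly.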

Related papers

Low Power Approximate Multiplier Architecture for Deep Neural Networks [0.0]
A 4:2 compressor, introducing only a single combination error, is designed and integrated into an 8x8 unsigned multiplier. The proposed multiplier is employed within a custom convolution layer and evaluated on neural network tasks, including image recognition and denoising.
arXiv Detail & Related papers (2025-08-31T09:25:42Z)
Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
An Efficient General-Purpose Optical Accelerator for Neural Networks [4.236129222287313]
General-purpose optical accelerators (GOAs) have emerged as a promising platform to accelerate deep neural networks (DNNs). In this work, a hybrid GOA architecture is proposed to enhance the mapping efficiency of neural networks onto the GOA. The energy consumption and computation latency can also be reduced by over 67% and 21%, respectively.
arXiv Detail & Related papers (2024-09-02T13:04:08Z)
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications [9.253002604030085]
Posit has been a promising alternative to the IEEE-754 floating point format for deep learning applications. It has been implemented by either the combination of multipliers and an adder tree or cascaded fused multiply-add units, leading to poor computational efficiency and excessive hardware overhead. We propose an open-source posit dot-product unit, namely PDPU, that facilitates resource-efficient and high-throughput dot-product hardware implementation.
arXiv Detail & Related papers (2023-02-03T17:26:12Z)
Low-bit Shift Network for End-to-End Spoken Language Understanding [7.851607739211987]
We propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity by removing expensive multiplication operations and with the use of low-bit weights.
arXiv Detail & Related papers (2022-07-15T14:34:22Z)
A Survey of Quantization Methods for Efficient Neural Network Inference [75.55159744950859]
Quantization is the problem of distributing continuous real-valued numbers over a fixed discrete set of numbers to minimize the number of bits required. It has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x.
arXiv Detail & Related papers (2021-03-25T06:57:11Z)
Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference [56.24109486973292]
We study the interplay between pruning and quantization during the training of neural networks for ultra low latency applications. We find that quantization-aware pruning yields more computationally efficient models than either pruning or quantization alone for our task.
arXiv Detail & Related papers (2021-02-22T19:00:05Z)
ExPAN(N)D: Exploring Posits for Efficient Artificial Neural Network Design in FPGA-based Systems [4.2612881037640085]
This paper analyzes and ingathers the efficacy of the Posit number representation scheme and the efficiency of fixed-point arithmetic implementations for ANNs. We propose a novel Posit to fixed-point converter for enabling high-performance and energy-efficient hardware implementations for ANNs.
arXiv Detail & Related papers (2020-10-24T11:02:25Z)
Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures. Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action. We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
Floating-Point Multiplication Using Neuromorphic Computing [3.5450828190071655]
We describe a neuromorphic system that performs IEEE 754-compliant floating-point multiplication. We study the effect of the number of neurons per bit on accuracy and bit error rate, and estimate the optimal number of neurons needed for each component.
arXiv Detail & Related papers (2020-08-30T19:07:14Z)
ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF). ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic [57.07483440807549]
We propose a method that adapts neural networks to use low-resolution (8-bit) additions in the accumulators, achieving classification accuracy comparable to their 32-bit counterparts. We demonstrate the efficacy of our approach on both software and hardware platforms.
arXiv Detail & Related papers (2020-07-26T23:18:38Z)
AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation. Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.