Related papers: MajorityNets: BNNs Utilising Approximate Popcount for Improved Efficiency

URL: http://arxiv.org/abs/2002.12900v1
Date: Thu, 27 Feb 2020 04:02:43 GMT
Title: MajorityNets: BNNs Utilising Approximate Popcount for Improved Efficiency
Authors: Seyedramin Rasoulinezhad, Sean Fox, Hao Zhou, Lingli Wang, David Boland, Philip H.W. Leong
Abstract summary: This paper proposes a smaller, faster, more energy-efficient approximate replacement for the XnorPopcount operation, called XNorMaj. We show that XNorMaj is up to 2x more resource-efficient than the XnorPopcount operation.
Score: 13.186127108769615
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Binarized neural networks (BNNs) have shown exciting potential for utilising neural networks in embedded implementations where area, energy and latency constraints are paramount. With BNNs, multiply-accumulate (MAC) operations can be simplified to XnorPopcount operations, leading to massive reductions in both memory and computation resources. Furthermore, multiple efficient implementations of BNNs have been reported on field-programmable gate array (FPGA) implementations. This paper proposes a smaller, faster, more energy-efficient approximate replacement for the XnorPopcount operation, called XNorMaj, inspired by state-of-the-art FPGA look-up table schemes which benefit FPGA implementations. We show that XNorMaj is up to 2x more resource-efficient than the XnorPopcount operation. While the XNorMaj operation has a minor detrimental impact on accuracy, the resource savings enable us to use larger networks to recover the loss.
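
The simplification of a MAC to an XnorPopcount mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes {-1, +1} values packed into integers with bit 1 standing for +1 and bit 0 for -1. The majority-based variant is only an illustrative guess at how a majority function could stand in for part of the popcount; the group size, the rescaling, and all function names are assumptions, not the paper's definition of XNorMaj.

```python
def pack(values):
    """Pack a list of -1/+1 values into an int (bit i set means values[i] == +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits, w_bits, n):
    """Exact dot product of two n-element {-1, +1} vectors via XNOR and popcount."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask      # bit is 1 where the two signs agree
    matches = bin(xnor).count("1")        # popcount
    return 2 * matches - n                # matches contribute +1, mismatches -1


def xnor_maj_dot(a_bits, w_bits, n, group=3):
    """Hypothetical approximation: majority vote per group, then popcount of votes.

    Assumption for illustration only: each group of `group` XNOR outputs is
    reduced to a single majority bit, and the result is rescaled by `group`.
    """
    if group % 2 == 0 or n % group != 0:
        raise ValueError("group must be odd and divide n")
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask
    group_mask = (1 << group) - 1
    n_groups = n // group
    votes = 0
    for g in range(n_groups):
        chunk = (xnor >> (g * group)) & group_mask
        if bin(chunk).count("1") > group // 2:   # majority of the group agrees
            votes += 1
    return group * (2 * votes - n_groups)


a = [1, -1, 1, 1, -1, 1]
w = [1, 1, 1, -1, -1, 1]
exact = xnor_popcount_dot(pack(a), pack(w), len(a))
approx = xnor_maj_dot(pack(a), pack(w), len(a))
print(exact, approx)
```

For these inputs the exact result equals the ordinary dot product (2), while the majority sketch returns 6. The gap shows the kind of approximation error that the abstract describes as a minor accuracy cost.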

Related papers

S$^2$NN: Sub-bit Spiking Neural Networks [53.08060832135342]
Spiking Neural Networks (SNNs) offer an energy-efficient paradigm for machine intelligence. Despite recent advances in binary SNNs, the storage and computational demands remain substantial for large-scale networks. We propose Sub-bit Spiking Neural Networks (S$^2$NNs) that represent weights with less than one bit.
arXiv Detail & Related papers (2025-09-29T04:17:44Z)
da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs [5.979741271992278]
We propose an efficient algorithm for implementing constant matrix-vector multiplication (CMVM) operations with distributed arithmetic (DA) on FPGAs. The algorithm achieves resource reduction similar to state-of-the-art algorithms while being significantly faster to compute. We show that the proposed algorithm can reduce on-chip resources by up to a third for realistic, highly quantized neural networks while simultaneously reducing latency.
arXiv Detail & Related papers (2025-07-06T21:01:32Z)
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions [2.7086888205833968]
Field-Programmable Gate Array (FPGA) accelerators have proven successful in handling latency- and resource-critical deep neural network (DNN) inference tasks. We propose relaxing the boundaries of neurons and mapping entire sub-networks to a single LUT. We validate our proposed method on a known latency-critical task, jet substructure tagging, and on the classical computer vision task, digit classification using MNIST.
arXiv Detail & Related papers (2024-02-29T16:10:21Z)
An Optical XNOR-Bitcount Based Accelerator for Efficient Inference of Binary Neural Networks [0.0]
We invent a single-MRR-based optical XNOR gate (OXG). We present a novel design of bitcount circuit, which we refer to as Photo-Charge Accumulator (PCA). Our evaluation for the inference of four modern BNNs indicates that OXBNN provides improvements of up to 62x and 7.6x in frames-per-second (FPS) and FPS/W (energy efficiency).
arXiv Detail & Related papers (2023-02-03T20:56:01Z)
FireFly: A High-Throughput Hardware Accelerator for Spiking Neural Networks with Efficient DSP and Memory Optimization [6.966706170499345]
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency. Most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements. We propose an FPGA accelerator that can process spikes generated by the firing neuron on-the-fly (FireFly).
arXiv Detail & Related papers (2023-01-05T04:28:07Z)
Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors. Our work is the first attempt to optimize BNNs from the bilinear perspective. We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
POEM: 1-bit Point-wise Operations based on Expectation-Maximization for Efficient Point Cloud Processing [53.74076015905961]
We introduce point-wise operations based on Expectation-Maximization into BNNs for efficient point cloud processing. Our POEM surpasses the existing state-of-the-art binary point cloud networks by a significant margin, up to 6.7%.
arXiv Detail & Related papers (2021-11-26T09:45:01Z)
Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs. SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space. Experiments on visual recognition benchmarks and the hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks. We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic [4.119948826527649]
Field-programmable gate array (FPGA)-based accelerators are gaining traction as a serious contender to replace graphics processing unit/central processing unit-based platforms. This paper presents NullaNet Tiny, a framework for constructing resource and energy-efficient, ultra-low-latency FPGA-based neural network accelerators.
arXiv Detail & Related papers (2021-04-07T00:16:39Z)
FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations [20.218382369944152]
Binary neural networks (BNNs) have 1-bit weights and activations. BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet. This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs.
arXiv Detail & Related papers (2020-12-22T17:49:30Z)
ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient multiplication-less deep neural network. It leads to both energy-efficient inference and training, without compromising expressive capacity. ShiftAddNet aggressively reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracies.
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
Binarized Graph Neural Network [65.20589262811677]
We develop a binarized graph neural network to learn the binary representations of the nodes with binary network parameters. Our proposed method can be seamlessly integrated into the existing GNN-based embedding approaches. Experiments indicate that the proposed binarized graph neural network, namely BGN, is orders of magnitude more efficient in terms of both time and space.
arXiv Detail & Related papers (2020-04-19T09:43:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.