SupeRBNN: Randomized Binary Neural Network Using Adiabatic
Superconductor Josephson Devices
- URL: http://arxiv.org/abs/2309.12212v1
- Date: Thu, 21 Sep 2023 16:14:42 GMT
- Title: SupeRBNN: Randomized Binary Neural Network Using Adiabatic
Superconductor Josephson Devices
- Authors: Zhengang Li, Geng Yuan, Tomoharu Yamauchi, Zabihi Masoud, Yanyue Xie,
Peiyan Dong, Xulong Tang, Nobuyuki Yoshikawa, Devesh Tiwari, Yanzhi Wang,
Olivia Chen
- Abstract summary: AQFP devices serve as excellent carriers for binary neural network (BNN) computations.
We propose SupeRBNN, an AQFP-based randomized BNN acceleration framework.
We show that our design achieves an energy efficiency approximately 7.8x10^4 times higher than that of the ReRAM-based BNN framework.
- Score: 44.440915387556544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic with
extremely high energy efficiency. By employing the distinct polarity of current
to denote logic `0' and `1', AQFP devices serve as excellent carriers for
binary neural network (BNN) computations. Although recent research has made
initial strides toward developing an AQFP-based BNN accelerator, several
critical challenges remain, preventing the design from being a comprehensive
solution. In this paper, we propose SupeRBNN, an AQFP-based randomized BNN
acceleration framework that leverages software-hardware co-optimization to
eventually make the AQFP devices a feasible solution for BNN acceleration.
Specifically, we investigate the randomized behavior of the AQFP devices and
analyze the impact of crossbar size on current attenuation, subsequently
formulating the current amplitude into the values suitable for use in BNN
computation. To tackle the accumulation problem and improve overall hardware
performance, we propose a stochastic computing-based accumulation module and a
clocking scheme adjustment-based circuit optimization method. We validate our
SupeRBNN framework across various datasets and network architectures, comparing
it with implementations based on different technologies, including CMOS, ReRAM,
and superconducting RSFQ/ERSFQ. Experimental results demonstrate that our
design achieves an energy efficiency of approximately 7.8x10^4 times higher
than that of the ReRAM-based BNN framework while maintaining a similar level of
model accuracy. Furthermore, when compared with superconductor-based
counterparts, our framework demonstrates at least two orders of magnitude
higher energy efficiency.
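To make the abstract's terms concrete, here is a minimal Python sketch of a binary dot product and of a randomized binarization step. It is an illustration only, not the SupeRBNN implementation: the logistic switching curve, the `scale` parameter, and all function names are assumptions introduced here, and the abstract does not give the actual AQFP device model or the design of the accumulation module.

```python
import math
import random


def binarize(x):
    """Deterministic sign binarization: map a real value to +1 or -1."""
    return 1 if x >= 0 else -1


def randomized_binarize(x, scale=1.0, rng=random):
    """Return +1 with probability sigmoid(x / scale), otherwise -1.

    Assumption: the logistic curve and `scale` are placeholders for a
    device-specific switching probability; they are not from the paper.
    """
    p_plus = 1.0 / (1.0 + math.exp(-x / scale))
    return 1 if rng.random() < p_plus else -1


def bnn_dot(weights, activations):
    """Dot product of two vectors whose entries are -1 or +1."""
    return sum(w * a for w, a in zip(weights, activations))


def xnor_popcount_dot(w_bits, a_bits):
    """Same dot product with a 0/1 encoding: 2 * (matching bits) - n."""
    n = len(w_bits)
    matches = sum(1 for w, a in zip(w_bits, a_bits) if w == a)
    return 2 * matches - n


def mean_output(x, n_samples, scale=1.0, rng=random):
    """Average many randomized outputs for the same input x.

    Under the logistic assumption the expected value is
    2 * sigmoid(x / scale) - 1 = tanh(x / (2 * scale)).
    """
    total = sum(randomized_binarize(x, scale, rng) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    w = [1, -1, 1, 1]
    a = [1, 1, -1, 1]
    print("dot product:", bnn_dot(w, a))
    print("randomized sign of dot product:", randomized_binarize(bnn_dot(w, a)))
    print("mean of 1000 randomized outputs for x=1.0:", mean_output(1.0, 1000))
```

Averaging repeated randomized outputs recovers a smooth function of the input amplitude, which is the general idea behind stochastic-computing-style accumulation; how SupeRBNN realizes this in hardware is described in the paper itself.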
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
- Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs.
At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads.
At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
- Energy-Efficient On-Board Radio Resource Management for Satellite Communications via Neuromorphic Computing [59.40731173370976]
We investigate the application of energy-efficient brain-inspired machine learning models for on-board radio resource management.
For relevant workloads, spiking neural networks (SNNs) implemented on Loihi 2 yield higher accuracy, while reducing power consumption by more than 100$\times$ compared to the CNN-based reference platform.
arXiv Detail & Related papers (2023-08-22T03:13:57Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- High-Performance FPGA-based Accelerator for Bayesian Recurrent Neural Networks [2.0631735969348064]
We propose an FPGA-based hardware design to accelerate Bayesian LSTM-based RNNs.
Compared with GPU implementation, our FPGA-based design can achieve up to 10 times speedup with nearly 106 times higher energy efficiency.
arXiv Detail & Related papers (2021-06-04T14:30:39Z)
- High-Performance FPGA-based Accelerator for Bayesian Neural Networks [5.86877988129171]
This work proposes a novel FPGA-based hardware architecture to accelerate BNNs inferred through Monte Carlo Dropout.
Compared with other state-of-the-art BNN accelerators, the proposed accelerator can achieve up to 4 times higher energy efficiency and 9 times better compute efficiency.
arXiv Detail & Related papers (2021-05-12T06:20:44Z)
- NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic [4.119948826527649]
Field-programmable gate array (FPGA)-based accelerators are gaining traction as a serious contender to replace graphics processing unit/central processing unit-based platforms.
This paper presents NullaNet Tiny, a framework for constructing resource and energy-efficient, ultra-low-latency FPGA-based neural network accelerators.
arXiv Detail & Related papers (2021-04-07T00:16:39Z)
- Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a deep neural network (DNN) to compute the solutions of the AC optimal power flow (AC-OPF) problem.
The proposed SIDNN is compatible with a broad range of OPF schemes.
It can be seamlessly integrated in other learning-to-OPF schemes.
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
- Fully-parallel Convolutional Neural Network Hardware [0.7829352305480285]
We propose a new power-and-area-efficient architecture for implementing Artificial Neural Networks (ANNs) in hardware.
For the first time, a fully-parallel CNN such as LeNet-5 is embedded and tested in a single FPGA.
arXiv Detail & Related papers (2020-06-22T17:19:09Z)