Single-Shot Optical Neural Network
- URL: http://arxiv.org/abs/2205.09103v1
- Date: Wed, 18 May 2022 17:49:49 GMT
- Title: Single-Shot Optical Neural Network
- Authors: Liane Bernstein, Alexander Sludds, Christopher Panuski, Sivan
Trajtenberg-Mills, Ryan Hamerly, Dirk Englund
- Abstract summary: 'Weight-stationary' analog optical and electronic hardware has been proposed to reduce the compute resources required by deep neural networks.
We present a scalable, single-shot-per-layer weight-stationary optical processor.
- Score: 55.41644538483948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As deep neural networks (DNNs) grow to solve increasingly complex problems,
they are becoming limited by the latency and power consumption of existing
digital processors. 'Weight-stationary' analog optical and electronic hardware
has been proposed to reduce the compute resources required by DNNs by
eliminating expensive weight updates; however, scalability has so far been
limited to an input vector length $K$ of hundreds of elements. Here, we present a scalable,
single-shot-per-layer weight-stationary optical processor that leverages the
advantages of free-space optics for passive optical copying and large-scale
distribution of an input vector and integrated optoelectronics for static,
reconfigurable weighting and the nonlinearity. We propose an optimized
near-term CMOS-compatible system with $K = 1,000$ and beyond, and we calculate
its theoretical total latency ($\sim$10 ns), energy consumption ($\sim$10
fJ/MAC) and throughput ($\sim$petaMAC/s) per layer. We also experimentally test
DNN classification accuracy with single-shot analog optical encoding, copying
and weighting of the MNIST handwritten digit dataset in a proof-of-concept
system, achieving 94.7% (similar to the ground truth accuracy of 96.3%) without
retraining on the hardware or data preprocessing. Lastly, we determine the
upper bound on throughput of our system ($\sim$0.9 exaMAC/s), set by the
maximum optical bandwidth before significant loss of accuracy. This joint use
of wide spectral and spatial bandwidths enables highly efficient computing for
next-generation DNNs.
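At its core, the system above performs a one-pass analog matrix-vector multiply. As a rough illustration only, the NumPy sketch below models that data flow under our own simplifying assumptions (equal input/output dimensions, intensity-encoded non-negative signals, Gaussian read noise, and a ~1 GHz input update rate); it is not the authors' implementation.

```python
# Toy model of a single-shot, weight-stationary optical layer
# (a sketch under the assumptions stated above, not the authors' code).
import numpy as np

rng = np.random.default_rng(0)
K = 1000                    # input vector length (the paper's near-term target)
N = 1000                    # output neurons; N = K is an assumption here

x = rng.random(K)           # non-negative optical input intensities
W = rng.random((N, K))      # static, reconfigurable transmissive weights in [0, 1]

fan_out = np.broadcast_to(x, (N, K))        # passive free-space copying of x to N rows
photocurrent = (W * fan_out).sum(axis=1)    # each row's detector sums its weighted copy
y = photocurrent + rng.normal(0.0, 0.01 * photocurrent.max(), N)  # assumed read noise

# Order-of-magnitude check against the abstract's throughput figure:
macs_per_shot = N * K       # 10^6 MACs computed in one optical pass
update_rate = 1e9           # assumed ~1 GHz input modulation rate
print(f"{macs_per_shot * update_rate:.0e} MAC/s")   # 1e+15, i.e. ~petaMAC/s per layer
```

In this picture the weights stay fixed while only the cheap input encoding changes each cycle, which is what makes the weight-stationary scheme attractive for inference.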
Related papers
- An Event-Based Digital Compute-In-Memory Accelerator with Flexible Operand Resolution and Layer-Wise Weight/Output Stationarity [0.11522790873450185]
CIM accelerators for spiking neural networks (SNNs) are promising solutions to enable $\mu$s-level inference latency and ultra-low energy in edge vision applications.
We propose a novel digital CIM macro that supports arbitrary operand resolution and shape, with a unified CIM storage for weights and membrane potentials.
Our approach can save up to 90% energy in large-scale systems, while reaching a state-of-the-art classification accuracy of 95.8% on the IBM DVS gesture dataset.
arXiv Detail & Related papers (2024-10-30T14:55:13Z)
- Digital-analog hybrid matrix multiplication processor for optical neural networks [11.171425574890765]
We propose a digital-analog hybrid optical computing architecture for optical neural networks (ONNs).
By introducing logic levels and threshold-based decisions, the calculation precision can be significantly enhanced.
We have demonstrated an unprecedented 16-bit calculation precision for high-definition image processing, with a pixel error rate (PER) as low as $1.8\times10^{-3}$ at a signal-to-noise ratio (SNR) of 18.2 dB (a toy model of this thresholding idea is sketched after the list below).
arXiv Detail & Related papers (2024-01-26T18:42:57Z)
- Real-Time FJ/MAC PDE Solvers via Tensorized, Back-Propagation-Free Optical PINN Training [5.809283001227614]
This paper develops an on-chip training framework for physics-informed neural networks (PINNs).
It aims to solve high-dimensional PDEs with fJ/MAC photonic power consumption and ultra-low latency.
This is the first real-size optical PINN training framework that can be applied to solve high-dimensional PDEs.
arXiv Detail & Related papers (2023-12-31T07:10:15Z)
- Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization.
Our framework achieves 39% average energy reduction across the datasets with only 1.7% average accuracy loss, and significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-23T18:50:13Z)
- Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability.
One promising solution is to revisit analogue computing, a technique that predates digital computing.
Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z)
- Energy-Efficient On-Board Radio Resource Management for Satellite Communications via Neuromorphic Computing [59.40731173370976]
We investigate the application of energy-efficient brain-inspired machine learning models for on-board radio resource management.
For relevant workloads, spiking neural networks (SNNs) implemented on Loihi 2 yield higher accuracy, while reducing power consumption by more than 100$\times$ as compared to the CNN-based reference platform.
arXiv Detail & Related papers (2023-08-22T03:13:57Z)
- RF-Photonic Deep Learning Processor with Shannon-Limited Data Movement [0.0]
Optical neural networks (ONNs) are promising accelerators with ultra-low latency and energy consumption.
We introduce our multiplicative analog frequency transform ONN (MAFT-ONN) that encodes the data in the frequency domain.
We experimentally demonstrate the first hardware accelerator that computes fully-analog deep learning on raw RF signals.
arXiv Detail & Related papers (2022-07-08T16:37:13Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete variables (SAC-d), which generates the exit point, partition point, and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam (a toy sketch of the log-domain multiplicative update appears after this list).
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
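Two of the related papers above describe mechanisms concrete enough to sketch. First, for the digital-analog hybrid processor, the toy model below shows one plausible reading of "logic levels and decisions based on thresholding" (our reconstruction, not the paper's design): wide operands are split into low-bit slices, each slice product is computed by a noisy analog multiply, and rounding to the nearest logic level restores exact digital values before shift-and-add recombination.

```python
# Slice-and-threshold toy model of digital-analog hybrid multiplication
# (a reconstruction under assumed noise, not the published system).
import numpy as np

rng = np.random.default_rng(1)
BITS, SLICE = 16, 4                 # 16-bit operands, 4-bit analog slices

def analog_multiply(a, b, noise_std=0.1):
    """Noisy 'analog' product of two small integers (noise level is assumed)."""
    return a * b + rng.normal(0.0, noise_std)

def hybrid_multiply(x, y):
    """Exact wide multiply built from thresholded low-bit analog slice products."""
    slices_x = [(x >> s) & 0xF for s in range(0, BITS, SLICE)]
    slices_y = [(y >> s) & 0xF for s in range(0, BITS, SLICE)]
    total = 0
    for i, sx in enumerate(slices_x):
        for j, sy in enumerate(slices_y):
            digit = int(round(analog_multiply(sx, sy)))  # snap to nearest logic level
            total += digit << (SLICE * (i + j))
    return total

x, y = 51234, 40321
assert hybrid_multiply(x, y) == x * y   # exact despite analog noise in every slice
```

The thresholding step succeeds whenever the analog noise stays below half a logic level, which is what lets a modest-SNR analog core deliver 16-bit end-to-end precision.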
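Second, for LNS-Madam, the sketch below illustrates (again as an assumption-laden toy, not the paper's algorithm) why a multiplicative weight update fits a logarithmic number system: multiplying a weight's magnitude is just adding to its stored log-magnitude, so each training step is a single quantized log-domain addition.

```python
# Toy LNS + multiplicative-update training step (illustrative only).
import numpy as np

LOG_STEP = 2.0 ** -3          # assumed log-domain quantization step

def to_lns(w):
    """Store a weight as (sign, quantized log2 magnitude)."""
    return np.sign(w), np.round(np.log2(abs(w)) / LOG_STEP) * LOG_STEP

def from_lns(sign, log_mag):
    return sign * 2.0 ** log_mag

def multiplicative_step(sign, log_mag, grad):
    """Madam-style update: move log|w| one quantum against the gradient.
    Note multiplicative updates never flip a weight's sign."""
    direction = np.sign(grad) * sign        # descend in log-magnitude space
    return sign, log_mag - LOG_STEP * direction

# Scalar toy problem: minimize 0.5 * (w * x - y)^2, optimum at w = y / x = 1.5.
x_in, target = 2.0, 3.0
sign, log_mag = to_lns(0.5)
for _ in range(100):
    w = from_lns(sign, log_mag)
    grad = (w * x_in - target) * x_in
    sign, log_mag = multiplicative_step(sign, log_mag, grad)
print(from_lns(sign, log_mag))   # settles within one LNS quantum of 1.5
```

Because the update is a fixed-size hop in the log domain, the trained weight oscillates within one quantization step of the optimum rather than converging exactly, consistent with the summary's claim that low-precision representations can suffice for training.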