Monolithic Silicon Photonic Architecture for Training Deep Neural
Networks with Direct Feedback Alignment
- URL: http://arxiv.org/abs/2111.06862v1
- Date: Fri, 12 Nov 2021 18:31:51 GMT
- Title: Monolithic Silicon Photonic Architecture for Training Deep Neural
Networks with Direct Feedback Alignment
- Authors: Matthew J. Filipovich, Zhimu Guo, Mohammed Al-Qadasi, Bicky A.
Marquez, Hugh D. Morison, Volker J. Sorger, Paul R. Prucnal, Sudip Shekhar,
and Bhavin J. Shastri
- Abstract summary: We propose on-chip training of neural networks enabled by a CMOS-compatible silicon photonic architecture.
Our scheme employs the direct feedback alignment training algorithm, which trains neural networks using error feedback rather than error backpropagation.
We experimentally demonstrate training a deep neural network with the MNIST dataset using on-chip MAC operation results.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The field of artificial intelligence (AI) has witnessed tremendous
growth in recent years; however, some of the most pressing challenges for the
continued development of AI systems are the fundamental bandwidth, energy
efficiency, and speed limitations faced by electronic computer architectures.
There has been growing interest in using photonic processors to perform neural
network inference operations; however, these networks are currently trained
using standard digital electronics. Here, we propose on-chip training of neural
networks enabled by a CMOS-compatible silicon photonic architecture to harness
the potential for massively parallel, efficient, and fast data operations. Our
scheme employs the direct feedback alignment training algorithm, which trains
neural networks using error feedback rather than error backpropagation, and can
operate at speeds of trillions of multiply-accumulate (MAC) operations per
second while consuming less than one picojoule per MAC operation. The photonic
architecture exploits parallelized matrix-vector multiplications using arrays
of microring resonators for processing multi-channel analog signals along
single waveguide buses to calculate the gradient vector of each neural network
layer in situ, which is the most computationally expensive operation performed
during the backward pass. We also experimentally demonstrate training a deep
neural network with the MNIST dataset using on-chip MAC operation results. Our
novel approach for efficient, ultra-fast neural network training showcases
photonics as a promising platform for executing AI applications.
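The direct feedback alignment (DFA) algorithm the abstract describes can be sketched numerically: each hidden layer receives the output error through its own fixed random feedback matrix, rather than through the transposed forward weights that backpropagation requires. The following is a minimal NumPy illustration, not the paper's implementation; the layer sizes, the XOR toy task, and all hyperparameters are arbitrary assumptions, and the matrix-vector products below stand in for the microring-resonator MAC operations performed on chip.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task (assumed for this sketch): learn XOR with a 2-16-16-1 network.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

def dtanh(a):
    return 1.0 - np.tanh(a) ** 2

# Forward weights (trained) and fixed random feedback matrices (never trained).
W1 = rng.normal(0.0, 0.5, (2, 16))
W2 = rng.normal(0.0, 0.5, (16, 16))
W3 = rng.normal(0.0, 0.5, (16, 1))
B1 = rng.normal(0.0, 0.5, (1, 16))  # projects output error to layer 1
B2 = rng.normal(0.0, 0.5, (1, 16))  # projects output error to layer 2

lr, losses = 0.05, []
for _ in range(2000):
    # Forward pass (on chip, these matrix-vector products are analog MACs).
    a1 = X @ W1
    h1 = np.tanh(a1)
    a2 = h1 @ W2
    h2 = np.tanh(a2)
    y = h2 @ W3
    e = y - Y  # output error (gradient of the MSE loss w.r.t. y)
    losses.append(float(np.mean(e ** 2)))
    # DFA backward pass: the output error is randomly projected to each
    # hidden layer -- no transport of W2^T or W3^T as in backpropagation.
    d2 = (e @ B2) * dtanh(a2)
    d1 = (e @ B1) * dtanh(a1)
    W3 -= lr * h2.T @ e
    W2 -= lr * h1.T @ d2
    W1 -= lr * X.T @ d1

print(f"MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because the feedback matrices are fixed and layer-local, every layer's error signal can be computed in parallel from the output error alone, which is what makes the backward pass amenable to the parallel analog matrix-vector hardware the paper proposes.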
Related papers
- Neural Network Methods for Radiation Detectors and Imaging (arXiv, 2023-11-09)
  Recent advances in machine learning, especially deep neural networks (DNNs), allow for new optimization and performance-enhancement schemes for radiation detectors and imaging hardware. We give an overview of data generation at photon sources, deep learning-based methods for image processing tasks, and hardware solutions for deep learning acceleration.
- Neuromorphic analog circuits for robust on-chip always-on learning in spiking neural networks (arXiv, 2023-07-12)
  Mixed-signal neuromorphic systems represent a promising solution for extreme-edge computing tasks. Their spiking neural network circuits are optimized for processing sensory data online in continuous time. We design on-chip learning circuits with short-term analog dynamics and long-term tristate discretization mechanisms.
- Intelligence Processing Units Accelerate Neuromorphic Learning (arXiv, 2022-11-19)
  Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
- Training Spiking Neural Networks with Local Tandem Learning (arXiv, 2022-10-10)
  Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors. In this paper, we put forward a generalized learning rule termed Local Tandem Learning (LTL). We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while maintaining low computational complexity.
- Scalable Nanophotonic-Electronic Spiking Neural Networks (arXiv, 2022-08-28)
  Spiking neural networks (SNNs) provide a new computational paradigm capable of highly parallelized, real-time processing. Photonic devices are ideal for the design of high-bandwidth, parallel architectures matching the SNN computational paradigm. Co-integrated CMOS and SiPh technologies are well suited to the design of scalable SNN computing architectures.
- Experimentally realized in situ backpropagation for deep learning in nanophotonic neural networks (arXiv, 2022-05-17)
  We design mass-manufacturable silicon photonic neural networks that cascade our custom-designed "photonic mesh" accelerator. We demonstrate in situ backpropagation for the first time to solve classification tasks. Our findings suggest a new training paradigm for photonics-accelerated artificial intelligence based entirely on a physical analog of the popular backpropagation technique.
- Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit (arXiv, 2020-08-26)
  We propose an optoelectronic reconfigurable computing paradigm built around a diffractive processing unit. It can efficiently support different neural networks and achieve a high model complexity with millions of neurons. Our prototype system, built with off-the-shelf optoelectronic components, surpasses the performance of state-of-the-art graphics processing units.
- Training End-to-End Analog Neural Networks with Equilibrium Propagation (arXiv, 2020-06-02)
  We introduce a principled method to train end-to-end analog neural networks by gradient descent. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models. Our work can guide the development of a new generation of ultra-fast, compact, and low-power neural networks supporting on-chip learning.
- One-step regression and classification with crosspoint resistive memory arrays (arXiv, 2020-05-05)
  High-speed, low-energy computing machines are in demand to enable real-time artificial intelligence at the edge. One-step learning is supported by simulations of predicting the cost of a house in Boston and of training a 2-layer neural network for MNIST digit recognition. Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey (arXiv, 2020-05-04)
  Spiking neural networks are cognitive algorithms mimicking neuron and synapse operational principles. We present the state of the art of hardware implementations of spiking neural networks and discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks (arXiv, 2020-02-22)
  We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC. To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.