Monolithic Silicon Photonic Architecture for Training Deep Neural
Networks with Direct Feedback Alignment
- URL: http://arxiv.org/abs/2111.06862v1
- Date: Fri, 12 Nov 2021 18:31:51 GMT
- Title: Monolithic Silicon Photonic Architecture for Training Deep Neural
Networks with Direct Feedback Alignment
- Authors: Matthew J. Filipovich, Zhimu Guo, Mohammed Al-Qadasi, Bicky A.
Marquez, Hugh D. Morison, Volker J. Sorger, Paul R. Prucnal, Sudip Shekhar,
and Bhavin J. Shastri
- Abstract summary: We propose on-chip training of neural networks enabled by a CMOS-compatible silicon photonic architecture.
Our scheme employs the direct feedback alignment training algorithm, which trains neural networks using error feedback rather than error backpropagation.
We experimentally demonstrate training a deep neural network with the MNIST dataset using on-chip MAC operation results.
- Score: 0.6501025489527172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The field of artificial intelligence (AI) has witnessed tremendous growth in
recent years; however, some of the most pressing challenges for the continued
development of AI systems are the fundamental bandwidth, energy efficiency, and
speed limitations faced by electronic computer architectures. There has been
growing interest in using photonic processors for performing neural network
inference operations; however, these networks are currently trained using
standard digital electronics. Here, we propose on-chip training of neural
networks enabled by a CMOS-compatible silicon photonic architecture to harness
the potential for massively parallel, efficient, and fast data operations. Our
scheme employs the direct feedback alignment training algorithm, which trains
neural networks using error feedback rather than error backpropagation, and can
operate at speeds of trillions of multiply-accumulate (MAC) operations per
second while consuming less than one picojoule per MAC operation. The photonic
architecture exploits parallelized matrix-vector multiplications using arrays
of microring resonators for processing multi-channel analog signals along
single waveguide buses to calculate the gradient vector of each neural network
layer in situ, which is the most computationally expensive operation performed
during the backward pass. We also experimentally demonstrate training a deep
neural network with the MNIST dataset using on-chip MAC operation results. Our
novel approach for efficient, ultra-fast neural network training showcases
photonics as a promising platform for executing AI applications.
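
The training rule named in the abstract can be illustrated in software. The following is a minimal NumPy sketch of direct feedback alignment in its usual formulation, not the authors' photonic implementation: the output error `e` is projected to each hidden layer through a fixed random matrix, so no transposed forward weights are needed in the backward pass. The network size, the toy three-class dataset (a stand-in for MNIST), the learning rate, and every name here (`make_toy_data`, `train_dfa`, `B1`, `B2`) are assumptions made for illustration; biases are omitted for brevity.

```python
import numpy as np


def make_toy_data(n_per_class=50, seed=0):
    """Three separated 2-D Gaussian blobs with one-hot labels (toy data, not MNIST)."""
    rng = np.random.default_rng(seed)
    centers = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0]])
    X = np.vstack([c + 0.5 * rng.standard_normal((n_per_class, 2)) for c in centers])
    labels = np.repeat(np.arange(3), n_per_class)
    return X, np.eye(3)[labels]


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)


def train_dfa(X, Y, hidden=32, epochs=200, lr=0.05, seed=1):
    """Full-batch DFA training of a two-hidden-layer tanh network; returns the loss per epoch."""
    rng = np.random.default_rng(seed)
    n, n_in = X.shape
    n_out = Y.shape[1]
    # Forward weights (trained). The output layer starts at zero.
    W1 = rng.standard_normal((n_in, hidden)) / np.sqrt(n_in)
    W2 = rng.standard_normal((hidden, hidden)) / np.sqrt(hidden)
    W3 = np.zeros((hidden, n_out))
    # Fixed random feedback matrices, one per hidden layer; these are never updated.
    B1 = rng.standard_normal((n_out, hidden)) / np.sqrt(n_out)
    B2 = rng.standard_normal((n_out, hidden)) / np.sqrt(n_out)

    losses = []
    for _ in range(epochs):
        # Forward pass.
        h1 = np.tanh(X @ W1)
        h2 = np.tanh(h1 @ W2)
        p = softmax(h2 @ W3)
        losses.append(float(-np.mean(np.sum(Y * np.log(p + 1e-12), axis=1))))

        # Output error for softmax with cross-entropy loss.
        e = p - Y
        # DFA gradient vectors: the error is sent directly to each hidden layer
        # through its fixed random matrix, then multiplied elementwise by tanh'.
        # This matrix-vector product is the kind of operation the abstract
        # describes computing with microring resonator arrays.
        d2 = (e @ B2) * (1.0 - h2 ** 2)
        d1 = (e @ B1) * (1.0 - h1 ** 2)

        # Weight updates: outer products of layer inputs and gradient vectors.
        W3 -= lr * (h2.T @ e) / n
        W2 -= lr * (h1.T @ d2) / n
        W1 -= lr * (X.T @ d1) / n
    return losses
```

Calling `train_dfa(*make_toy_data())` returns one loss value per epoch. The only difference from backpropagation in this sketch is in the two lines computing `d1` and `d2`, where backpropagation would propagate the error through the transposed forward weights layer by layer instead of using `B1` and `B2`.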
Related papers
- Optical training of large-scale Transformers and deep neural networks with direct feedback alignment [48.90869997343841]
We experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform.
An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOps.
We study the compute scaling of our hybrid optical approach, and demonstrate a potential advantage for ultra-deep and wide neural networks.
arXiv Detail & Related papers (2024-09-01T12:48:47Z)
- Neural Network Methods for Radiation Detectors and Imaging [1.6395318070400589]
Recent advances in machine learning and especially deep neural networks (DNNs) allow for new optimization and performance-enhancement schemes for radiation detectors and imaging hardware.
We give an overview of data generation at photon sources, deep learning-based methods for image processing tasks, and hardware solutions for deep learning acceleration.
arXiv Detail & Related papers (2023-11-09T20:21:51Z)
- Neuromorphic analog circuits for robust on-chip always-on learning in spiking neural networks [1.9809266426888898]
Mixed-signal neuromorphic systems represent a promising solution for solving extreme-edge computing tasks.
Their spiking neural network circuits are optimized for processing sensory data on-line in continuous-time.
We design on-chip learning circuits with short-term analog dynamics and long-term tristate discretization mechanisms.
arXiv Detail & Related papers (2023-07-12T11:14:25Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z)
- Scalable Nanophotonic-Electronic Spiking Neural Networks [3.9918594409417576]
Spiking neural networks (SNN) provide a new computational paradigm capable of highly parallelized, real-time processing.
Photonic devices are ideal for the design of high-bandwidth, parallel architectures matching the SNN computational paradigm.
Co-integrated CMOS and SiPh technologies are well-suited to the design of scalable SNN computing architectures.
arXiv Detail & Related papers (2022-08-28T06:10:06Z)
- Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit [38.898230519968116]
We propose an optoelectronic reconfigurable computing paradigm by constructing a diffractive processing unit.
It can efficiently support different neural networks and achieve a high model complexity with millions of neurons.
Our prototype system built with off-the-shelf optoelectronic components surpasses the performance of state-of-the-art graphics processing units.
arXiv Detail & Related papers (2020-08-26T16:34:58Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)