Experimentally realized in situ backpropagation for deep learning in
nanophotonic neural networks
- URL: http://arxiv.org/abs/2205.08501v1
- Date: Tue, 17 May 2022 17:13:50 GMT
- Title: Experimentally realized in situ backpropagation for deep learning in
nanophotonic neural networks
- Authors: Sunil Pai, Zhanghao Sun, Tyler W. Hughes, Taewon Park, Ben Bartlett,
Ian A. D. Williamson, Momchil Minkov, Maziyar Milanizadeh, Nathnael Abebe,
Francesco Morichetti, Andrea Melloni, Shanhui Fan, Olav Solgaard, David A.B.
Miller
- Abstract summary: We design mass-manufacturable silicon photonic neural networks that cascade our custom-designed "photonic mesh" accelerator.
We demonstrate in situ backpropagation for the first time to solve classification tasks.
Our findings suggest a new training paradigm for photonics-accelerated artificial intelligence based entirely on a physical analog of the popular backpropagation technique.
- Score: 0.7627023515997987
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are widely deployed models across many scientific disciplines
and commercial endeavors ranging from edge computing and sensing to large-scale
signal processing in data centers. The most efficient and well-entrenched
method to train such networks is backpropagation, or reverse-mode automatic
differentiation. To counter an exponentially increasing energy budget in the
artificial intelligence sector, there has been recent interest in analog
implementations of neural networks, specifically nanophotonic neural networks
for which no analog backpropagation demonstration exists. We design
mass-manufacturable silicon photonic neural networks that alternately cascade
our custom-designed "photonic mesh" accelerator with digitally implemented
nonlinearities. These reconfigurable photonic meshes program computationally
intensive arbitrary matrix multiplication by setting physical voltages that
tune the interference of optically encoded input data propagating through
integrated Mach-Zehnder interferometer networks. Here, using our packaged
photonic chip, we demonstrate in situ backpropagation for the first time to
solve classification tasks and evaluate a new protocol to keep the entire
gradient measurement and update of physical device voltages in the analog
domain, improving on past theoretical proposals. Our method is made possible by
introducing three changes to typical photonic meshes: (1) measurements at
optical "grating tap" monitors, (2) bidirectional optical signal propagation
automated by a fiber switch, and (3) universal generation and readout of optical
amplitude and phase. After training, our classification achieves accuracies
similar to digital equivalents even in the presence of systematic error. Our
findings suggest a new training paradigm for photonics-accelerated artificial
intelligence based entirely on a physical analog of the popular backpropagation
technique.
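To make the training rule concrete, the following is a minimal numerical sketch (not the authors' code; the toy mesh layout and loss are illustrative assumptions) of the adjoint-field gradient that in situ backpropagation measures: the derivative of the loss with respect to each phase-shifter setting is an interference term between the forward-propagating field and the back-propagated error field at that phase shifter. In the hardware, this term is obtained from the grating-tap monitors using the bidirectional propagation and amplitude/phase readout listed above; here it is emulated with matrices and checked against finite differences.

```python
# Minimal numerical sketch (not the authors' code) of the adjoint-field gradient
# rule behind in situ backpropagation.  The toy mesh layout, mode choices, and
# loss below are illustrative assumptions for a small synthetic example.
import numpy as np

rng = np.random.default_rng(0)
n_modes, n_layers = 4, 6

def coupler(n, m):
    """Ideal 50:50 directional coupler mixing modes m and m+1."""
    C = np.eye(n, dtype=complex)
    C[m:m + 2, m:m + 2] = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
    return C

# Toy "mesh": each layer is one tunable phase shifter followed by a fixed
# coupler -- a stand-in for a programmable Mach-Zehnder interferometer mesh.
modes = rng.integers(0, n_modes - 1, size=n_layers)
thetas = rng.uniform(0, 2 * np.pi, size=n_layers)
couplers = [coupler(n_modes, m) for m in modes]

def transfer(thetas):
    """Transfer matrix U(theta) programmed by the phase-shifter settings."""
    U = np.eye(n_modes, dtype=complex)
    for C, m, th in zip(couplers, modes, thetas):
        P = np.eye(n_modes, dtype=complex)
        P[m, m] = np.exp(1j * th)
        U = C @ P @ U
    return U

x = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)  # input field
t = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)  # target field

def loss(thetas):
    return float(np.sum(np.abs(transfer(thetas) @ x - t) ** 2))

def in_situ_grad(thetas):
    """Gradient of the loss w.r.t. each phase: the interference of the forward
    field with the back-propagated error field at that phase shifter.  In the
    hardware this interference term is read out at grating-tap monitors; here
    it is emulated with plain linear algebra."""
    U = transfer(thetas)
    err = U.conj().T @ (U @ x - t)       # output error back-propagated to the input plane
    grads = np.zeros(n_layers)
    A = np.eye(n_modes, dtype=complex)   # mesh up to (not including) layer k
    for k, m in enumerate(modes):
        fwd = (A @ x)[m]                 # forward field entering phase shifter k
        adj = (A @ err)[m]               # error field re-propagated to the same point
        grads[k] = -2.0 * np.imag(np.conj(adj) * fwd)
        P = np.eye(n_modes, dtype=complex)
        P[m, m] = np.exp(1j * thetas[k])
        A = couplers[k] @ P @ A          # advance past layer k
    return grads

def fd_grad(thetas, eps=1e-6):
    """Finite-difference reference gradient."""
    g = np.zeros(n_layers)
    for k in range(n_layers):
        d = np.zeros(n_layers)
        d[k] = eps
        g[k] = (loss(thetas + d) - loss(thetas - d)) / (2 * eps)
    return g

print(np.allclose(in_situ_grad(thetas), fd_grad(thetas), atol=1e-6))  # expect True
```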
Related papers
- Optical training of large-scale Transformers and deep neural networks with direct feedback alignment [48.90869997343841]
We experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform.
An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOps.
We study the compute scaling of our hybrid optical approach and demonstrate a potential advantage for ultra-deep and wide neural networks (a toy sketch of the DFA update rule follows this list).
arXiv Detail & Related papers (2024-09-01T12:48:47Z)
- Integration of Programmable Diffraction with Digital Neural Networks [0.0]
Recent advances in deep learning and digital neural networks have led to efforts to establish diffractive processors that are jointly optimized with digital neural networks serving as their back-end.
This article highlights the utility of this exciting collaboration between engineered and programmed diffraction and digital neural networks for a diverse range of applications.
arXiv Detail & Related papers (2024-06-15T16:49:53Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Spatially Varying Nanophotonic Neural Networks [39.1303097259564]
Photonic processors that execute operations using photons instead of electrons promise to enable optical neural networks with ultra-low latency and power consumption.
Existing optical neural networks, limited by the underlying network designs, have achieved image recognition accuracy far below that of state-of-the-art electronic neural networks.
arXiv Detail & Related papers (2023-08-07T08:48:46Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Scale-, shift- and rotation-invariant diffractive optical networks [0.0]
Diffractive Deep Neural Networks (D2NNs) harness light-matter interaction over a series of trainable surfaces to compute a desired statistical inference task.
Here, we demonstrate a new training strategy for diffractive networks that introduces input object translation, rotation and/or scaling during the training phase.
This training strategy successfully guides the evolution of the diffractive optical network design towards a solution that is scale-, shift- and rotation-invariant.
arXiv Detail & Related papers (2020-10-24T02:18:39Z)
- Rapid characterisation of linear-optical networks via PhaseLift [51.03305009278831]
Integrated photonics offers great phase-stability and can rely on the large scale manufacturability provided by the semiconductor industry.
New devices, based on such optical circuits, hold the promise of faster and energy-efficient computations in machine learning applications.
We present a novel technique to reconstruct the transfer matrix of linear optical networks.
arXiv Detail & Related papers (2020-10-01T16:04:22Z)
- Deep neural networks for the evaluation and design of photonic devices [0.0]
This review discusses how deep neural networks can learn from training sets and operate as high-speed surrogate electromagnetic solvers.
Fundamental data science concepts framed within the context of photonics are also discussed.
arXiv Detail & Related papers (2020-06-30T19:52:54Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
- Light-in-the-loop: using a photonics co-processor for scalable training of neural networks [21.153688679957337]
We present the first optical co-processor able to accelerate the training phase of digitally-implemented neural networks.
We demonstrate its use to train a neural network for handwritten digits recognition.
arXiv Detail & Related papers (2020-06-02T09:19:45Z)
- Compressive sensing with un-trained neural networks: Gradient descent finds the smoothest approximation [60.80172153614544]
Un-trained convolutional neural networks have emerged as highly successful tools for image recovery and restoration.
We show that an un-trained convolutional neural network can approximately reconstruct signals and images that are sufficiently structured, from a near minimal number of random measurements.
arXiv Detail & Related papers (2020-05-07T15:57:25Z)
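As referenced in the first related paper above, here is a toy sketch of the direct feedback alignment (DFA) update. It is an illustrative assumption rather than that paper's implementation; it shows why fixed random matrix multiplications are the algorithm's central operation and hence a natural target for optical acceleration.

```python
# Toy NumPy sketch of direct feedback alignment (DFA) -- an illustrative
# assumption, not the code or architecture of the related paper.  Each hidden
# layer receives the output error projected through a fixed random matrix,
# which is why large random matrix multiplications dominate the computation.
import numpy as np

rng = np.random.default_rng(0)
dims = [32, 64, 64, 10]                       # input, two hidden, output sizes
Ws = [rng.normal(scale=0.1, size=(o, i)) for i, o in zip(dims[:-1], dims[1:])]
Bs = [rng.normal(scale=0.1, size=(o, dims[-1])) for o in dims[1:-1]]  # fixed random feedback

def tanh_prime(a):
    return 1.0 - np.tanh(a) ** 2

def dfa_step(x, y, lr=0.01):
    """One DFA update on a single (x, y) pair; returns the squared-error loss."""
    acts, pre = [x], []
    for W in Ws[:-1]:                         # forward pass through hidden layers
        pre.append(W @ acts[-1])
        acts.append(np.tanh(pre[-1]))
    out = Ws[-1] @ acts[-1]                   # linear output layer
    err = out - y                             # output error signal

    # DFA: broadcast the same output error to every hidden layer through its
    # fixed random matrix B_l, instead of back-propagating through W^T.
    for l, (B, a) in enumerate(zip(Bs, pre)):
        delta = (B @ err) * tanh_prime(a)
        Ws[l] -= lr * np.outer(delta, acts[l])
    Ws[-1] -= lr * np.outer(err, acts[-1])    # output layer uses the true error
    return float(err @ err)

x = rng.normal(size=dims[0])
y = np.eye(dims[-1])[3]                       # one-hot target
print([round(dfa_step(x, y), 4) for _ in range(5)])   # loss should trend downward
```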
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.