Training neural networks with end-to-end optical backpropagation
- URL: http://arxiv.org/abs/2308.05226v1
- Date: Wed, 9 Aug 2023 21:11:26 GMT
- Title: Training neural networks with end-to-end optical backpropagation
- Authors: James Spall, Xianxin Guo, A. I. Lvovsky
- Abstract summary: We show how to implement backpropagation, an algorithm for training a neural network, using optical processes.
Our approach is adaptable to various analog platforms, materials, and network structures.
It demonstrates the possibility of constructing neural networks entirely reliant on analog optical processes for both training and inference tasks.
- Score: 1.1602089225841632
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optics is an exciting route for the next generation of computing hardware for
machine learning, promising several orders of magnitude enhancement in both
computational speed and energy efficiency. However, to reach the full capacity
of an optical neural network, not only the inference but also the training must
be implemented optically. The primary
algorithm for training a neural network is backpropagation, in which the
calculation is performed in the order opposite to the information flow for
inference. While straightforward in a digital computer, optical implementation
of backpropagation has so far remained elusive, particularly because of the
conflicting requirements for the optical element that implements the nonlinear
activation function. In this work, we address this challenge for the first time
with a surprisingly simple and generic scheme. Saturable absorbers are employed
for the role of the activation units, and the required properties are achieved
through a pump-probe process, in which the forward propagating signal acts as
the pump and backward as the probe. Our approach is adaptable to various analog
platforms, materials, and network structures, and it demonstrates the
possibility of constructing neural networks entirely reliant on analog optical
processes for both training and inference tasks.
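To make the pump-probe idea concrete, here is a toy numerical sketch in Python; the transmission model and all constants are illustrative assumptions, not the device model from the paper. It compares the transmission seen by a weak backward probe with the true derivative of the forward activation, which is the quantity backpropagation needs.

```python
import numpy as np

def transmission(I, alpha0=1.0, I_sat=1.0):
    """Toy intensity-dependent transmission of a saturable absorber:
    absorption bleaches as the pump intensity I grows past I_sat."""
    return np.exp(-alpha0 / (1.0 + I / I_sat))

def activation(I):
    """Forward pass: the signal (pump) is attenuated nonlinearly."""
    return I * transmission(I)

# Backward pass: a weak probe counter-propagates through the absorber
# while the pump is on, so it sees the pump-set transmission. The scheme
# uses this probe transmission as a stand-in for the activation gradient.
I = np.linspace(0.05, 5.0, 400)
grad_true = np.gradient(activation(I), I)   # exact d(activation)/dI, numeric
grad_probe = transmission(I)                # what the probe would measure

print("worst-case mismatch:", np.max(np.abs(grad_probe - grad_true)))
```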
Related papers
- Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
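CSDP itself is not detailed in this summary; as rough orientation, the following is a minimal NumPy sketch of the forward-forward principle it builds on, with made-up layer sizes and stand-in data. A layer is trained by a purely local contrastive rule: raise the "goodness" (summed squared activity) on real inputs and lower it on corrupted ones, with no backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (784, 256))          # a single layer, for brevity

def goodness(h):
    """Forward-forward 'goodness' of a layer: summed squared activity."""
    return (h ** 2).sum(axis=1)

def ff_update(x, label, lr=0.01, theta=2.0):
    """One local contrastive step: label=1 for real (positive) inputs,
    label=0 for corrupted (negative) inputs. The rule only uses this
    layer's own activity -- no error is propagated from other layers."""
    global W
    h = np.maximum(x @ W, 0.0)                        # ReLU activity
    p = 1.0 / (1.0 + np.exp(theta - goodness(h)))     # P(real | x)
    # Gradient of the logistic log-likelihood w.r.t. W; the factor 2*h is
    # d(goodness)/d(pre-activation) and is already zero for inactive units.
    W += lr * x.T @ ((label - p)[:, None] * 2.0 * h) / len(x)

x_pos = rng.normal(size=(32, 784))            # stand-in "real" batch
x_neg = rng.permuted(x_pos, axis=1)           # stand-in corrupted batch
ff_update(x_pos, 1.0)
ff_update(x_neg, 0.0)
```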
arXiv Detail & Related papers (2024-09-17T04:48:45Z)
- Optical training of large-scale Transformers and deep neural networks with direct feedback alignment [48.90869997343841]
We experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform.
An optical processing unit performs the central operation of this algorithm, large-scale random matrix multiplication, at speeds of up to 1500 TeraOps.
We study the compute scaling of our hybrid optical approach, and demonstrate a potential advantage for ultra-deep and wide neural networks.
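As an illustration of direct feedback alignment itself (not of the photonic hardware), here is a minimal NumPy sketch on stand-in data; the layer sizes, learning rate, and data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP (784 -> 256 -> 10) trained with direct feedback alignment.
W1 = rng.normal(0.0, 0.05, (784, 256)); b1 = np.zeros(256)
W2 = rng.normal(0.0, 0.05, (256, 10));  b2 = np.zeros(10)
B1 = rng.normal(0.0, 0.05, (10, 256))   # fixed random feedback matrix

x = rng.normal(size=(32, 784))                    # stand-in input batch
y = np.eye(10)[rng.integers(0, 10, size=32)]      # stand-in one-hot labels

lr = 0.05
for step in range(100):
    a1 = x @ W1 + b1
    h1 = np.tanh(a1)                              # forward pass
    logits = h1 @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)             # softmax
    e = p - y                                     # output error

    # DFA: project the output error through the *fixed random* matrix B1
    # instead of W2.T -- this is the random multiplication the optical
    # processor accelerates, and it removes the weight-transport problem.
    d1 = (e @ B1) * (1.0 - h1 ** 2)               # tanh'(a1) = 1 - tanh^2

    W2 -= lr * h1.T @ e / len(x); b2 -= lr * e.mean(axis=0)
    W1 -= lr * x.T @ d1 / len(x); b1 -= lr * d1.mean(axis=0)
```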
arXiv Detail & Related papers (2024-09-01T12:48:47Z)
- Training Large-Scale Optical Neural Networks with Two-Pass Forward Propagation [0.0]
This paper addresses the limitations in Optical Neural Networks (ONNs) related to training efficiency, nonlinear function implementation, and large input data processing.
We introduce Two-Pass Forward Propagation, a novel training method that avoids specific nonlinear activation functions by modulating and re-entering error with random noise.
We propose a new way to implement convolutional neural networks using simple neural networks in integrated optical systems.
arXiv Detail & Related papers (2024-08-15T11:27:01Z)
- Genetically programmable optical random neural networks [0.0]
We demonstrate a genetically programmable yet simple optical neural network that achieves high performance with optical random projection.
By genetically programming the orientation of the scattering medium, which acts as a random projection kernel, our technique finds an optimal kernel and improves the initial test accuracies by 7-22%.
Our optical computing method presents a promising approach to achieve high performance in optical neural networks with a simple and scalable design.
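As a loose software analogy of the method, the sketch below replaces the scattering medium with seeded random matrices, so each "orientation" indexes a fixed random kernel that a small genetic search then optimizes; all names, shapes, and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 64))                  # stand-in input data
y = rng.integers(0, 2, size=200)                # stand-in binary labels

def kernel(theta):
    """Stand-in for the scattering medium: each integer 'orientation'
    theta selects a different fixed random projection kernel."""
    return np.random.default_rng(1000 + theta).normal(size=(64, 32))

def fitness(theta):
    feats = np.abs(x @ kernel(theta)) ** 2      # intensity-like features
    w, *_ = np.linalg.lstsq(feats, 2.0 * y - 1.0, rcond=None)
    return float(((feats @ w > 0) == y).mean()) # linear-readout accuracy

# Minimal genetic search over orientations: keep the best, mutate them.
population = list(rng.integers(0, 360, size=8))
for generation in range(10):
    population.sort(key=fitness, reverse=True)
    parents = population[:4]
    children = [(p + int(rng.integers(-15, 16))) % 360 for p in parents]
    population = parents + children
print("best orientation:", population[0], "fitness:", fitness(population[0]))
```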
arXiv Detail & Related papers (2024-03-19T06:55:59Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
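The paper's exact parametrization is not reproduced in this summary; the sketch below shows a common SIREN-style construction in which a single frequency factor omega0 plays the role of the bandwidth knob (the scaling convention is assumed here, not taken from the paper).

```python
import numpy as np

def make_sinusoidal_net(sizes, omega0, rng):
    """Build a SIREN-style network. omega0 multiplies every pre-activation,
    so it directly sets the frequency content (the kernel bandwidth)."""
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        bound = np.sqrt(6.0 / fan_in) / omega0   # keeps activations stable
        W = rng.uniform(-bound, bound, (fan_in, fan_out))
        params.append((W, np.zeros(fan_out)))
    return params

def forward(x, params, omega0):
    h = x
    for i, (W, b) in enumerate(params):
        pre = h @ W + b
        h = np.sin(omega0 * pre) if i < len(params) - 1 else pre  # linear head
    return h

rng = np.random.default_rng(2)
net = make_sinusoidal_net([2, 64, 64, 1], omega0=30.0, rng=rng)
y = forward(rng.uniform(-1, 1, (16, 2)), net, omega0=30.0)
```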
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Scale-, shift- and rotation-invariant diffractive optical networks [0.0]
Diffractive Deep Neural Networks (D2NNs) harness light-matter interaction over a series of trainable surfaces to compute a desired statistical inference task.
Here, we demonstrate a new training strategy for diffractive networks that introduces input object translation, rotation and/or scaling during the training phase.
This training strategy successfully guides the evolution of the diffractive optical network design towards a solution that is scale-, shift- and rotation-invariant.
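In software terms, this strategy amounts to on-the-fly geometric augmentation of every training image before it reaches the diffractive model; a minimal sketch using SciPy transforms (scaling omitted for brevity, perturbation ranges assumed).

```python
import numpy as np
from scipy.ndimage import rotate, shift

rng = np.random.default_rng(3)

def random_transform(img, max_deg=20.0, max_px=4.0):
    """Randomly rotate and translate one training image, so the diffractive
    network only ever sees perturbed inputs and must learn invariance."""
    img = rotate(img, rng.uniform(-max_deg, max_deg), reshape=False, order=1)
    return shift(img, rng.uniform(-max_px, max_px, size=2), order=1)

# Inside the training loop, re-perturb every example at every epoch:
batch = rng.random((8, 28, 28))                 # stand-in image batch
batch = np.stack([random_transform(im) for im in batch])
```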
arXiv Detail & Related papers (2020-10-24T02:18:39Z)
- Rapid characterisation of linear-optical networks via PhaseLift [51.03305009278831]
Integrated photonics offers excellent phase stability and can rely on the large-scale manufacturability provided by the semiconductor industry.
New devices, based on such optical circuits, hold the promise of faster and energy-efficient computations in machine learning applications.
We present a novel technique to reconstruct the transfer matrix of linear optical networks.
arXiv Detail & Related papers (2020-10-01T16:04:22Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
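A loose NumPy sketch of the two-phase equilibrium propagation recipe on a toy one-hidden-layer network; the relaxation dynamics are simplified and all sizes and constants are assumptions, not the paper's resistive-network physics.

```python
import numpy as np

rng = np.random.default_rng(4)
rho = lambda u: np.clip(u, 0.0, 1.0)            # hard-sigmoid activation

W1 = rng.normal(0.0, 0.1, (4, 8))               # input  -> hidden
W2 = rng.normal(0.0, 0.1, (8, 2))               # hidden -> output

def relax(x, target=None, beta=0.0, steps=200, dt=0.1):
    """Let the state settle toward a fixed point. With beta > 0 the output
    is weakly nudged toward the target (the 'nudged' phase of EP)."""
    h, y = np.zeros(8), np.zeros(2)
    for _ in range(steps):
        dh = -h + rho(x @ W1 + rho(y) @ W2.T)   # simplified relaxation
        dy = -y + rho(rho(h) @ W2)
        if target is not None:
            dy = dy + beta * (target - y)       # weak output clamping
        h, y = h + dt * dh, y + dt * dy
    return h, y

x, target = rng.normal(size=4), np.array([1.0, 0.0])
beta, lr = 0.5, 0.1
h0, y0 = relax(x)                               # free phase
hb, yb = relax(x, target, beta)                 # nudged phase
# Contrastive, purely local update: difference of correlations between
# the two equilibria, scaled by 1/beta (approximates the loss gradient).
W1 += lr / beta * (np.outer(x, rho(hb)) - np.outer(x, rho(h0)))
W2 += lr / beta * (np.outer(rho(hb), rho(yb)) - np.outer(rho(h0), rho(y0)))
```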
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
- Light-in-the-loop: using a photonics co-processor for scalable training of neural networks [21.153688679957337]
We present the first optical co-processor able to accelerate the training phase of digitally-implemented neural networks.
We demonstrate its use to train a neural network for handwritten digit recognition.
arXiv Detail & Related papers (2020-06-02T09:19:45Z)
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms mimicking neuron and synapse operational principles.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
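For context on what such hardware typically implements, here is a minimal leaky integrate-and-fire neuron, the canonical spiking building block, in NumPy; all constants are arbitrary.

```python
import numpy as np

def lif_neuron(input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Leaky integrate-and-fire neuron: the membrane voltage leaks toward
    rest, integrates the input current, and emits a spike at threshold."""
    v, spikes = 0.0, []
    for i_t in input_current:
        v += dt / tau * (i_t - v)          # leaky integration step
        if v >= v_thresh:                  # threshold crossing -> spike
            spikes.append(1)
            v = v_reset                    # hard reset after the spike
        else:
            spikes.append(0)
    return np.array(spikes)

spikes = lif_neuron(np.full(200, 1.5))     # constant drive -> regular spiking
print("firing rate:", spikes.mean())
```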
arXiv Detail & Related papers (2020-05-04T13:24:00Z)