Gradients of unitary optical neural networks using parameter-shift rule
- URL: http://arxiv.org/abs/2506.11565v1
- Date: Fri, 13 Jun 2025 08:21:06 GMT
- Title: Gradients of unitary optical neural networks using parameter-shift rule
- Authors: Jinzhe Jiang, Yaqian Zhao, Xin Zhang, Chen Li, Yunlong Yu, Hailing Liu
- Abstract summary: This paper explores the application of the parameter-shift rule (PSR) for computing gradients in unitary optical neural networks (UONNs). We demonstrate how PSR, which calculates gradients by evaluating functions at shifted parameter values, can be effectively adapted for training UONNs constructed from Mach-Zehnder interferometer meshes. We present the theoretical framework and practical methodology for applying PSR to optimize phase parameters in optical neural networks, potentially advancing the development of efficient hardware-based training strategies for optical computing systems.
- Score: 14.364214412875494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the application of the parameter-shift rule (PSR) for computing gradients in unitary optical neural networks (UONNs). While backpropagation has been fundamental to training conventional neural networks, its implementation in optical neural networks faces significant challenges due to the physical constraints of optical systems. We demonstrate how PSR, which calculates gradients by evaluating functions at shifted parameter values, can be effectively adapted for training UONNs constructed from Mach-Zehnder interferometer meshes. The method leverages the inherent Fourier series nature of optical interference in these systems to compute exact analytical gradients directly from hardware measurements. This approach offers a promising alternative to traditional in silico training methods and circumvents the limitations of both finite difference approximations and all-optical backpropagation implementations. We present the theoretical framework and practical methodology for applying PSR to optimize phase parameters in optical neural networks, potentially advancing the development of efficient hardware-based training strategies for optical computing systems.
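The mechanism the abstract describes can be made concrete with a small numerical sketch. Below is a minimal NumPy example (illustrative only: the two-MZI mesh, the particular MZI matrix convention, and the port-0 power cost are assumptions, not the paper's setup). Because the measured power depends on each phase through a single-frequency Fourier series, two evaluations at shifted phase values give the exact gradient, which the script checks against finite differences.

```python
import numpy as np

def mzi(theta, phi):
    """One common Mach-Zehnder interferometer convention: internal
    phase theta, external phase shifter phi on the top arm."""
    return 1j * np.exp(1j * theta / 2) * np.array([
        [np.exp(1j * phi) * np.sin(theta / 2),  np.cos(theta / 2)],
        [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)],
    ])

def forward(params, x):
    """Measured power at output port 0 of two cascaded MZIs."""
    U = mzi(params[2], params[3]) @ mzi(params[0], params[1])
    return np.abs(U @ x)[0] ** 2

x = np.array([1.0, 0.0], dtype=complex)
params = np.array([0.7, 0.3, 1.1, -0.4])

def psr_grad(params, x, k, s=np.pi / 2):
    """Two-term parameter-shift rule: exact when the cost is a
    frequency-1 Fourier series in the k-th phase, as it is here."""
    p, m = params.copy(), params.copy()
    p[k] += s
    m[k] -= s
    return (forward(p, x) - forward(m, x)) / (2 * np.sin(s))

psr = np.array([psr_grad(params, x, k) for k in range(4)])

eps = 1e-6  # central finite differences, for comparison only
fd = np.array([(forward(params + eps * np.eye(4)[k], x)
                - forward(params - eps * np.eye(4)[k], x)) / (2 * eps)
               for k in range(4)])
print(np.max(np.abs(psr - fd)))  # agrees to ~1e-9
```

On hardware, each shifted evaluation is just another forward pass through the mesh, which is what makes this attractive as a hardware-based training strategy.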
Related papers
- A Neural Network Training Method Based on Neuron Connection Coefficient Adjustments [2.684545081600664]
We present a new training approach for a neural network based on symmetric differential equations. Unlike the previously introduced method, this approach does not require adjustments to the fixed points of the differential equations. To validate this approach, we tested it on the MNIST dataset and achieved promising results.
arXiv Detail & Related papers (2025-01-25T16:07:30Z)
- Randomized Forward Mode Gradient for Spiking Neural Networks in Scientific Machine Learning [4.178826560825283]
Spiking neural networks (SNNs) represent a promising approach in machine learning, combining the hierarchical learning capabilities of deep neural networks with the energy efficiency of spike-based computations.
Traditional end-to-end training of SNNs is often based on back-propagation, where weight updates are derived from gradients computed through the chain rule.
This method encounters challenges due to its limited biological plausibility and inefficiencies on neuromorphic hardware.
In this study, we introduce an alternative training approach for SNNs: instead of using back-propagation, we leverage weight perturbation methods within a forward-mode gradient framework, as sketched below.
arXiv Detail & Related papers (2024-11-11T15:20:54Z)
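A hedged sketch of the weight-perturbation idea named in the summary above (a toy least-squares model stands in for an SNN, and this generic estimator is not the paper's exact randomized forward-mode method): perturb all weights along a random direction, measure the directional derivative using forward passes only, and scale the direction by it.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w, X, y):
    """Toy least-squares loss; stands in for an SNN loss that would be
    evaluated by forward passes only (no chain rule needed)."""
    return 0.5 * np.mean((X @ w - y) ** 2)

def forward_mode_grad(w, X, y, eps=1e-5):
    """Randomized forward-mode / weight-perturbation gradient estimate:
    measure the directional derivative along a random direction v and
    return it scaled onto v. Unbiased for v ~ N(0, I)."""
    v = rng.standard_normal(w.shape)
    dirderiv = (loss(w + eps * v, X, y) - loss(w - eps * v, X, y)) / (2 * eps)
    return dirderiv * v

# Usage: plain SGD driven by the forward-only estimator.
X = rng.standard_normal((64, 5))
w_true = rng.standard_normal(5)
y = X @ w_true
w = np.zeros(5)
for _ in range(500):
    w -= 0.05 * forward_mode_grad(w, X, y)
print(np.linalg.norm(w - w_true))  # shrinks toward 0
```

For Gaussian directions the estimator is unbiased, E[(v^T grad f) v] = grad f, which is why plain SGD on it still converges in this toy.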
- Generalizable Non-Line-of-Sight Imaging with Learnable Physical Priors [52.195637608631955]
Non-line-of-sight (NLOS) imaging has attracted increasing attention due to its potential applications.
Existing NLOS reconstruction approaches are constrained by the reliance on empirical physical priors.
We introduce a novel learning-based solution, comprising two key designs: Learnable Path Compensation (LPC) and Adaptive Phasor Field (APF).
arXiv Detail & Related papers (2024-09-21T04:39:45Z)
- Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning; a generic forward-forward sketch follows below.
arXiv Detail & Related papers (2024-09-17T04:48:45Z)
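CSDP is described as forward-forward-based; as a rough, generic illustration (Hinton-style forward-forward with a goodness score, not the memristor-based CSDP rule itself, and all constants are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

def goodness(W, x):
    """Forward-forward 'goodness': sum of squared ReLU activations."""
    h = np.maximum(W @ x, 0.0)
    return np.sum(h ** 2), h

def local_update(W, x, positive, lr=0.01, thresh=2.0):
    """Backprop-free, layer-local step: raise goodness on positive
    (real) inputs, lower it on negative (contrastive) ones; the factor
    (1 - p) anneals the step once goodness clears the threshold."""
    g, h = goodness(W, x)
    sign = 1.0 if positive else -1.0
    p = 1.0 / (1.0 + np.exp(-sign * (g - thresh)))  # logistic fit term
    grad = 2.0 * np.outer(h, x)   # d(goodness)/dW, local to this layer
    return W + lr * sign * (1.0 - p) * grad

W = 0.1 * rng.standard_normal((8, 4))
W = local_update(W, rng.standard_normal(4), positive=True)   # push goodness up
W = local_update(W, rng.standard_normal(4), positive=False)  # push goodness down
```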
- Training Large-Scale Optical Neural Networks with Two-Pass Forward Propagation [0.0]
This paper addresses the limitations in Optical Neural Networks (ONNs) related to training efficiency, nonlinear function implementation, and large input data processing.
We introduce Two-Pass Forward Propagation, a novel training method that avoids specific nonlinear activation functions by modulating and re-entering error with random noise.
We propose a new way to implement convolutional neural networks using simple neural networks in integrated optical systems.
arXiv Detail & Related papers (2024-08-15T11:27:01Z)
- Advancing Spatio-Temporal Processing in Spiking Neural Networks through Adaptation [6.233189707488025]
Spiking neural networks on neuromorphic hardware promise orders of magnitude less power consumption than their non-spiking counterparts. The standard neuron model for spike-based computation on such systems has long been the leaky integrate-and-fire (LIF) neuron; the root of the benefits of its so-called adaptive LIF variants is not well understood (a minimal sketch of adaptive LIF dynamics follows below).
arXiv Detail & Related papers (2024-08-14T12:49:58Z)
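A minimal sketch of adaptive LIF dynamics as referenced above (discrete time, spike-triggered threshold adaptation; all constants are illustrative, not the paper's):

```python
import numpy as np

def adaptive_lif(I, alpha=0.9, beta=0.95, v_th=1.0, delta=0.2):
    """Discrete-time leaky integrate-and-fire neuron with a
    spike-triggered adaptive threshold (one common 'adaptive LIF'
    form)."""
    v, a, spikes = 0.0, 0.0, []
    for i in I:
        v = alpha * v + i            # leaky membrane integration
        s = float(v >= v_th + a)     # spike if v crosses adapted threshold
        v *= 1.0 - s                 # hard reset on spike
        a = beta * a + delta * s     # threshold rises after each spike
        spikes.append(s)
    return np.array(spikes)

# Constant drive: spikes become sparser as the threshold adapts.
print(adaptive_lif(np.full(40, 0.3)))
```

Under constant drive the threshold variable grows with each spike, so inter-spike intervals lengthen over time, the kind of temporal adaptation effect the paper examines.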
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural-network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks; a toy illustration follows below.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
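The gradient-scaling idea above, pictured in its simplest form (the 3x3 mask here is hand-made for illustration; the paper derives its scalings from the network's spatial structure rather than fixing them by hand):

```python
import numpy as np

def scale_conv_grad(grad, scales):
    """Reweight a conv kernel's gradient position-wise before the
    optimizer step; broadcasts over (out_ch, in_ch, K, K)."""
    return grad * scales[None, None, :, :]

grad = np.ones((8, 4, 3, 3))              # hypothetical conv-kernel gradient
scales = np.array([[0.5, 1.0, 0.5],
                   [1.0, 2.0, 1.0],
                   [0.5, 1.0, 0.5]])      # hand-made mask: emphasize center
print(scale_conv_grad(grad, scales)[0, 0])
```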
- Physics-aware Differentiable Discrete Codesign for Diffractive Optical Neural Networks [12.952987240366781]
This work proposes a novel device-to-system hardware-software codesign framework, which enables efficient training of diffractive optical neural networks (DONNs).
Gumbel-Softmax is employed to enable differentiable discrete mapping from real-world device parameters into the forward function of DONNs, as sketched below.
The results have demonstrated that our proposed framework offers significant advantages over conventional quantization-based methods.
arXiv Detail & Related papers (2022-09-28T17:13:28Z)
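The Gumbel-Softmax trick named in the summary above, in minimal form: a temperature-controlled, differentiable relaxation of picking one of several discrete device values (the four-level phase set below is a hypothetical stand-in for real device parameters):

```python
import numpy as np

rng = np.random.default_rng(2)

def gumbel_softmax(logits, tau=0.5):
    """Differentiable relaxation of a one-hot categorical sample
    (Jang et al. 2017 / Maddison et al. 2017)."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel noise
    y = (logits + g) / tau
    e = np.exp(y - y.max())
    return e / e.sum()

# Hypothetical device: a phase restricted to four discrete levels.
levels = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
logits = np.array([0.1, 2.0, 0.1, 0.1])  # trainable selection scores
soft_choice = gumbel_softmax(logits)      # relaxed one-hot vector
phase = soft_choice @ levels              # differentiable "discrete" phase
print(phase)  # usually near pi/2; gradients would flow to the logits
```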
- Rapid characterisation of linear-optical networks via PhaseLift [51.03305009278831]
Integrated photonics offers great phase-stability and can rely on the large-scale manufacturability provided by the semiconductor industry.
New devices, based on such optical circuits, hold the promise of faster and more energy-efficient computations in machine learning applications.
We present a novel technique to reconstruct the transfer matrix of linear optical networks; the underlying convex program is sketched below.
arXiv Detail & Related papers (2020-10-01T16:04:22Z)
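For reference, the standard PhaseLift relaxation underlying this kind of intensity-only reconstruction (the paper's adaptation to full transfer matrices may differ in detail): with the lifted variable X = u u^*, the quadratic measurements become linear and the problem convex,

```latex
% Intensity measurements are linear in the lifted variable X = u u^*:
%   b_i = |a_i^* u|^2 = a_i^* (u u^*) a_i = \operatorname{tr}(a_i a_i^* X).
\min_{X \succeq 0} \; \operatorname{tr}(X)
\quad \text{s.t.} \quad \operatorname{tr}(a_i a_i^* X) = b_i,
\qquad i = 1, \dots, m.
```

and a rank-one minimizer X = u u^* recovers the column u of the transfer matrix up to a global phase.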