Optical Neural Ordinary Differential Equations
- URL: http://arxiv.org/abs/2209.12898v1
- Date: Mon, 26 Sep 2022 04:04:02 GMT
- Title: Optical Neural Ordinary Differential Equations
- Authors: Yun Zhao, Hang Chen, Min Lin, Haiou Zhang, Tao Yan, Xing Lin, Ruqi
Huang and Qionghai Dai
- Abstract summary: We propose the optical neural ordinary differential equations (ON-ODE) architecture that parameterizes the continuous dynamics of hidden layers with optical ODE solvers.
The ON-ODE comprises the PNNs followed by the photonic integrator and optical feedback loop, which can be configured to represent residual neural networks (ResNet) and recurrent neural networks with effectively reduced chip area occupancy.
- Score: 44.97261923694945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Increasing the number of layers in on-chip photonic neural networks
(PNNs) is essential to improving their model performance. However, successively
cascading network hidden layers results in larger integrated photonic chip areas. To
address this issue, we propose the optical neural ordinary differential
equations (ON-ODE) architecture that parameterizes the continuous dynamics of
hidden layers with optical ODE solvers. The ON-ODE comprises the PNNs followed
by the photonic integrator and optical feedback loop, which can be configured
to represent residual neural networks (ResNet) and recurrent neural networks
with effectively reduced chip area occupancy. For the interference-based
optoelectronic nonlinear hidden layer, the numerical experiments demonstrate
that the single hidden layer ON-ODE can achieve approximately the same accuracy
as the two-layer optical ResNet in image classification tasks. In addition, the
ON-ODE improves the model classification accuracy for the diffraction-based
all-optical linear hidden layer. The time-dependent dynamics property of ON-ODE
is further applied for trajectory prediction with high accuracy.
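To make the area-saving idea concrete, here is a minimal electronic sketch (not the paper's implementation): a single shared hidden layer is reused by a fixed-step Euler ODE solver, playing the role of the photonic integrator and optical feedback loop, and is compared against an explicit two-layer residual stack. All names and sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
W = rng.normal(scale=0.3, size=(DIM, DIM))   # weights of the one shared layer
b = np.zeros(DIM)

def f(h):
    """The single hidden layer: defines the dynamics dh/dt = f(h)."""
    return np.tanh(W @ h + b)

def on_ode_block(h0, t1=1.0, steps=16):
    """Integrate dh/dt = f(h) from t=0 to t=t1 with forward Euler.
    Reusing f at every step is the analogue of the optical feedback loop:
    one physical layer emulates a deep residual stack."""
    h, dt = h0.copy(), t1 / steps
    for _ in range(steps):
        h = h + dt * f(h)                    # each Euler step == a tiny residual update
    return h

def two_layer_resnet(h0):
    """Explicit two-layer residual network (two physical layers) for comparison."""
    h = h0 + f(h0)
    return h + f(h)

x = rng.normal(size=DIM)
print("ON-ODE output:  ", on_ode_block(x)[:3])
print("2-layer ResNet: ", two_layer_resnet(x)[:3])
```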
Related papers
- 1-bit Quantized On-chip Hybrid Diffraction Neural Network Enabled by Authentic All-optical Fully-connected Architecture [4.594367761345624]
This study introduces the Hybrid Diffraction Neural Network (HDNN), a novel architecture that incorporates matrix multiplication into DNNs.
Utilizing a single phase modulation layer and an amplitude modulation layer, the trained neural network demonstrated accuracies of 96.39% and 89% in digit recognition tasks.
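As a hedged numerical sketch of the phase-plus-amplitude modulation pipeline described above (the free-space diffraction step is approximated here by a plain 2-D FFT, and all sizes and masks are illustrative rather than the paper's):

```python
import numpy as np

N = 32
rng = np.random.default_rng(1)
phase_mask = rng.uniform(0, 2 * np.pi, size=(N, N))  # trainable phase modulation layer
amp_mask = rng.uniform(0.5, 1.0, size=(N, N))        # trainable amplitude modulation layer

def hdnn_stage(field):
    """Phase modulation -> propagation (FFT surrogate) -> amplitude modulation."""
    field = field * np.exp(1j * phase_mask)          # phase modulation
    field = np.fft.fft2(field) / N                   # crude stand-in for diffraction
    field = field * amp_mask                         # amplitude modulation
    return np.abs(field) ** 2                        # intensity read-out at the detector

image = rng.random((N, N))                           # toy input intensity
output = hdnn_stage(np.sqrt(image))                  # field amplitude = sqrt(intensity)
print(output.shape)
```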
arXiv Detail & Related papers (2024-04-11T02:54:17Z)
- Systematic construction of continuous-time neural networks for linear dynamical systems [0.0]
We discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems.
We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as a solution of a first-order or second-order ordinary differential equation (ODE).
Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute a sparse architecture and network parameters directly from the given LTI system.
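A hedged sketch of the correspondence this construction relies on (the neuron model below is illustrative, not quoted from the paper): an LTI system and a first-order continuous-time neuron obey ODEs of the same shape, so the system matrices can be mapped onto network weights without gradient-based training.

```latex
\begin{align*}
  \text{LTI system:}\quad
    & \dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t),\\[2pt]
  \text{first-order neuron:}\quad
    & \tau_i\,\dot{h}_i(t) = -h_i(t) + \sum_j w_{ij}\,h_j(t) + v_i\,u(t).
\end{align*}
```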
arXiv Detail & Related papers (2024-03-24T16:16:41Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose employing the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
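A minimal sketch of one implicit update, assuming a quadratic toy loss in place of a real PINN residual (nothing here is taken from the paper's implementation): the implicit equation theta_next = theta - lr * grad(theta_next) is linear for this loss and can be solved exactly.

```python
import numpy as np

A = np.diag([1.0, 50.0])   # stiff quadratic loss 0.5 * theta^T A theta,
                           # mimicking multi-scale behaviour in PINN training

def isgd_step(theta, lr=0.5):
    """Implicit step: solve theta_next = theta - lr * (A @ theta_next).
    Here the equation is linear and solved exactly; a real PINN loss would
    need an inner solver such as a few Newton iterations."""
    return np.linalg.solve(np.eye(len(theta)) + lr * A, theta)

theta = np.array([1.0, 1.0])
for _ in range(20):
    theta = isgd_step(theta)   # stable at lr = 0.5, where explicit GD on the
                               # stiff direction diverges (|1 - 0.5 * 50| > 1)
print("ISGD result:", theta)
```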
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Physics-aware Differentiable Discrete Codesign for Diffractive Optical Neural Networks [12.952987240366781]
This work proposes a novel device-to-system hardware-software codesign framework, which enables efficient training of diffractive optical neural networks (DONNs).
Gumbel-Softmax is employed to enable differentiable discrete mapping from real-world device parameters into the forward function of DONNs.
The results have demonstrated that our proposed framework offers significant advantages over conventional quantization-based methods.
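Since the summary names Gumbel-Softmax explicitly, a self-contained sketch of that step may help; the discrete device levels and temperature below are illustrative, not the paper's values.

```python
import numpy as np

rng = np.random.default_rng(2)
levels = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])  # hypothetical quantized phase states
logits = rng.normal(size=levels.shape)                     # trainable selection scores

def gumbel_softmax(logits, tau=0.5):
    """Sample a soft one-hot vector over the levels; differentiable in logits."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))   # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())                                # numerically stable softmax
    return y / y.sum()

weights = gumbel_softmax(logits)
phase = weights @ levels        # soft, differentiable stand-in for a discrete phase
print("weights:", np.round(weights, 3), "phase:", round(float(phase), 3))
```

As tau decreases, the soft vector approaches a hard one-hot selection, recovering a genuinely discrete device setting at inference time.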
arXiv Detail & Related papers (2022-09-28T17:13:28Z)
- All-optical graph representation learning using integrated diffractive photonic computing units [51.15389025760809]
Photonic neural networks perform brain-inspired computations using photons instead of electrons.
We propose an all-optical graph representation learning architecture, termed the diffractive graph neural network (DGNN).
We demonstrate the use of DGNN-extracted features for node- and graph-level classification tasks on benchmark databases and achieve superior performance.
arXiv Detail & Related papers (2022-04-23T02:29:48Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate the essential mathematical functions that describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
- Scaling Properties of Deep Residual Networks [2.6763498831034043]
We investigate the properties of weights trained by gradient descent and their scaling with network depth through numerical experiments.
We observe scaling regimes markedly different from those assumed in the neural ODE literature.
These findings cast doubt on the validity of the neural ODE model as an adequate description of deep ResNets.
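As a hedged sketch of the scaling assumption at stake (standard in the neural-ODE literature, not a formula quoted from this paper): the ODE view treats depth L as a time discretization with step size 1/L,

```latex
\[
  x_{k+1} = x_k + \tfrac{1}{L}\, f(x_k;\theta_k)
  \;\xrightarrow{\,L\to\infty\,}\;
  \dot{x}(t) = f\bigl(x(t);\theta(t)\bigr),
\]
```

so if trained residual updates do not in fact shrink like 1/L with depth, the continuum limit need not describe the trained network.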
arXiv Detail & Related papers (2021-05-25T22:31:30Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (ResNet) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)