Dual adaptive training of photonic neural networks
- URL: http://arxiv.org/abs/2212.06141v1
- Date: Fri, 9 Dec 2022 05:03:45 GMT
- Title: Dual adaptive training of photonic neural networks
- Authors: Ziyang Zheng, Zhengyang Duan, Hang Chen, Rui Yang, Sheng Gao, Haiou
Zhang, Hongkai Xiong, Xing Lin
- Abstract summary: A photonic neural network (PNN) computes with photons instead of electrons, featuring low latency, high energy efficiency, and high parallelism.
Existing training approaches cannot address the extensive accumulation of systematic errors in large-scale PNNs.
We propose dual adaptive training (DAT), which allows the PNN model to adapt to substantial systematic errors.
- Score: 30.86507809437016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The photonic neural network (PNN) is a remarkable analog
artificial intelligence (AI) accelerator that computes with photons instead
of electrons, featuring low latency, high energy efficiency, and high
parallelism. However, existing training approaches cannot address the
extensive accumulation of systematic errors in large-scale PNNs, resulting
in a significant decrease in model performance in physical systems. Here, we
propose dual adaptive training (DAT), which allows the PNN model to adapt to
substantial systematic errors and preserves its performance during
deployment. By introducing systematic error prediction networks with
task-similarity joint optimization, DAT achieves high-similarity mapping
between the PNN numerical models and physical systems, together with highly
accurate gradient calculations during the dual backpropagation training. We
validated the effectiveness of DAT by using diffractive PNNs and
interference-based PNNs on image classification tasks. DAT successfully
trained large-scale PNNs under major systematic errors and preserved model
classification accuracies comparable to those of error-free systems. The
results further demonstrated its superior performance over state-of-the-art
in situ training approaches. DAT provides critical support for constructing
large-scale PNNs to achieve advanced architectures and can be generalized to
other types of AI systems with analog computing errors.
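To make the training loop concrete, below is a minimal, hypothetical PyTorch sketch of the idea described in the abstract. It is not the authors' code: the names (NumericalPNN, ErrorPredictor, dat_step) are invented for illustration, the physical system is stood in for by the numerical model plus additive noise, and the paper's dual backpropagation is approximated here by a single joint loss combining a task term and a similarity term.

```python
# Hypothetical sketch of dual adaptive training (DAT); all names invented.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NumericalPNN(nn.Module):
    """In-silico (numerical) stand-in for the photonic network."""
    def __init__(self, d_in=64, d_hidden=128, n_classes=10):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, n_classes),
        )

    def forward(self, x):
        return self.layers(x)

class ErrorPredictor(nn.Module):
    """Stand-in for DAT's systematic error prediction network: maps the
    numerical model's output toward the measured physical output."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_classes, 64), nn.ReLU(),
                                 nn.Linear(64, n_classes))

    def forward(self, y_num):
        return y_num + self.net(y_num)  # residual correction

def physical_forward(model, x, noise=0.05):
    """Placeholder for a hardware measurement: the numerical model plus a
    random perturbation standing in for systematic errors."""
    with torch.no_grad():
        y = model(x)
        return y + noise * torch.randn_like(y)

def dat_step(model, predictor, opt, x, labels, lam=1.0):
    """One joint step: the task loss adapts the PNN to the (predicted)
    errors, while the similarity loss keeps the predictor matched to the
    physical measurements (task-similarity joint optimization)."""
    y_num = model(x)                     # differentiable numerical pass
    y_phys = physical_forward(model, x)  # non-differentiable measurement
    y_pred = predictor(y_num)            # predicted physical output
    task_loss = F.cross_entropy(y_pred, labels)
    sim_loss = F.mse_loss(y_pred, y_phys)
    loss = task_loss + lam * sim_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return task_loss.item(), sim_loss.item()

model, predictor = NumericalPNN(), ErrorPredictor()
opt = torch.optim.Adam(list(model.parameters()) + list(predictor.parameters()),
                       lr=1e-3)
x = torch.randn(32, 64)
labels = torch.randint(0, 10, (32,))
print(dat_step(model, predictor, opt, x, labels))
```

Because gradients flow through the error predictor rather than through the non-differentiable hardware measurement, the PNN weights can still be updated even though the physical forward pass itself provides no gradients; this is the property the abstract attributes to the dual backpropagation training.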
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the resource constraints of IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Comprehensive Online Training and Deployment for Spiking Neural Networks [40.255762156745405]
Spiking Neural Networks (SNNs) are considered to have enormous potential in the future development of Artificial Intelligence (AI).
The current proposed online training methods cannot tackle the inseparability problem of temporal dependent gradients.
We propose Efficient Multi-Precision Firing (EM-PF) model, which is a family of advanced spiking models based on floating-point spikes and binary synaptic weights.
arXiv Detail & Related papers (2024-10-10T02:39:22Z)
- Asymmetrical estimator for training encapsulated deep photonic neural networks [10.709758849326061]
Photonic neural networks (PNNs) are fast, in-propagation, high-bandwidth computing paradigms.
Device-to-device and system-to-system variations create imperfect knowledge of the PNN.
We introduce the asymmetrical training (AT) method, tailored for encapsulated deep PNNs (DPNNs).
arXiv Detail & Related papers (2024-05-28T17:27:20Z)
- Analyzing and Improving the Training Dynamics of Diffusion Models [36.37845647984578]
We identify and rectify several causes for uneven and ineffective training in the popular ADM diffusion model architecture.
We find that systematic application of this philosophy eliminates the observed drifts and imbalances, resulting in considerably better networks at equal computational complexity.
arXiv Detail & Related papers (2023-12-05T11:55:47Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process (the generic ISGD update is sketched below).
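For context, the generic implicit update rule (stated here from general knowledge of ISGD, not from this paper's specifics) evaluates the gradient at the new iterate rather than the current one, which is equivalent to a proximal step:

```latex
% Generic implicit (stochastic) gradient descent; \eta is the step size.
% Left: the gradient is taken at the *new* iterate \theta_{k+1}.
% Right: the equivalent proximal formulation (exact for convex losses).
\theta_{k+1} = \theta_k - \eta\,\nabla_\theta \mathcal{L}(\theta_{k+1})
\quad\Longleftrightarrow\quad
\theta_{k+1} = \arg\min_{\theta}\Big\{\mathcal{L}(\theta)
  + \tfrac{1}{2\eta}\,\lVert\theta - \theta_k\rVert^2\Big\}
```

The damping from the proximal term is what gives implicit updates their stability advantage over explicit SGD on stiff, multi-scale objectives.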
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Physics guided neural networks for modelling of non-linear dynamics [0.0]
This work demonstrates that injection of partially known information at an intermediate layer in a deep neural network can improve model accuracy, reduce model uncertainty, and yield improved convergence during training.
The value of these physics-guided neural networks has been demonstrated by learning the dynamics of a wide variety of nonlinear dynamical systems represented by five well-known equations in nonlinear systems theory.
arXiv Detail & Related papers (2022-05-13T19:06:36Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models of increasing complexity and associate each of them with a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network; a generic sketch of this idea follows.
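The sketch below illustrates only the generic contextual-bandit formulation, not the paper's implementation: K detectors of increasing cost are the arms, a linear softmax policy picks one per input context, and a REINFORCE-style update trades a stand-in detection reward against hypothetical compute costs.

```python
# Generic contextual-bandit model selection; rewards and costs are invented.
import numpy as np

rng = np.random.default_rng(0)
K, d, lr = 3, 8, 0.1              # arms (one detector per HEC layer), context dim, step size
W = np.zeros((K, d))              # linear softmax policy parameters
cost = np.array([0.1, 0.3, 0.9])  # hypothetical per-arm compute costs

def select(ctx):
    """Sample an arm from the softmax policy given the context."""
    logits = W @ ctx
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(K, p=p)), p

for t in range(1000):
    ctx = rng.normal(size=d)                    # context features of one input
    arm, p = select(ctx)
    detected = rng.random() < 0.5 + 0.15 * arm  # stand-in: larger models detect better
    reward = float(detected) - cost[arm]        # detection reward minus compute cost
    grad = -p[:, None] * ctx[None, :]           # d log pi(arm|ctx) / dW for softmax
    grad[arm] += ctx
    W += lr * reward * grad                     # REINFORCE update
```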
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- Inverse-Dirichlet Weighting Enables Reliable Training of Physics Informed Neural Networks [2.580765958706854]
We describe and remedy a failure mode that may arise from multi-scale dynamics with scale imbalances during training of deep neural networks.
PINNs are popular machine-learning templates that allow for seamless integration of physical equation models with data.
For inverse modeling using sequential training, we find that inverse-Dirichlet weighting protects a PINN against catastrophic forgetting.
arXiv Detail & Related papers (2021-07-02T10:01:37Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets; a minimal sketch of the rank-R idea follows.
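Here is a hedged numpy sketch of a rank-R, CP-factorized neuron for matrix inputs, illustrating the general idea summarized above rather than the paper's exact Rank-R FNN: each hidden unit's weight matrix is constrained to W = sum_r a_r b_r^T, so the input X (I x J) is never vectorized.

```python
# Hypothetical CP-factorized layer; shapes and nonlinearity are illustrative.
import numpy as np

rng = np.random.default_rng(0)
I, J, R, H = 16, 12, 3, 4       # input dims (I x J), CP rank R, hidden units H
A = rng.normal(size=(H, R, I))  # factors a_r for each hidden unit
B = rng.normal(size=(H, R, J))  # factors b_r for each hidden unit

def rank_r_layer(X):
    # <X, sum_r a_r b_r^T> = sum_r a_r^T X b_r, computed without forming W
    pre = np.einsum('hri,ij,hrj->h', A, X, B)
    return np.tanh(pre)         # any nonlinearity

X = rng.normal(size=(I, J))     # e.g., one spatial-spectral patch
print(rank_r_layer(X))          # H activations
```

The factorization keeps the parameter count at R*(I+J) per unit instead of I*J, which is how such models exploit structure along every data dimension.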
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.