In situ fine-tuning of in silico trained Optical Neural Networks
- URL: http://arxiv.org/abs/2506.22122v1
- Date: Fri, 27 Jun 2025 11:00:36 GMT
- Title: In situ fine-tuning of in silico trained Optical Neural Networks
- Authors: Gianluca Kosmella, Ripalta Stabile, Jaron Sanders
- Abstract summary: Training Optical Neural Networks (ONNs) poses unique challenges, notably the reliance on simplified in silico models. In this paper, we analyze how noise misspecification during in silico training impacts ONN performance. We introduce Gradient-Informed Fine-Tuning (GIFT), a lightweight algorithm designed to mitigate this performance degradation.
- Score: 0.4374837991804086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optical Neural Networks (ONNs) promise significant advantages over traditional electronic neural networks, including ultrafast computation, high bandwidth, and low energy consumption, by leveraging the intrinsic capabilities of photonics. However, training ONNs poses unique challenges, notably the reliance on simplified in silico models whose trained parameters must subsequently be mapped to physical hardware. This process often introduces inaccuracies due to discrepancies between the idealized digital model and the physical ONN implementation, particularly stemming from noise and fabrication imperfections. In this paper, we analyze how noise misspecification during in silico training impacts ONN performance and we introduce Gradient-Informed Fine-Tuning (GIFT), a lightweight algorithm designed to mitigate this performance degradation. GIFT uses gradient information derived from the noise structure of the ONN to adapt pretrained parameters directly in situ, without requiring expensive retraining or complex experimental setups. GIFT comes with formal conditions under which it improves ONN performance. We also demonstrate the effectiveness of GIFT via simulation on a five-layer feed-forward ONN trained on the MNIST digit classification task. GIFT achieves up to $28\%$ relative accuracy improvement compared to the baseline performance under noise misspecification, without resorting to costly retraining. Overall, GIFT provides a practical solution for bridging the gap between simplified digital models and real-world ONN implementations.
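To make the idea concrete, here is a minimal toy sketch of a GIFT-style in situ loop: a linear "ONN" with multiplicative hardware noise is fine-tuned directly from measured residuals. All names are illustrative, and the plain least-squares gradient stands in for the paper's noise-structure-informed gradient; this is an assumption-laden sketch, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_SIGMA = 0.10  # multiplicative noise actually present in "hardware"
# (in silico training assumed a smaller noise level -- misspecification)

def onn_forward(W, x, sigma=TRUE_SIGMA):
    """One noisy physical forward pass of a toy linear ONN layer."""
    W_noisy = W * (1.0 + sigma * rng.standard_normal(W.shape))
    return W_noisy @ x

def grad_step(W, x, y, lr=0.01):
    """One lightweight in situ update from a measured output.

    GIFT derives its gradient from the ONN's noise structure; the plain
    least-squares gradient here is an illustrative simplification."""
    r = onn_forward(W, x) - y          # residual measured on hardware
    return W - lr * np.outer(r, x)     # corrective update, no retraining

W_true = rng.standard_normal((4, 8))            # target task: y = W_true @ x
W = W_true + 0.3 * rng.standard_normal((4, 8))  # imperfect in-silico-to-hardware mapping

for _ in range(500):
    x = rng.standard_normal(8)
    W = grad_step(W, x, W_true @ x)

print("mapping error after fine-tuning:", np.linalg.norm(W - W_true))
```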
Related papers
- FEM-Informed Hypergraph Neural Networks for Efficient Elastoplasticity [3.211924713637672]
Graph neural networks (GNNs) naturally align with sparse operators and unstructured discretizations. Motivated by discrete physics losses, we embed finite-element computations at nodes and Gauss points directly into message-passing layers. We propose a numerically consistent FEM-Informed Hypergraph Neural Network (FHGNN).
arXiv Detail & Related papers (2026-02-07T05:11:12Z)
- A Self-Ensemble Inspired Approach for Effective Training of Binary-Weight Spiking Neural Networks [66.80058515743468]
Training Spiking Neural Networks (SNNs) and Binary Neural Networks (BNNs) is challenging because of the non-differentiable spike generation function. We present a novel perspective on the dynamics of SNNs and their close connection to BNNs through an analysis of the backpropagation process. Specifically, we leverage a structure of multiple shortcuts and a knowledge distillation-based training technique to improve the training of (binary-weight) SNNs.
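For background, the usual workaround for the non-differentiable spike function is a surrogate gradient on the backward pass. The sketch below uses a sigmoid-derivative surrogate as a generic illustration; it is not this paper's shortcut-and-distillation scheme.

```python
import numpy as np

def spike(v, threshold=1.0):
    """Forward pass: non-differentiable Heaviside spike generation."""
    return (v >= threshold).astype(v.dtype)

def surrogate_spike_grad(v, threshold=1.0, beta=5.0):
    """Backward pass: smooth sigmoid-derivative stand-in for the
    zero-almost-everywhere true derivative."""
    s = 1.0 / (1.0 + np.exp(-beta * (v - threshold)))
    return beta * s * (1.0 - s)

v = np.linspace(0.0, 2.0, 5)
print(spike(v))                 # hard 0/1 spikes
print(surrogate_spike_grad(v))  # smooth gradient used for training
```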
arXiv Detail & Related papers (2025-08-18T04:11:06Z)
- FFGAF-SNN: The Forward-Forward Based Gradient Approximation Free Training Framework for Spiking Neural Networks [7.310627646090302]
Spiking Neural Networks (SNNs) offer a biologically plausible framework for energy-efficient neuromorphic computing. However, training SNNs efficiently is challenging due to their non-differentiability. We propose a Forward-Forward (FF) based gradient approximation-free training framework for Spiking Neural Networks.
arXiv Detail & Related papers (2025-07-31T15:22:23Z)
- Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning [16.156610945877986]
We propose a few-shot learning framework based on Spiking Neural Networks (SNNs). We apply a combination of a temporal efficient training loss and an information-based loss to optimize the temporal dynamics of spike trains and enhance their discriminative power.
arXiv Detail & Related papers (2025-05-12T16:51:08Z)
- Towards Accurate Binary Spiking Neural Networks: Learning with Adaptive Gradient Modulation Mechanism [14.425611637823511]
Binary Spiking Neural Networks (BSNNs) inherit the event-driven paradigm of SNNs, while also adopting the reduced storage burden of binarization techniques. These distinct advantages grant BSNNs lightweight and energy-efficient characteristics, rendering them ideal for deployment on resource-constrained edge devices. However, due to the binary synaptic weights and non-differentiable spike function, effectively training BSNNs remains an open question.
arXiv Detail & Related papers (2025-02-20T07:59:08Z)
- Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment [10.026742974971189]
Spiking Neural Networks (SNNs) are emerging as a brain-inspired alternative to traditional Artificial Neural Networks (ANNs). Despite this, SNNs often suffer from lower accuracy than ANNs and face deployment challenges due to the number of inference timesteps required. We propose a novel distillation framework for deep SNNs that optimizes performance across full-range timesteps without timestep-specific retraining.
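Logit-based distillation generally mixes the task loss with a KL term toward the teacher's temperature-softened logits. A minimal generic sketch follows; the temperature, weighting, and T-squared scaling are the standard Hinton-style choices, not necessarily this paper's.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Cross-entropy on labels plus KL toward the teacher's softened logits."""
    p_s, p_t = softmax(student_logits, T), softmax(teacher_logits, T)
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * T * T
    return (1 - alpha) * ce + alpha * kl

logits_s = np.random.randn(2, 10)   # stand-in student logits
logits_t = np.random.randn(2, 10)   # stand-in teacher logits
print(distillation_loss(logits_s, logits_t, np.array([3, 7])))
```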
arXiv Detail & Related papers (2025-01-27T10:22:38Z)
- Optimizing the Optimizer for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks [3.758814046658822]
Physics-Informed Neural Networks (PINNs) have revolutionized the computation of PDE solutions by integrating partial differential equations (PDEs) into the neural network's training process as soft constraints. Moreover, physics-informed Kolmogorov-Arnold networks (PIKANs) have also proven effective and comparable in accuracy.
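Schematically, the soft-constraint formulation minimizes a data-fitting term plus a penalized PDE residual (notation generic, not the paper's):

$$
\mathcal{L}(\theta) \;=\; \frac{1}{N_d}\sum_{i=1}^{N_d}\big|u_\theta(x_i)-u_i\big|^2 \;+\; \lambda\,\frac{1}{N_r}\sum_{j=1}^{N_r}\big|\mathcal{N}[u_\theta](x_j)\big|^2,
$$

where $u_\theta$ is the network, $\mathcal{N}$ the PDE operator, and $\lambda$ the soft-constraint weight.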
arXiv Detail & Related papers (2025-01-22T21:19:42Z)
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
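For context, implicit (stochastic) gradient descent evaluates the gradient at the new iterate rather than the current one, which is the source of its stability; for suitable losses this is equivalent to a proximal step:

$$
\theta_{k+1} \;=\; \theta_k - \eta\,\nabla\mathcal{L}(\theta_{k+1})
\quad\Longleftrightarrow\quad
\theta_{k+1} \;=\; \arg\min_{\theta}\Big\{\mathcal{L}(\theta) + \tfrac{1}{2\eta}\,\lVert\theta-\theta_k\rVert^2\Big\}.
$$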
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Physics-aware Differentiable Discrete Codesign for Diffractive Optical Neural Networks [12.952987240366781]
This work proposes a novel device-to-system hardware-software codesign framework, which enables efficient training of diffractive optical neural networks (DONNs).
Gumbel-Softmax is employed to enable differentiable discrete mapping from real-world device parameters into the forward function of DONNs.
The results demonstrate that our proposed framework offers significant advantages over conventional quantization-based methods.
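Gumbel-Softmax generally replaces a hard discrete choice with a differentiable, temperature-controlled soft sample. A minimal generic sketch follows, with hypothetical quantized device levels standing in for the paper's device model:

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5):
    """Differentiable relaxation of a hard categorical choice."""
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))            # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = y - y.max()                    # numerical stability
    e = np.exp(y)
    return e / e.sum()                 # soft one-hot; hardens as tau -> 0

# Hypothetical: four quantized device parameter levels to choose among.
levels = np.array([0.0, 0.25, 0.5, 1.0])
logits = np.array([0.1, 1.5, 0.3, 0.2])    # learnable selection scores
w = gumbel_softmax(logits)
print("soft sample:", w, "-> effective device value:", w @ levels)
```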
arXiv Detail & Related papers (2022-09-28T17:13:28Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
The resulting RBONNs show impressive performance over state-of-the-art BNNs on various models and datasets.
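In its simplest form, the bilinear structure is the factorization of real-valued weights into a scale times a binary sign. The XNOR-Net-style closed form below illustrates that coupling; it is not RBONN's recurrent optimization itself.

```python
import numpy as np

w = np.random.default_rng(0).standard_normal((3, 4))  # real-valued weights

# XNOR-Net-style closed form: binary signs b and per-row scale alpha,
# approximating w by the bilinear product alpha * b.
b = np.sign(w)
alpha = np.abs(w).mean(axis=1, keepdims=True)
print("reconstruction error:", np.linalg.norm(w - alpha * b))
```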
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
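The appeal of pairing a multiplicative update with a logarithmic number system is that multiplication becomes addition on the log-domain weight representation. The sign-based sketch below illustrates that interplay; it is a simplification, not the exact LNS-Madam rule.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(5)

# LNS representation: store sign and log-magnitude instead of the value.
sign_w = np.sign(w)
log_w = np.log(np.abs(w))

def madam_like_step(sign_w, log_w, grad, lr=0.01):
    """Multiplicative update w <- w * exp(-lr * sign(w) * sign(g)),
    which in the log domain is just an addition to log|w|."""
    return log_w - lr * sign_w * np.sign(grad)

grad = rng.standard_normal(5)              # stand-in gradient
log_w = madam_like_step(sign_w, log_w, grad)
print("updated weights:", sign_w * np.exp(log_w))
```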
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
- Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a sensitivity-informed deep neural network (SIDNN) to solve the AC optimal power flow (AC-OPF) problem.
The proposed SIDNN is compatible with a broad range of OPF schemes.
It can be seamlessly integrated into other learning-to-OPF schemes.
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) in terms of low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
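Conversion approaches generally rest on the observation that the firing rate of an integrate-and-fire neuron under constant input approximates a ReLU. A minimal sketch of that correspondence (generic, not this paper's progressive tandem scheme):

```python
import numpy as np

def if_neuron_rate(current, threshold=1.0, timesteps=100):
    """Firing rate of an integrate-and-fire neuron under constant input."""
    v, spikes = 0.0, 0
    for _ in range(timesteps):
        v += current
        if v >= threshold:
            spikes += 1
            v -= threshold          # soft reset preserves residual charge
    return spikes / timesteps

for x in [-0.5, 0.2, 0.6, 0.9]:
    print(f"input {x:+.1f}: ReLU={max(x, 0):.2f}, IF rate={if_neuron_rate(x):.2f}")
```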
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.