Neural Tangent Knowledge Distillation for Optical Convolutional Networks
- URL: http://arxiv.org/abs/2508.08421v1
- Date: Mon, 11 Aug 2025 19:15:06 GMT
- Title: Neural Tangent Knowledge Distillation for Optical Convolutional Networks
- Authors: Jinlin Xiang, Minho Choi, Yubo Zhang, Zhihao Zhou, Arka Majumdar, Eli Shlizerman
- Abstract summary: Hybrid Optical Neural Networks (ONNs) offer an energy-efficient alternative to fully digital deep networks for real-time, power-constrained systems. Their adoption is limited by two main challenges: the accuracy gap compared to large-scale networks during training, and discrepancies between simulated and fabricated systems. We propose a task-agnostic and hardware-agnostic pipeline that supports image classification and segmentation across diverse optical systems.
- Score: 8.526823315100764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hybrid Optical Neural Networks (ONNs, typically consisting of an optical frontend and a digital backend) offer an energy-efficient alternative to fully digital deep networks for real-time, power-constrained systems. However, their adoption is limited by two main challenges: the accuracy gap compared to large-scale networks during training, and discrepancies between simulated and fabricated systems that further degrade accuracy. While previous work has proposed end-to-end optimizations for specific datasets (e.g., MNIST) and optical systems, these approaches typically lack generalization across tasks and hardware designs. To address these limitations, we propose a task-agnostic and hardware-agnostic pipeline that supports image classification and segmentation across diverse optical systems. To assist optical system design before training, we estimate achievable model accuracy based on user-specified constraints such as physical size and the dataset. For training, we introduce Neural Tangent Knowledge Distillation (NTKD), which aligns optical models with electronic teacher networks, thereby narrowing the accuracy gap. After fabrication, NTKD also guides fine-tuning of the digital backend to compensate for implementation errors. Experiments on multiple datasets (e.g., MNIST, CIFAR, Carvana Masking) and hardware configurations show that our pipeline consistently improves ONN performance and enables practical deployment in both pre-fabrication simulations and physical implementations.
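The abstract outlines NTKD without giving its objective; the sketch below is one plausible reading, aligning the empirical neural tangent kernels (Gram matrices of per-sample parameter Jacobians) of the optical student and the electronic teacher alongside the ordinary task loss. The function names, scale normalization, and `alpha` weighting are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of NTK-aligned distillation; the paper's exact NTKD loss
# is not given in the abstract, so everything below is an assumption.
import torch
import torch.nn.functional as F

def empirical_ntk(model, x, create_graph=False):
    """Empirical NTK Gram matrix K[i, j] = <df(x_i)/dtheta, df(x_j)/dtheta>."""
    params = [p for p in model.parameters() if p.requires_grad]
    out = model(x).sum(dim=1)  # one scalar per sample keeps the Jacobian small
    rows = []
    for i in range(out.shape[0]):
        g = torch.autograd.grad(out[i], params,
                                retain_graph=True, create_graph=create_graph)
        rows.append(torch.cat([t.reshape(-1) for t in g]))
    jac = torch.stack(rows)        # (batch, n_params)
    return jac @ jac.T             # (batch, batch)

def ntkd_loss(student, teacher, x, y, alpha=0.5):
    k_s = empirical_ntk(student, x, create_graph=True)  # grads flow to student
    k_t = empirical_ntk(teacher, x).detach()            # teacher is a fixed target
    k_s = k_s / (k_s.norm() + 1e-8)                     # compare kernels up to scale
    k_t = k_t / (k_t.norm() + 1e-8)
    return F.cross_entropy(student(x), y) + alpha * (k_s - k_t).pow(2).sum()
```

Per the abstract, the same alignment signal could then be reused after fabrication, with the optical frontend frozen and only the digital backend fine-tuned to absorb implementation errors.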
Related papers
- In situ fine-tuning of in silico trained Optical Neural Networks [0.4374837991804086]
Training Optical Neural Networks (ONNs) poses unique challenges, notably the reliance on simplified in silico models. In this paper, we analyze how noise misspecification during in silico training impacts ONN performance. We introduce Gradient-Informed Fine-Tuning (GIFT), a lightweight algorithm designed to mitigate this performance degradation.
arXiv Detail & Related papers (2025-06-27T11:00:36Z) - Nonlinear Computation with Linear Optics via Source-Position Encoding [0.0]
We introduce a novel method to achieve nonlinear computation in fully linear media. Our method can operate at low power and requires only the ability to drive the optical system at a data-dependent spatial position. We formulate a fully automated, topology-optimization-based hardware design framework for extremely specialized optical neural networks.
arXiv Detail & Related papers (2025-04-29T03:55:05Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses these demands by shifting data analysis to the edge.
However, existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework that jointly optimizes the neural network architecture and its edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design: a random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves substantial energy-efficiency improvements and training-cost reductions compared to conventional systems.
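For background on the summary above: an extreme learning machine keeps its random hidden weights fixed, so they can live in cheap stochastic hardware, and trains only a linear readout in closed form. The sizes, scaling, and ridge term below are invented for illustration; this is a generic sketch, not the DEPLM system.

```python
# Generic extreme-learning-machine sketch: frozen random features plus a
# least-squares readout; the random matrix W stands in (loosely) for
# random resistive-memory conductances.
import torch

torch.manual_seed(0)
d_in, d_hidden, n_classes = 784, 2048, 10
W = torch.randn(d_in, d_hidden) * 0.05   # fixed random weights, never trained

def features(x):
    return torch.relu(x @ W)             # random projection + nonlinearity

def fit_readout(x_train, y_onehot, ridge=1e-3):
    h = features(x_train)                              # (n, d_hidden)
    a = h.T @ h + ridge * torch.eye(d_hidden)          # closed-form ridge solve
    b = h.T @ y_onehot                                 # (d_hidden, n_classes)
    return torch.linalg.solve(a, b)                    # readout weights
```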
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Forward-Forward Training of an Optical Neural Network [6.311461340782698]
We present an experiment utilizing multimode nonlinear wave propagation in an optical fiber, demonstrating the feasibility of the Forward-Forward algorithm (FFA) in an optical system.
The results show that incorporating optical transforms into multilayer NN architectures trained with the FFA can lead to performance improvements.
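For context, here is a minimal all-digital sketch of the Forward-Forward algorithm's layer-local "goodness" objective (after Hinton, 2022); in the paper, the multimode fiber implements part of the transform, whereas the layer sizes, threshold, and optimizer below are illustrative.

```python
# Minimal digital sketch of Forward-Forward layer-local training; the
# optical-fiber transform from the paper is not modeled here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    def __init__(self, d_in, d_out, threshold=2.0, lr=1e-3):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.fc.parameters(), lr=lr)

    def forward(self, x):
        # normalize so only the direction, not the previous layer's
        # goodness, is passed on
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.fc(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).mean(dim=1)  # goodness: real data
        g_neg = self.forward(x_neg).pow(2).mean(dim=1)  # goodness: negative data
        # push positive goodness above the threshold and negative below it
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # detach so the next layer trains on fixed inputs
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```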
arXiv Detail & Related papers (2023-05-30T16:15:57Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to their magnitude.
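A hedged sketch of the soft-shrinkage idea: instead of hard-zeroing pruned weights, the currently unimportant ones are scaled down slightly each step, so the sparse structure remains revisable during training. The sparsity level and shrink factor are placeholders, not ISS-P's actual schedule.

```python
# Illustrative soft shrinkage step (not ISS-P's exact rule): the smallest
# weights are shrunk proportionally to their magnitude rather than zeroed.
import torch

@torch.no_grad()
def soft_shrink_step(model, sparsity=0.5, shrink=0.99):
    for p in model.parameters():
        if p.dim() < 2:            # skip biases and norm parameters
            continue
        k = int(sparsity * p.numel())
        if k == 0:
            continue
        mag = p.abs()
        thresh = mag.reshape(-1).kthvalue(k).values
        mask = mag <= thresh       # "unimportant" weights by magnitude
        p[mask] *= shrink          # shrink on-the-fly instead of hard pruning
```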
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Physics-aware Differentiable Discrete Codesign for Diffractive Optical Neural Networks [12.952987240366781]
This work proposes a novel device-to-system hardware-software codesign framework that enables efficient training of diffractive optical neural networks (DONNs).
Gumbel-Softmax is employed to enable differentiable discrete mapping from real-world device parameters into the forward function of DONNs.
The results demonstrate that the proposed framework offers significant advantages over conventional quantization-based methods.
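A minimal sketch of the Gumbel-Softmax mapping mentioned above: a learnable categorical choice over a finite set of fabricable device values stays discrete in the forward pass while remaining differentiable for training. The candidate phase levels and cell count are invented for illustration.

```python
# Gumbel-Softmax relaxation for discrete device parameters (illustrative
# values; the paper's actual device model is not reproduced here).
import torch
import torch.nn.functional as F

candidate_levels = torch.linspace(0, 2 * torch.pi, steps=8)  # fabricable values
logits = torch.nn.Parameter(torch.zeros(64, 8))              # 64 device cells

def sample_device_params(tau=1.0, hard=True):
    # hard=True keeps the forward pass discrete (one-hot) while gradients
    # flow through the soft relaxation (straight-through estimator)
    onehot = F.gumbel_softmax(logits, tau=tau, hard=hard)
    return onehot @ candidate_levels                         # (64,) phase values
```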
arXiv Detail & Related papers (2022-09-28T17:13:28Z) - Single-Shot Optical Neural Network [55.41644538483948]
'Weight-stationary' analog optical and electronic hardware has been proposed to reduce the compute resources required by deep neural networks.
We present a scalable, single-shot-per-layer weight-stationary optical processor.
arXiv Detail & Related papers (2022-05-18T17:49:49Z) - Accelerating deep neural networks for efficient scene understanding in
automotive cyber-physical systems [2.4373900721120285]
Automotive Cyber-Physical Systems (ACPS) have attracted a significant amount of interest in the past few decades.
One of the most critical operations in these systems is the perception of the environment.
Deep learning, and especially the use of Deep Neural Networks (DNNs), provides impressive results in analyzing and understanding complex and dynamic scenes from visual data.
arXiv Detail & Related papers (2021-07-19T18:43:17Z) - Low-Precision Training in Logarithmic Number System using Multiplicative
Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
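A hedged sketch of why a multiplicative update pairs naturally with a logarithmic number system: with weights stored as a sign bit plus a base-2 exponent, the update w ← w · exp(−lr · sign(w) · ĝ) becomes a single addition on the stored exponent. The gradient normalization and learning rate below are illustrative stand-ins, not LNS-Madam's exact rule.

```python
# Illustrative LNS-style multiplicative weight update (a stand-in; the
# exact LNS-Madam quantization and normalization are not shown here).
import math
import torch

@torch.no_grad()
def madam_step(sign_w, log2_mag, grad, lr=0.01, eps=1e-12):
    # normalize the gradient (a rough stand-in for second-moment scaling)
    g_hat = grad / (grad.pow(2).mean().sqrt() + eps)
    # w <- w * exp(-lr * sign(w) * g_hat): multiplicative in w,
    # additive on the stored base-2 exponent; the sign bit never changes
    log2_mag -= lr * sign_w * g_hat / math.log(2)
    return sign_w, log2_mag

def decode(sign_w, log2_mag):
    """Recover the real-valued weights from their LNS representation."""
    return sign_w * torch.exp2(log2_mag)
```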
arXiv Detail & Related papers (2021-06-26T00:32:17Z) - Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
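A generic sketch of the early-exit mechanism such networks rely on, simplified here to a classification-style confidence test (the paper's co-optimised exit policy over segmentation heads is more involved): each exit may answer once its confidence clears a threshold, skipping the remaining backbone computation on easy inputs. The modules and threshold are placeholders.

```python
# Generic early-exit inference sketch (placeholder modules; not the MESS
# exit policy, which is co-optimised for segmentation heads).
import torch

@torch.no_grad()
def early_exit_forward(stages, heads, x, threshold=0.9):
    """stages[i]: backbone chunk; heads[i]: exit head attached after it."""
    probs = None
    for stage, head in zip(stages, heads):
        x = stage(x)
        probs = torch.softmax(head(x), dim=1)
        # assumes batch size 1 for simplicity
        if probs.max().item() >= threshold:  # confident enough: stop early
            return probs, True
    return probs, False                      # fell through to the final exit
```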
arXiv Detail & Related papers (2021-06-07T11:37:03Z) - Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.