High-performance real-world optical computing trained by in situ gradient-based model-free optimization
- URL: http://arxiv.org/abs/2307.11957v6
- Date: Thu, 21 Nov 2024 16:13:48 GMT
- Title: High-performance real-world optical computing trained by in situ gradient-based model-free optimization
- Authors: Guangyuan Zhao, Xin Shu, Renjie Zhou,
- Abstract summary: We propose a gradient-based model-free optimization (G-MFO) method based on a Monte Carlo gradient estimation algorithm.
G-MFO treats an optical computing system as a black box and back-propagates the loss directly to the optical computing weights' probability distributions.
Our experiments on diffractive optical computing systems show that G-MFO outperforms hybrid training on the MNIST and FMNIST datasets.
- Score: 2.2407602847819827
- License:
- Abstract: Optical computing systems provide high-speed and low-energy data processing but face deficiencies in computationally demanding training and simulation-to-reality gaps. We propose a gradient-based model-free optimization (G-MFO) method based on a Monte Carlo gradient estimation algorithm for computationally efficient in situ training of optical computing systems. This approach treats an optical computing system as a black box and back-propagates the loss directly to the optical computing weights' probability distributions, circumventing the need for a computationally heavy and biased system simulation. Our experiments on diffractive optical computing systems show that G-MFO outperforms hybrid training on the MNIST and FMNIST datasets. Furthermore, we demonstrate image-free and high-speed classification of cells from their marker-free phase maps. Our method's model-free and high-performance nature, combined with its low demand for computational resources, paves the way for accelerating the transition of optical computing from laboratory demonstrations to practical, real-world applications.
Related papers
- Training Hybrid Neural Networks with Multimode Optical Nonlinearities Using Digital Twins [2.8479179029634984]
We introduce ultrashort pulse propagation in multimode fibers, which perform large-scale nonlinear transformations.
Training the hybrid architecture is achieved through a neural model that differentiably approximates the optical system.
Our experimental results achieve state-of-the-art image classification accuracies and simulation fidelity.
arXiv Detail & Related papers (2025-01-14T10:35:18Z) - A parametric framework for kernel-based dynamic mode decomposition using deep learning [0.0]
The proposed framework consists of two stages, offline and online.
The online stage leverages those LANDO models to generate new data at a desired time instant.
dimensionality reduction technique is applied to high-dimensional dynamical systems to reduce the computational cost of training.
arXiv Detail & Related papers (2024-09-25T11:13:50Z) - Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding.
Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - Optical Extreme Learning Machines with Atomic Vapors [0.3069335774032178]
Extreme learning machines explore nonlinear random projections to perform computing tasks on high-dimensional output spaces.
This manuscript explores the possibility of using atomic gases in near-resonant conditions to implement an optical extreme learning machine.
Our results suggest that these systems have the potential not only to work as an optical extreme learning machine but also to perform these computations at the few-photon level.
arXiv Detail & Related papers (2024-01-08T10:19:28Z) - Gradual Optimization Learning for Conformational Energy Minimization [69.36925478047682]
Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks significantly reduces the required additional data.
Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules.
arXiv Detail & Related papers (2023-11-05T11:48:08Z) - Learning Controllable Adaptive Simulation for Multi-resolution Physics [86.8993558124143]
We introduce Learning controllable Adaptive simulation for Multi-resolution Physics (LAMP) as the first full deep learning-based surrogate model.
LAMP consists of a Graph Neural Network (GNN) for learning the forward evolution, and a GNN-based actor-critic for learning the policy of spatial refinement and coarsening.
We demonstrate that our LAMP outperforms state-of-the-art deep learning surrogate models, and can adaptively trade-off computation to improve long-term prediction error.
arXiv Detail & Related papers (2023-05-01T23:20:27Z) - Generic Lithography Modeling with Dual-band Optics-Inspired Neural
Networks [52.200624127512874]
We introduce a dual-band optics-inspired neural network design that considers the optical physics underlying lithography.
Our approach yields the first published via/metal layer contour simulation at 1nm2/pixel resolution with any tile size.
We also achieve 85X simulation speedup over traditional lithography simulator with 1% accuracy loss.
arXiv Detail & Related papers (2022-03-12T08:08:50Z) - Gradient descent in materia through homodyne gradient extraction [2.012950941269354]
We demonstrate a simple yet efficient gradient extraction method, based on the principle of homodyne detection.
By perturbing the parameters that need to be optimized we effectively obtain the gradient information in a highly robust and scalable manner.
Homodyne gradient extraction can in principle be fully implemented in materia, facilitating the development of autonomously learning material systems.
arXiv Detail & Related papers (2021-05-15T12:18:31Z) - Rapid characterisation of linear-optical networks via PhaseLift [51.03305009278831]
Integrated photonics offers great phase-stability and can rely on the large scale manufacturability provided by the semiconductor industry.
New devices, based on such optical circuits, hold the promise of faster and energy-efficient computations in machine learning applications.
We present a novel technique to reconstruct the transfer matrix of linear optical networks.
arXiv Detail & Related papers (2020-10-01T16:04:22Z) - Large-scale neuromorphic optoelectronic computing with a reconfigurable
diffractive processing unit [38.898230519968116]
We propose an optoelectronic reconfigurable computing paradigm by constructing a diffractive processing unit.
It can efficiently support different neural networks and achieve a high model complexity with millions of neurons.
Our prototype system built with off-the-shelf optoelectronic components surpasses the performance of state-of-the-art graphics processing units.
arXiv Detail & Related papers (2020-08-26T16:34:58Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.