Addressing Data Scarcity in Optical Matrix Multiplier Modeling Using
Transfer Learning
- URL: http://arxiv.org/abs/2308.11630v2
- Date: Mon, 13 Nov 2023 15:58:10 GMT
- Title: Addressing Data Scarcity in Optical Matrix Multiplier Modeling Using
Transfer Learning
- Authors: Ali Cem, Ognjen Jovanovic, Siqi Yan, Yunhong Ding, Darko Zibar, and
Francesco Da Ros
- Abstract summary: We present and experimentally evaluate using transfer learning to address experimental data scarcity.
Our approach involves pre-training the model using synthetic data generated from a less accurate analytical model.
We achieve < 1 dB root-mean-square error on the matrix weights implemented by a 3x3 photonic chip while using only 25% of the available data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present and experimentally evaluate using transfer learning to address
experimental data scarcity when training neural network (NN) models for
Mach-Zehnder interferometer mesh-based optical matrix multipliers. Our approach
involves pre-training the model using synthetic data generated from a less
accurate analytical model and fine-tuning with experimental data. Our
investigation demonstrates that this method yields significant reductions in
modeling errors compared to using an analytical model, or a standalone NN model
when training data is limited. Utilizing regularization techniques and ensemble
averaging, we achieve < 1 dB root-mean-square error on the matrix weights
implemented by a 3x3 photonic chip while using only 25% of the available data.
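The pre-train/fine-tune strategy described in the abstract can be sketched as follows. This is a minimal toy illustration, not the authors' implementation: it uses a linear 3x3 "matrix multiplier" model, hypothetical synthetic/experimental data, and plain-NumPy gradient descent in place of the paper's neural network architecture. The structure mirrors the paper's recipe: pre-train on abundant synthetic data from a biased analytical model, fine-tune on a small "experimental" set with L2 regularization, and average an ensemble of independently initialized models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ground truth: the device's true weight matrix (unknown in practice).
true_W = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.1],
                   [0.0, 0.1, 0.9]])

def make_data(W, n, noise=0.0):
    """Input/output pairs for a linear 'matrix multiplier' y = W x."""
    X = rng.normal(size=(n, 3))
    Y = X @ W.T + noise * rng.normal(size=(n, 3))
    return X, Y

# Abundant synthetic data from a less accurate analytical model
# (here modeled as a systematic +0.1 bias on every weight).
analytical_W = true_W + 0.1
X_syn, Y_syn = make_data(analytical_W, 2000)

# Scarce, slightly noisy experimental data.
X_exp, Y_exp = make_data(true_W, 50, noise=0.01)

def train(X, Y, W, lr=0.05, epochs=200, l2=0.0):
    """Full-batch gradient descent on a linear model, optional L2 penalty."""
    for _ in range(epochs):
        grad = (X @ W.T - Y).T @ X / len(X) + l2 * W
        W = W - lr * grad
    return W

def rmse(A, B):
    return np.sqrt(np.mean((A - B) ** 2))

# Ensemble: pre-train each member on synthetic data, then fine-tune on the
# small experimental set with L2 regularization against overfitting.
ensemble = []
for seed in range(5):
    W0 = np.random.default_rng(seed).normal(scale=0.1, size=(3, 3))
    W_pre = train(X_syn, Y_syn, W0)             # pre-training step
    W_ft = train(X_exp, Y_exp, W_pre, l2=1e-3)  # fine-tuning step
    ensemble.append(W_ft)

W_avg = np.mean(ensemble, axis=0)               # ensemble averaging
print(f"analytical-model RMSE:  {rmse(analytical_W, true_W):.4f}")
print(f"transfer-learned RMSE:  {rmse(W_avg, true_W):.4f}")
```

Because the toy problem is convex, the ensemble members converge to nearly the same solution; in the paper's nonlinear NN setting, averaging over initializations is what smooths out fit-to-fit variance. The paper additionally reports errors on weights expressed in dB, which this linear-domain sketch omits.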
Related papers
- Towards Theoretical Understandings of Self-Consuming Generative Models (2024-02-19)
  This paper tackles the emerging challenge of training generative models within a self-consuming loop.
  We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
  We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
- Post-training Model Quantization Using GANs for Synthetic Data Generation (2023-05-10)
  We investigate the use of synthetic data as a substitute for real calibration data in post-training quantization.
  We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
- Capturing dynamical correlations using implicit neural representations (2023-04-08)
  We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
  In doing so, we illustrate the ability to build and train a differentiable model only once, which can then be applied in real time to multi-dimensional scattering data.
- Data-efficient Modeling of Optical Matrix Multipliers Using Transfer Learning (2022-11-29)
  We demonstrate transfer learning-assisted neural network models for optical matrix multipliers with scarce measurement data.
  Our approach uses only 10% of the experimental data needed for best performance and outperforms analytical models for a Mach-Zehnder interferometer mesh.
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design (2022-10-27)
  Lithography modeling is a crucial problem in chip design: it ensures that a chip design mask is manufacturable.
  Recent developments in machine learning have provided alternative solutions, replacing time-consuming lithography simulations with deep neural networks.
  We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve machine learning model performance.
- Data-driven Modeling of Mach-Zehnder Interferometer-based Optical Matrix Multipliers (2022-10-17)
  Photonic integrated circuits are facilitating the development of optical neural networks.
  We describe both simple analytical models and data-driven models for offline training of optical matrix multipliers.
  The neural network-based models outperform the simple physics-based models in terms of prediction error.
- Coordinated Double Machine Learning (2022-06-02)
  This paper argues that a carefully coordinated learning algorithm for deep neural networks may reduce estimation bias.
  The improved empirical performance of the proposed method is demonstrated through numerical experiments on both simulated and real data.
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data (2022-02-18)
  We propose a probabilistic model called ME-NODE that incorporates (fixed + random) mixed effects for analyzing panel data.
  We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
  We then derive evidence lower bounds for ME-NODE and develop efficient training algorithms.
- Data-Driven Shadowgraph Simulation of a 3D Object (2021-06-01)
  We replace the numerical code with a computationally cheaper projection-based surrogate model.
  The model approximates the electric fields at a given time without computing all preceding electric fields, as numerical methods require.
  The model achieves good-quality reconstruction for data perturbed within a narrow range of simulation parameters and scales to large input data.
- A Taylor Based Sampling Scheme for Machine Learning in Computational Physics (2021-01-20)
  We take advantage of the ability to generate data with numerical simulation programs to better train machine learning models.
  We elaborate a new data sampling scheme based on Taylor approximation to reduce the error of a deep neural network (DNN) when learning the solution of an ordinary differential equation (ODE) system.
- Exponentially improved detection and correction of errors in experimental systems using neural networks (2020-05-18)
  We introduce the use of two machine learning algorithms to create an empirical model of an experimental apparatus.
  This reduces the number of measurements necessary for generic optimisation tasks exponentially.
  We demonstrate both algorithms on the example of detecting and compensating stray electric fields in an ion trap.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.