Data-driven Modeling of Mach-Zehnder Interferometer-based Optical Matrix
Multipliers
- URL: http://arxiv.org/abs/2210.09171v1
- Date: Mon, 17 Oct 2022 15:19:26 GMT
- Title: Data-driven Modeling of Mach-Zehnder Interferometer-based Optical Matrix
Multipliers
- Authors: Ali Cem, Siqi Yan, Yunhong Ding, Darko Zibar, Francesco Da Ros
- Abstract summary: Photonic integrated circuits are facilitating the development of optical neural networks.
We describe both simple analytical models and data-driven models for offline training of optical matrix multipliers.
The neural network-based models outperform the simple physics-based models in terms of prediction error.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Photonic integrated circuits are facilitating the development of optical
neural networks, which have the potential to be both faster and more energy
efficient than their electronic counterparts since optical signals are
especially well-suited for implementing matrix multiplications. However,
accurate programming of photonic chips for optical matrix multiplication
remains a difficult challenge. Here, we describe both simple analytical models
and data-driven models for offline training of optical matrix multipliers. We
train and evaluate the models using experimental data obtained from a
fabricated chip featuring a Mach-Zehnder interferometer mesh implementing
3-by-3 matrix multiplication. The neural network-based models outperform the
simple physics-based models in terms of prediction error. Furthermore, the
neural network models are also able to predict the spectral variations in the
matrix weights for up to 100 frequency channels covering the C-band. The use of
neural network models for programming the chip for optical matrix
multiplication yields increased performance on multiple machine learning tasks.
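
As a rough illustration of what a "simple analytical model" of such a chip can look like, the sketch below builds the transfer matrix of an idealized Mach-Zehnder interferometer (MZI) and cascades three of them into a 3-by-3 mesh. It is not the authors' model. The lossless 50:50 couplers, the triangular mesh layout, the function names (`mzi`, `embed`, `mesh3`, `weights_db`, `rmse_db`), and the dB-scale RMSE metric are all assumptions made for this example.

```python
import numpy as np

# Ideal 50:50 directional coupler (assumed lossless and wavelength independent).
COUPLER = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)


def mzi(theta, phi):
    """2x2 transfer matrix of an ideal MZI.

    theta: phase shift inside the interferometer (sets the power split).
    phi:   phase shift on one input arm (sets the relative output phase).
    """
    inner = np.diag([np.exp(1j * theta), 1.0])
    outer = np.diag([np.exp(1j * phi), 1.0])
    return COUPLER @ inner @ COUPLER @ outer


def embed(block, n, i):
    """Place a 2x2 block on waveguides i and i+1 of an n-mode identity."""
    full = np.eye(n, dtype=complex)
    full[i:i + 2, i:i + 2] = block
    return full


def mesh3(thetas, phis):
    """Hypothetical triangular 3x3 mesh: MZIs act on modes (0,1), (1,2), (0,1)."""
    total = np.eye(3, dtype=complex)
    for theta, phi, i in zip(thetas, phis, (0, 1, 0)):
        total = embed(mzi(theta, phi), 3, i) @ total
    return total


def weights_db(transfer):
    """Power weights |T_ij|^2 in dB (a small floor avoids log of zero)."""
    return 10 * np.log10(np.abs(transfer) ** 2 + 1e-12)


def rmse_db(predicted_db, measured_db):
    """Root-mean-square error between two weight matrices given in dB."""
    return float(np.sqrt(np.mean((predicted_db - measured_db) ** 2)))


if __name__ == "__main__":
    T = mesh3(thetas=[0.3, 1.1, 2.0], phis=[0.5, 0.2, 0.9])
    print(np.round(weights_db(T), 2))
```

In this idealized picture the mesh is exactly unitary. A fabricated chip departs from it through loss, imperfect couplers, and thermal crosstalk between heaters, which is the kind of gap the data-driven models described in the abstract are meant to close.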
Related papers
- Optical training of large-scale Transformers and deep neural networks with direct feedback alignment [48.90869997343841]
We experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform.
An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOps.
We study the compute scaling of our hybrid optical approach, and demonstrate a potential advantage for ultra-deep and wide neural networks.
arXiv Detail & Related papers (2024-09-01T12:48:47Z)
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- Addressing Data Scarcity in Optical Matrix Multiplier Modeling Using Transfer Learning [0.0]
We present and experimentally evaluate using transfer learning to address experimental data scarcity.
Our approach involves pre-training the model using synthetic data generated from a less accurate analytical model.
We achieve 1 dB root-mean-square error on the matrix weights implemented by a 3x3 photonic chip while using only 25% of the available data.
arXiv Detail & Related papers (2023-08-10T07:33:00Z)
- Data-efficient Modeling of Optical Matrix Multipliers Using Transfer Learning [0.0]
We demonstrate transfer learning-assisted neural network models for optical matrix multipliers with scarce measurement data.
Our approach uses 10% of experimental data needed for best performance and outperforms analytical models for a Mach-Zehnder interferometer mesh.
arXiv Detail & Related papers (2022-11-29T09:22:42Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- All-Photonic Artificial Neural Network Processor Via Non-linear Optics [0.0]
We propose an all-photonic artificial neural network processor.
Information is encoded in the amplitudes of frequency modes that act as neurons.
Our architecture is unique in providing a completely unitary, reversible mode of computation.
arXiv Detail & Related papers (2022-05-17T19:55:30Z)
- Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition [67.95996816744251]
State-of-the-art language models (LMs) represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming increasingly complex and expensive for practical applications.
Current quantization methods are based on uniform precision and fail to account for the varying performance sensitivity at different parts of LMs to quantization errors.
Novel mixed precision neural network LM quantization methods are proposed in this paper.
arXiv Detail & Related papers (2021-11-29T12:24:02Z)
- Comparison of Models for Training Optical Matrix Multipliers in Neuromorphic PICs [0.0]
We compare simple physics-based vs. data-driven neural-network-based models for offline training of programmable photonic chips.
The neural-network model outperforms physics-based models for a chip with thermal crosstalk, yielding increased testing accuracy.
arXiv Detail & Related papers (2021-11-23T12:15:21Z)
- Joint Deep Reinforcement Learning and Unfolding: Beam Selection and Precoding for mmWave Multiuser MIMO with Lens Arrays [54.43962058166702]
Millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) systems with discrete lens arrays (DLA) have received great attention.
In this work, we investigate the joint design of a beam precoding matrix for mmWave MU-MIMO systems with DLA.
arXiv Detail & Related papers (2021-01-05T03:55:04Z)
- Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art natural language processing (NLP) and speech enhancement (SE) algorithms.
Here we introduce the MPO decomposition, which describes the local correlation of quantum states in quantum many-body physics.
We propose a matrix product operator (MPO) based neural network architecture to replace the LSTM model.
arXiv Detail & Related papers (2020-12-22T11:50:06Z)
- Rapid characterisation of linear-optical networks via PhaseLift [51.03305009278831]
Integrated photonics offers great phase-stability and can rely on the large scale manufacturability provided by the semiconductor industry.
New devices, based on such optical circuits, hold the promise of faster and energy-efficient computations in machine learning applications.
We present a novel technique to reconstruct the transfer matrix of linear optical networks.
arXiv Detail & Related papers (2020-10-01T16:04:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.