L2ight: Enabling On-Chip Learning for Optical Neural Networks via Efficient in-situ Subspace Optimization
- URL: http://arxiv.org/abs/2110.14807v1
- Date: Wed, 27 Oct 2021 22:53:47 GMT
- Title: L2ight: Enabling On-Chip Learning for Optical Neural Networks via Efficient in-situ Subspace Optimization
- Authors: Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Zixuan Jiang, Ray T. Chen, David Z. Pan
- Abstract summary: Silicon-photonics-based optical neural network (ONN) is a promising hardware platform that could represent a paradigm shift in efficient AI.
In this work, we propose a closed-loop ONN on-chip learning framework L2ight to enable scalable ONN mapping and efficient in-situ learning.
- Score: 10.005026783940682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Silicon-photonics-based optical neural network (ONN) is a promising hardware
platform that could represent a paradigm shift in efficient AI with its
CMOS-compatibility, flexibility, ultra-low execution latency, and high energy
efficiency. In-situ training on the online programmable photonic chips is
appealing but still encounters challenging issues in on-chip implementability,
scalability, and efficiency. In this work, we propose a closed-loop ONN on-chip
learning framework L2ight to enable scalable ONN mapping and efficient in-situ
learning. L2ight adopts a three-stage learning flow that first calibrates the
complicated photonic circuit states under challenging physical constraints,
then performs photonic core mapping via combined analytical solving and
zeroth-order optimization. A subspace learning procedure with multi-level
sparsity is integrated into L2ight to enable in-situ gradient evaluation and
fast adaptation, unleashing the power of optics for real on-chip intelligence.
Extensive experiments demonstrate that our proposed L2ight outperforms prior ONN
training protocols with three orders of magnitude higher scalability and over 30x
better efficiency when benchmarked on various models and learning tasks. This
synergistic framework is the first scalable on-chip learning solution that
pushes this emerging field from intractable to scalable and further to
efficient for next-generation self-learnable photonic neural chips. From a
co-design perspective, L2ight also provides essential insights for
hardware-restricted unitary subspace optimization and efficient sparse
training. We open-source our framework at
https://github.com/JeremieMelo/L2ight.
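As a rough illustration of the subspace idea, consider an SVD-style parameterization W = U diag(sigma) V^T in which the unitaries U and V are frozen once calibration and mapping finish, so in-situ learning updates only the singular values. The PyTorch sketch below is a minimal toy model under that assumption; the class name, shapes, and training loop are illustrative and are not the API of the open-sourced framework.
```python
# Hedged sketch of the subspace-learning idea: a photonic tensor core
# realizes W = U @ diag(sigma) @ V^T, where U and V stand in for the
# calibrated unitary MZI meshes (frozen after mapping), and only the
# diagonal sigma is trained in situ. All names/shapes are illustrative.
import torch

class SubspaceLinear(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Fixed random orthogonal matrices model the frozen unitaries.
        u, _ = torch.linalg.qr(torch.randn(dim, dim))
        v, _ = torch.linalg.qr(torch.randn(dim, dim))
        self.register_buffer("U", u)
        self.register_buffer("V", v)
        # Only the singular values live in the trainable subspace.
        self.sigma = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x V diag(sigma) U^T, i.e., W x with W = U diag(sigma) V^T.
        return (x @ self.V) * self.sigma @ self.U.T

layer = SubspaceLinear(8)
opt = torch.optim.SGD([layer.sigma], lr=1e-2)  # U, V are never updated
x, target = torch.randn(4, 8), torch.randn(4, 8)
for _ in range(100):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(layer(x), target)
    loss.backward()
    opt.step()
```
Restricting updates to the diagonal keeps the number of in-situ-tunable parameters linear in the layer width, which is what makes on-chip gradient evaluation tractable; the full framework layers multi-level sparsity on top of this.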
Related papers
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve a 1.45-9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
- How Feature Learning Can Improve Neural Scaling Laws [86.9540615081759]
We develop a solvable model of neural scaling laws beyond the kernel limit.
We show how performance scales with model size, training time, and the total amount of available data.
arXiv Detail & Related papers (2024-09-26T14:05:32Z)
- Optical training of large-scale Transformers and deep neural networks with direct feedback alignment [48.90869997343841]
We experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform.
An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOps.
We study the compute scaling of our hybrid optical approach and demonstrate a potential advantage for ultra-deep and wide neural networks (a sketch of direct feedback alignment appears after this list).
arXiv Detail & Related papers (2024-09-01T12:48:47Z) - DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Real-Time fJ/MAC PDE Solvers via Tensorized, Back-Propagation-Free Optical PINN Training [5.809283001227614]
This paper develops an on-chip training framework for physics-informed neural networks (PINNs).
It aims to solve high-dimensional PDEs with fJ/MAC photonic power consumption and ultra-low latency.
This is the first real-size optical PINN training framework that can be applied to solve high-dimensional PDEs.
arXiv Detail & Related papers (2023-12-31T07:10:15Z)
- On-Device Learning with Binary Neural Networks [2.7040098749051635]
We propose a continual learning (CL) solution that embraces recent advancements in the CL field and the efficiency of Binary Neural Networks (BNNs).
The choice of a binary network as the backbone is essential to meet the constraints of low-power devices (a sketch of binarized training appears after this list).
arXiv Detail & Related papers (2023-08-29T13:48:35Z)
- Symbolic Learning to Optimize: Towards Interpretability and Scalability [113.23813868412954]
Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks.
Existing L2O models parameterize optimization rules by neural networks, and learn those numerical rules via meta-training.
In this paper, we establish a holistic symbolic representation and analysis framework for L2O.
We propose a lightweight L2O model that can be meta-trained on large-scale problems and outperforms human-designed and tuned optimizers.
arXiv Detail & Related papers (2022-03-13T06:04:25Z)
- Silicon photonic subspace neural chip for hardware-efficient deep learning [11.374005508708995]
The optical neural network (ONN) is a promising candidate for next-generation neurocomputing.
We devise a hardware-efficient photonic subspace neural network architecture.
We experimentally demonstrate our PSNN on a butterfly-style programmable silicon photonic integrated circuit (a sketch of a butterfly-style transform appears after this list).
arXiv Detail & Related papers (2021-11-11T06:34:05Z)
- 3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration [8.419854797930668]
Deep neural network (DNN) based AI applications on the edge require both low-cost computing platforms and high-quality services.
This paper emphasizes the importance of training, quantization and accelerator design, and calls for more research breakthroughs in the area for AI on the edge.
arXiv Detail & Related papers (2021-05-11T03:22:30Z)
- Efficient On-Chip Learning for Optical Neural Networks Through Power-Aware Sparse Zeroth-Order Optimization [12.052076188811052]
Optical neural networks (ONNs) have demonstrated record-breaking potential in neuromorphic computing.
We propose a novel on-chip learning framework to release the full potential of ONNs for power-efficient in situ training (a sketch of sparse zeroth-order optimization appears after this list).
arXiv Detail & Related papers (2020-12-21T07:00:39Z)
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
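For the optical direct-feedback-alignment entry above, here is a minimal NumPy sketch of the algorithm (Nøkland, 2016): the output error is projected to each hidden layer through a fixed random matrix, and these large random matrix multiplications are exactly the operation the optical processing unit accelerates. The network sizes and toy regression task are assumptions for illustration.
```python
# Hedged sketch of direct feedback alignment: error feedback flows
# through a FIXED random matrix B1 instead of the transposed weights.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_h, n_out, lr = 16, 64, 4, 0.05
W1 = rng.normal(0, 0.1, (n_in, n_h))
W2 = rng.normal(0, 0.1, (n_h, n_out))
B1 = rng.normal(0, 0.1, (n_out, n_h))  # fixed random feedback matrix

def tanh_deriv(a):
    return 1.0 - np.tanh(a) ** 2

x = rng.normal(size=(32, n_in))
y = rng.normal(size=(32, n_out))
for _ in range(200):
    a1 = x @ W1            # hidden pre-activation
    h1 = np.tanh(a1)
    y_hat = h1 @ W2        # linear output layer
    e = y_hat - y          # output error (MSE gradient)
    # DFA: replace the backprop term e @ W2.T with a random projection.
    delta1 = (e @ B1) * tanh_deriv(a1)
    W2 -= lr * h1.T @ e / len(x)
    W1 -= lr * x.T @ delta1 / len(x)
```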
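For the on-device binary-neural-network entry, a common way to train binary weights is the straight-through estimator: binarize in the forward pass, pass clipped gradients through in the backward pass. The sketch below is a generic illustration of that trick, not the cited paper's implementation.
```python
# Hedged sketch of binarized training with a straight-through estimator.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)  # {-1, +1} weights in the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Straight-through: pass gradients where |w| <= 1, clip elsewhere.
        return grad_out * (w.abs() <= 1).float()

w = torch.randn(8, 4, requires_grad=True)
x, target = torch.randn(16, 8), torch.randn(16, 4)
opt = torch.optim.SGD([w], lr=0.1)
for _ in range(100):
    opt.zero_grad()
    y = x @ BinarizeSTE.apply(w)   # binary weights in the forward pass
    torch.nn.functional.mse_loss(y, target).backward()
    opt.step()                     # latent full-precision w is updated
```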
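For the photonic subspace neural chip entry, a butterfly-style transform replaces a dense n x n matrix with log2(n) stages of 2x2 mixing between FFT-style index pairs, cutting parameters to O(n log n). The NumPy sketch below uses Givens rotations as the 2x2 blocks; the pairing pattern and parameterization are illustrative assumptions, not the chip's actual programming model.
```python
# Hedged sketch of a butterfly-style linear transform.
import numpy as np

def butterfly_apply(x, thetas):
    """Apply log2(n) stages of disjoint 2x2 rotations; x has shape (n,)."""
    n = x.size
    y = x.astype(float).copy()
    stage, span = 0, 1
    while span < n:
        for i in range(n):
            j = i ^ span  # FFT-style partner index
            if i < j:     # rotate each pair once; thetas[stage, j] unused
                c, s = np.cos(thetas[stage, i]), np.sin(thetas[stage, i])
                y[i], y[j] = c * y[i] - s * y[j], s * y[i] + c * y[j]
        span *= 2
        stage += 1
    return y

n = 8
thetas = np.random.default_rng(0).uniform(0, 2 * np.pi, (int(np.log2(n)), n))
print(butterfly_apply(np.ones(n), thetas))
```
Each stage is a product of disjoint Givens rotations and hence orthogonal, which mirrors how cascaded 2x2 photonic couplers compose into a larger unitary.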
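For the power-aware sparse zeroth-order entry, the core trick can be illustrated with an SPSA-style estimator: two forward (on-chip) loss queries under a sparse random perturbation yield a directional gradient estimate, so only a few phase shifters need reprogramming per step. The loss function, sparsity level, and step sizes below are illustrative assumptions.
```python
# Hedged sketch of sparse zeroth-order (SPSA-style) optimization.
import numpy as np

rng = np.random.default_rng(0)

def loss(theta):
    # Stand-in for an on-chip measured objective.
    return np.sum((theta - 1.0) ** 2)

theta, mu, lr, k = rng.normal(size=64), 1e-3, 0.1, 8
for _ in range(500):
    # Sparse Rademacher perturbation: touch only k of the 64 parameters.
    idx = rng.choice(theta.size, size=k, replace=False)
    delta = np.zeros_like(theta)
    delta[idx] = rng.choice([-1.0, 1.0], size=k)
    # Two forward queries give a directional derivative estimate.
    g_scalar = (loss(theta + mu * delta) - loss(theta - mu * delta)) / (2 * mu)
    theta -= lr * g_scalar * delta
```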