What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips
- URL: http://arxiv.org/abs/2505.05794v1
- Date: Fri, 09 May 2025 05:19:14 GMT
- Title: What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips
- Authors: Renjie Li, Wenjie Wei, Qi Xin, Xiaoli Liu, Sixuan Mao, Erik Ma, Zijian Chen, Malu Zhang, Haizhou Li, Zhaoyu Zhang
- Abstract summary: Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. This review surveys emerging photonic hardware optimized for next-generation generative AI computing.
- Score: 34.52960723566363
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. For example, training GPT-3 has been estimated to consume around 1300 MWh of electricity, and projections suggest future models may require city-scale (gigawatt) power budgets. These demands motivate exploration of computing paradigms beyond conventional von Neumann architectures. This review surveys emerging photonic hardware optimized for next-generation generative AI computing. We discuss integrated photonic neural network architectures (e.g., Mach-Zehnder interferometer meshes, lasers, wavelength-multiplexed microring resonators) that perform ultrafast matrix operations. We also examine promising alternative neuromorphic devices, including spiking neural network circuits and hybrid spintronic-photonic synapses, which combine memory and processing. The integration of two-dimensional materials (graphene, TMDCs) into silicon photonic platforms is reviewed for tunable modulators and on-chip synaptic elements. Transformer-based LLM architectures (self-attention and feed-forward layers) are analyzed in this context, identifying strategies and challenges for mapping dynamic matrix multiplications onto these novel hardware substrates. We then dissect the mechanisms of mainstream LLMs, such as ChatGPT, DeepSeek, and LLaMA, highlighting their architectural similarities and differences. We synthesize state-of-the-art components, algorithms, and integration methods, highlighting key advances and open issues in scaling such systems to mega-sized LLM models. We find that photonic computing systems could potentially surpass electronic processors by orders of magnitude in throughput and energy efficiency, but require breakthroughs in memory, especially for long-context windows and long token sequences, and in storage of ultra-large datasets.
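To make the mapping concrete: a transformer's self-attention and feed-forward layers reduce to dense matrix-vector products, and an MZI mesh can realize an arbitrary weight matrix through its singular value decomposition W = U S V^H, with the two unitaries programmed into interferometer meshes and S into per-channel attenuators. The NumPy sketch below is an illustrative emulation of that decomposition, not an implementation from the paper; the function name `photonic_linear` is hypothetical.

```python
import numpy as np

def photonic_linear(W, x):
    """Emulate a photonic matrix-vector product W @ x.

    W is factored as W = U @ diag(s) @ Vh (SVD). On an MZI-mesh
    processor, U and Vh would be programmed into interferometer
    meshes and diag(s) into per-channel attenuators; NumPy stands
    in for the optics here.
    """
    U, s, Vh = np.linalg.svd(W)      # offline calibration / programming step
    y = Vh @ x                       # first interferometer mesh
    y = s * y[: s.size]              # diagonal attenuation stage
    return U[:, : s.size] @ y        # second interferometer mesh

# quick consistency check against an electronic matmul
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))       # e.g. one attention head's projection
x = rng.normal(size=128)
assert np.allclose(photonic_linear(W, x), W @ x)
```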
Related papers
- A Hardware-Efficient Photonic Tensor Core: Accelerating Deep Neural Networks with Structured Compression [15.665630650382226]
AI and deep neural networks (DNNs) have revolutionized numerous fields, enabling complex tasks by extracting intricate features from large datasets. Optical computing offers a promising alternative due to its inherent advantages of parallelism, high computational speed, and low power consumption. We introduce a block-circulant photonic tensor core (CirPTC) for a structure-compressed optical neural network (StrC-ONN) architecture.
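As background for the structured compression above, a block-circulant layer stores each k-by-k block as a single length-k vector and computes its matrix-vector product through the FFT identity C x = IFFT(FFT(c) * FFT(x)). The sketch below illustrates that identity in NumPy under the assumption of square blocks; it is not the authors' CirPTC implementation, and the function names are hypothetical.

```python
import numpy as np

def circulant_matvec(c, x):
    """Product of the circulant matrix whose first column is c with x,
    via the convolution theorem: C @ x = ifft(fft(c) * fft(x))."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def block_circulant_matvec(blocks, x, k):
    """blocks[i][j] holds the defining (first-column) vector of the
    (i, j) circulant block; x is split into length-k segments."""
    segs = x.reshape(-1, k)
    out = np.zeros(len(blocks) * k)
    for i, row in enumerate(blocks):
        for j, c in enumerate(row):
            out[i * k:(i + 1) * k] += circulant_matvec(c, segs[j])
    return out
```

Each k-by-k block is thus represented by k parameters instead of k*k, which is the kind of storage saving that structure-compressed optical weight banks aim to exploit.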
arXiv Detail & Related papers (2025-02-01T17:03:45Z)
- Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference [2.9302211589186244]
Large language models (LLMs) have transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations.
Developments in computing and memory capabilities are lagging behind, exacerbated by the slowdown of Moore's law.
Compute-in-memory (CIM) technologies offer a promising solution for accelerating AI inference by directly performing analog computations in memory.
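As a toy picture of the CIM idea (an illustrative assumption, not a model of any specific chip surveyed), the sketch below treats a resistive crossbar as a quantized conductance matrix: applying the input as voltages and reading the column currents performs the whole multiply-accumulate in a single analog step.

```python
import numpy as np

def crossbar_matvec(W, x, bits=4):
    """Toy model of an analog compute-in-memory crossbar.

    Weights are quantized to `bits` levels and stored as conductances;
    Ohm's law and Kirchhoff's current law then perform the
    multiply-accumulate in one step: I = G @ V.
    """
    scale = np.abs(W).max() / (2 ** (bits - 1) - 1)
    G = np.round(W / scale)              # conductance programming (quantized)
    I = G @ x                            # analog MAC along the columns
    return I * scale                     # read-out / rescaling

# example: relative error introduced by 4-bit weight quantization
rng = np.random.default_rng(1)
W = rng.normal(size=(256, 256))
x = rng.normal(size=256)
err = np.linalg.norm(crossbar_matvec(W, x) - W @ x) / np.linalg.norm(W @ x)
print(f"relative error with 4-bit weights: {err:.3f}")
```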
arXiv Detail & Related papers (2024-06-12T16:57:58Z)
- Meent: Differentiable Electromagnetic Simulator for Machine Learning [0.6902278820907753]
Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures.
Meent is EM simulation software that employs rigorous coupled-wave analysis (RCWA).
We present three applications of Meent: 1) generating a dataset for training a neural operator, 2) serving as an environment for reinforcement learning of nanophotonic device optimization, and 3) providing a solution for inverse problems via gradient-based optimization.
arXiv Detail & Related papers (2024-06-11T10:00:06Z)
- Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design achieves substantial energy-efficiency improvements and training-cost reductions compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Hyperspectral In-Memory Computing with Optical Frequency Combs and Programmable Optical Memories [0.0]
Machine learning has amplified the demand for extensive matrix-vector multiplication operations.
We propose a hyperspectral in-memory computing architecture that integrates space multiplexing with frequency multiplexing of optical frequency combs.
We have experimentally demonstrated multiply-accumulate operations with higher than 4-bit precision in both matrix-vector and matrix-matrix multiplications.
arXiv Detail & Related papers (2023-10-17T06:03:45Z)
- Interleaving: Modular architectures for fault-tolerant photonic quantum computing [50.591267188664666]
Photonic fusion-based quantum computing (FBQC) uses low-loss photonic delays.
We present a modular architecture for FBQC in which these components are combined to form "interleaving modules".
Exploiting the multiplicative power of delays, each module can add thousands of physical qubits to the computational Hilbert space.
arXiv Detail & Related papers (2021-03-15T18:00:06Z)
- Reservoir Computing with Magnetic Thin Films [35.32223849309764]
New unconventional computing hardware has emerged with the potential to exploit natural phenomena and gain efficiency.
Physical reservoir computing demonstrates this with a variety of unconventional systems.
We perform an initial exploration of three magnetic materials in thin-film geometries via microscale simulation.
arXiv Detail & Related papers (2021-01-29T17:37:17Z)
- Photonics for artificial intelligence and neuromorphic computing [52.77024349608834]
Photonic integrated circuits have enabled ultrafast artificial neural networks.
Photonic neuromorphic systems offer sub-nanosecond latencies.
These systems could address the growing demand for machine learning and artificial intelligence.
arXiv Detail & Related papers (2020-10-30T21:41:44Z)
- Rapid characterisation of linear-optical networks via PhaseLift [51.03305009278831]
Integrated photonics offers excellent phase stability and can rely on the large-scale manufacturability provided by the semiconductor industry.
New devices, based on such optical circuits, hold the promise of faster and energy-efficient computations in machine learning applications.
We present a novel technique to reconstruct the transfer matrix of linear optical networks.
arXiv Detail & Related papers (2020-10-01T16:04:22Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of Boston house-price prediction and of training a 2-layer neural network for MNIST digit recognition; a digital counterpart of the one-step solution is sketched after this list.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
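The one-step learning referenced in the last entry above has a simple digital counterpart: the least-squares weights in closed form via the pseudo-inverse, which a crosspoint resistive array can evaluate physically in a single parallel analog operation. The sketch below is illustrative only, using synthetic data as a stand-in for the Boston housing set.

```python
import numpy as np

# synthetic stand-in for a regression dataset (e.g. house prices)
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 13))                 # 13 features per sample
w_true = rng.normal(size=13)
y = X @ w_true + 0.05 * rng.normal(size=200)   # noisy targets

# "one-step" learning: the least-squares weights in closed form.
# A crosspoint resistive array can compute the equivalent of this
# pseudo-inverse solution in a single parallel analog operation.
w_hat = np.linalg.pinv(X) @ y

print("weight recovery error:", np.linalg.norm(w_hat - w_true))
```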