Exploring the Potential of Wireless-enabled Multi-Chip AI Accelerators
- URL: http://arxiv.org/abs/2501.17567v1
- Date: Wed, 29 Jan 2025 11:00:09 GMT
- Title: Exploring the Potential of Wireless-enabled Multi-Chip AI Accelerators
- Authors: Emmanuel Irabor, Mariam Musavi, Abhijit Das, Sergi Abadal
- Abstract summary: We show that wireless interconnects can lead to speedups of 10% on average and up to 20% at maximum.
We highlight the importance of load balancing between the wired and wireless interconnects.
- Score: 2.2305608711864555
- Abstract: The insatiable appetite of Artificial Intelligence (AI) workloads for computing power is pushing the industry to develop faster and more efficient accelerators. The rigidity of custom hardware, however, conflicts with the need for scalable and versatile architectures capable of catering to the needs of the evolving and heterogeneous pool of Machine Learning (ML) models in the literature. In this context, multi-chiplet architectures assembling multiple (perhaps heterogeneous) accelerators are an appealing option that is unfortunately hindered by the still rigid and inefficient chip-to-chip interconnects. In this paper, we explore the potential of wireless technology as a complement to existing wired interconnects in this multi-chiplet approach. Using an evaluation framework from the state-of-the-art, we show that wireless interconnects can lead to speedups of 10% on average and 20% maximum. We also highlight the importance of load balancing between the wired and wireless interconnects, which will be further explored in future work.
Related papers
- Current Opinions on Memristor-Accelerated Machine Learning Hardware [6.670055193544993]
This manuscript reviews the current status of memristor-based machine learning accelerators.
It discusses our opinion on current key challenges that remain in this field, such as device variation, the need for efficient peripheral circuitry, and systematic co-design and optimization.
Memristor-based accelerators could significantly advance the capabilities of AI hardware, particularly for edge applications where power efficiency is paramount.
arXiv Detail & Related papers (2025-01-22T05:10:47Z)
- SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators [12.416683044819955]
Multi-model workloads with heavy models like recent large language models significantly increased the compute and memory demands on hardware.
To address such increasing demands, designing a scalable hardware architecture became a key problem.
We develop a set of schedulers to navigate the huge scheduling space and codify them into a scheduler, SCAR, with advanced techniques such as inter-chiplet pipelining.
arXiv Detail & Related papers (2024-05-01T18:02:25Z)
- Artificial Intelligence Empowered Multiple Access for Ultra Reliable and Low Latency THz Wireless Networks [76.89730672544216]
Terahertz (THz) wireless networks are expected to catalyze the beyond fifth generation (B5G) era.
To satisfy the ultra-reliability and low-latency demands of several B5G applications, novel mobility management approaches are required.
This article presents a holistic MAC layer approach that enables intelligent user association and resource allocation, as well as flexible and adaptive mobility management.
arXiv Detail & Related papers (2022-08-17T03:00:24Z)
- Pervasive Machine Learning for Smart Radio Environments Enabled by Reconfigurable Intelligent Surfaces [56.35676570414731]
The emerging technology of Reconfigurable Intelligent Surfaces (RISs) is provisioned as an enabler of smart wireless environments.
RISs offer a highly scalable, low-cost, hardware-efficient, and almost energy-neutral solution for dynamic control of the propagation of electromagnetic signals over the wireless medium.
One of the major challenges with the envisioned dense deployment of RISs in such reconfigurable radio environments is the efficient configuration of multiple metasurfaces.
arXiv Detail & Related papers (2022-05-08T06:21:33Z)
- Multicore Quantum Computing [0.0]
We explore interlinked multicore architectures through analytic and numerical modelling.
We model shuttling and microwave-based interlinks and estimate the achievable fidelities, finding values that are encouraging but markedly inferior to intra-core operations.
We then assess the prospects for quantum advantage using such devices in the NISQ-era and beyond.
arXiv Detail & Related papers (2022-01-21T19:00:15Z)
- Collaborative Learning over Wireless Networks: An Introductory Overview [84.09366153693361]
We will mainly focus on collaborative training across wireless devices.
Many distributed optimization algorithms have been developed over the last decades.
They provide data locality; that is, a joint model can be trained collaboratively while the data available at each participating device remains local.
arXiv Detail & Related papers (2021-12-07T20:15:39Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Towards Hybrid Classical-Quantum Computation Structures in Wirelessly-Networked Systems [6.63697821097848]
This paper explores the boundary between the two types of computation: classical-quantum hybrid processing for optimization problems in wireless systems.
We explore the feasibility of a hybrid system with a real hardware prototype using one of the most advanced experimentally available techniques.
arXiv Detail & Related papers (2020-10-01T21:00:12Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)