WDMoE: Wireless Distributed Large Language Models with Mixture of Experts
- URL: http://arxiv.org/abs/2405.03131v1
- Date: Mon, 6 May 2024 02:55:50 GMT
- Title: WDMoE: Wireless Distributed Large Language Models with Mixture of Experts
- Authors: Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Ping Zhang
- Abstract summary: We propose a wireless distributed Large Language Model (LLM) paradigm based on Mixture of Experts (MoE), named WDMoE.
We decompose the MoE layer in LLMs by deploying the gating network and the preceding neural network layer at the base station (BS), while distributing the expert networks across mobile devices.
We design an expert selection policy that takes into account both model performance and end-to-end latency.
- Score: 65.57581050707738
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have achieved significant success in various natural language processing tasks, but how wireless communications can support LLMs has not been extensively studied. In this paper, we propose a wireless distributed LLM paradigm based on Mixture of Experts (MoE), named WDMoE, which deploys LLMs collaboratively across the edge servers of base stations (BS) and mobile devices in a wireless communication system. Specifically, we decompose the MoE layer in LLMs by deploying the gating network and the preceding neural network layer at the BS, while distributing the expert networks across the devices. This arrangement leverages the parallel inference capabilities of the expert networks on distributed devices. Moreover, to overcome the instability of wireless communications, we design an expert selection policy that takes into account both model performance and end-to-end latency, which comprises both transmission delay and inference delay. Evaluations conducted across various LLMs and multiple datasets demonstrate that WDMoE not only outperforms existing models, such as Llama 2 with 70 billion parameters, but also significantly reduces end-to-end latency.
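The decomposition and the latency-aware routing described in the abstract can be sketched in a few lines. Below is a minimal, hypothetical PyTorch illustration, not the paper's published implementation: the class name `LatencyAwareGate`, the linear gating network, the `est_latency` input, and the additive penalty weighted by `latency_weight` are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class LatencyAwareGate(nn.Module):
    """Hypothetical sketch of WDMoE-style routing: the gating network runs
    at the BS, experts live on mobile devices, and expert selection trades
    off gating score against estimated end-to-end latency."""

    def __init__(self, hidden_dim: int, num_experts: int, top_k: int = 2,
                 latency_weight: float = 0.5):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, num_experts)  # gating network at the BS
        self.top_k = top_k
        self.latency_weight = latency_weight  # assumed trade-off hyperparameter

    def forward(self, x: torch.Tensor, est_latency: torch.Tensor):
        # x: (num_tokens, hidden_dim) output of the layer preceding the MoE
        # block, which the paper also keeps at the BS.
        # est_latency: (num_experts,) estimated end-to-end latency per expert,
        # i.e. wireless transmission delay plus on-device inference delay.
        scores = torch.softmax(self.gate(x), dim=-1)      # (tokens, experts)
        penalty = est_latency / est_latency.max()         # normalize to (0, 1]
        utility = scores - self.latency_weight * penalty  # broadcast over tokens
        top = utility.topk(self.top_k, dim=-1)            # pick k experts per token
        weights = torch.softmax(top.values, dim=-1)       # renormalize mixing weights
        return top.indices, weights


# Example routing step: 8 experts on devices with different link conditions.
gate = LatencyAwareGate(hidden_dim=512, num_experts=8)
tokens = torch.randn(16, 512)
latency = torch.tensor([0.03, 0.12, 0.05, 0.40, 0.07, 0.02, 0.09, 0.06])
expert_ids, mix_weights = gate(tokens, latency)  # slow expert 3 is disfavored
```

Subtracting a normalized latency penalty from the softmax scores is just one way to bias routing away from experts behind slow links; the paper's actual policy jointly weighs model performance and end-to-end latency and may take a different functional form.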
Related papers
- R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z) - Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks.
We propose a text-based generative IoT (GIoT) system deployed in the local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z) - Large Language Models (LLMs) Assisted Wireless Network Deployment in Urban Settings [0.21847754147782888]
Large Language Models (LLMs) have revolutionized language understanding and human-like text generation.
This paper explores new techniques to harness the power of LLMs for 6G (6th Generation) wireless communication technologies.
We introduce a novel Reinforcement Learning (RL) based framework that leverages LLMs for network deployment in wireless communications.
arXiv Detail & Related papers (2024-05-22T05:19:51Z) - NetLLM: Adapting Large Language Models for Networking [36.61572542761661]
We present NetLLM, the first framework that efficiently adapts large language models to solve networking problems.
We demonstrate the effectiveness of NetLLM in LLM adaptation for networking, and showcase that the adapted LLM significantly outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2024-02-04T04:21:34Z) - Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems [57.41621687431203]
Large language models (LLMs) and foundation models have been recently touted as a game-changer for 6G systems.
This paper presents a comprehensive vision on how to design universal foundation models tailored towards the deployment of artificial intelligence (AI)-native networks.
arXiv Detail & Related papers (2024-01-30T00:21:41Z) - LLM-Twin: Mini-Giant Model-driven Beyond 5G Digital Twin Networking Framework with Semantic Secure Communication and Computation [5.863586088644696]
We propose a large language model (LLM)-empowered digital twin networking (DTN) framework, LLM-Twin.
First, we design the mini-giant model collaboration scheme to achieve efficient deployment of LLM in DTNs.
Then, we design a semantic-level, high-efficiency, and secure communication model for DTNs.
arXiv Detail & Related papers (2023-12-17T07:13:59Z) - Quantized Federated Learning under Transmission Delay and Outage Constraints [30.892724364965005]
Federated learning is a viable distributed learning paradigm that collaboratively trains a machine learning model with a massive number of mobile devices at the wireless edge.
In practical systems with limited radio resources, transmission of a large number of model parameters inevitably suffers from quantization errors (QE) and transmission outage (TO).
We propose a robust FL scheme, named FedTOE, which performs joint allocation of wireless resources and quantization bits across the clients to minimize the QE while making the clients have the same TO probability.
arXiv Detail & Related papers (2021-06-17T11:29:12Z) - Distributed Learning in Wireless Networks: Recent Progress and Future Challenges [170.35951727508225]
Next-generation wireless networks will enable many machine learning (ML) tools and applications to analyze various types of data collected by edge devices.
Distributed learning and inference techniques have been proposed as a means to enable edge devices to collaboratively train ML models without raw data exchanges.
This paper provides a comprehensive study of how distributed learning can be efficiently and effectively deployed over wireless edge networks.
arXiv Detail & Related papers (2021-04-05T20:57:56Z) - Wireless for Machine Learning [91.13476340719087]
We give an exhaustive review of the state-of-the-art wireless methods that are specifically designed to support machine learning services over distributed datasets.
There are two clear themes within the literature: analog over-the-air computation and digital radio resource management optimized for ML.
This survey gives a comprehensive introduction to these methods, reviews the most important works, highlights open problems, and discusses application scenarios.
arXiv Detail & Related papers (2020-08-31T11:09:49Z)