LAMBO: Large AI Model Empowered Edge Intelligence
- URL: http://arxiv.org/abs/2308.15078v2
- Date: Sat, 3 Aug 2024 13:43:01 GMT
- Title: LAMBO: Large AI Model Empowered Edge Intelligence
- Authors: Li Dong, Feibo Jiang, Yubo Peng, Kezhi Wang, Kun Yang, Cunhua Pan, Robert Schober
- Abstract summary: Next-generation edge intelligence is anticipated to benefit various applications via offloading techniques.
Traditional offloading architectures face several issues, including heterogeneous constraints, partial perception, uncertain generalization, and lack of tractability.
We propose a Large AI Model-Based Offloading (LAMBO) framework with over one billion parameters for solving these problems.
- Score: 71.56135386994119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Next-generation edge intelligence is anticipated to benefit various applications via offloading techniques. However, traditional offloading architectures face several issues, including heterogeneous constraints, partial perception, uncertain generalization, and lack of tractability. In this paper, we propose a Large AI Model-Based Offloading (LAMBO) framework with over one billion parameters for solving these problems. We first use input embedding (IE) to achieve normalized feature representation with heterogeneous constraints and task prompts. Then, we introduce a novel asymmetric encoder-decoder (AED) as the decision-making model, which is an improved transformer architecture consisting of a deep encoder and a shallow decoder for global perception and decision. Next, actor-critic learning (ACL) is used to pre-train the AED for different optimization tasks under corresponding prompts, enhancing the AED's generalization in multi-task scenarios. Finally, we propose an active learning from expert feedback (ALEF) method to fine-tune the decoder of the AED for tracking changes in dynamic environments. Our simulation results validate the advantages of the proposed LAMBO framework.
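The core architectural idea in the abstract, a deep encoder for global perception paired with a shallow decoder for fast decision output, can be illustrated with a minimal sketch. This is not the authors' implementation; the layer counts, dimensions, and single-head attention below are illustrative assumptions, using plain NumPy in place of a deep-learning framework.

```python
# Illustrative sketch (not the LAMBO authors' code) of an asymmetric
# encoder-decoder (AED): many attention blocks in the encoder for
# global perception, few in the decoder for lightweight decisions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

class AsymmetricEncoderDecoder:
    """Deep encoder (enc_layers blocks) and shallow decoder (dec_layers)."""
    def __init__(self, dim, enc_layers=6, dec_layers=1, seed=0):
        rng = np.random.default_rng(seed)
        make = lambda: tuple(rng.standard_normal((dim, dim)) * 0.1
                             for _ in range(3))  # (w_q, w_k, w_v)
        self.enc = [make() for _ in range(enc_layers)]
        self.dec = [make() for _ in range(dec_layers)]

    def forward(self, x):
        for w in self.enc:                     # global perception
            x = x + self_attention(x, *w)      # residual connection
        for w in self.dec:                     # shallow decision head
            x = x + self_attention(x, *w)
        return x

model = AsymmetricEncoderDecoder(dim=16)
tokens = np.ones((8, 16))  # 8 embedded task/constraint tokens (toy input)
out = model.forward(tokens)
print(out.shape)  # (8, 16)
```

The asymmetry is the point: since the encoder runs once over the full task/constraint context while the decoder is invoked for every decision, keeping the decoder shallow (and fine-tuning only it, as the ALEF step does) reduces both inference latency and adaptation cost.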
Related papers
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches to hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- Large Language Models for Power Scheduling: A User-Centric Approach [6.335540414370735]
We introduce a novel architecture for resource scheduling problems by converting an arbitrary user's voice request (VRQ) into a resource allocation vector.
Specifically, we design an LLM intent recognition agent to translate the request into an optimization problem (OP), an LLM OP parameter identification agent, and an OP solving agent.
arXiv Detail & Related papers (2024-06-29T15:47:28Z)
- Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture [9.244633039170186]
We propose a plug-and-play Entropy-based Scaling Factor (ESF) and a Distribution-Specific (DS) decoder.
ESF adjusts the attention weight pattern of the model towards familiar ones discovered during training when solving VRPs of varying sizes.
DS decoder explicitly models VRPs of multiple training distribution patterns through multiple auxiliary light decoders, expanding the model representation space.
arXiv Detail & Related papers (2024-06-10T09:03:17Z)
- Exploiting the Layered Intrinsic Dimensionality of Deep Models for Practical Adversarial Training [31.495803865226158]
Adversarial Training (AT) is rarely, if ever, deployed in practical AI systems for two primary reasons.
AT causes a drop in generalization in vision models, whereas in encoder-based language models, generalization either improves or remains unchanged.
We show that SMAAT requires only 25-33% of the GPU time compared to standard AT, while significantly improving robustness across all applications.
arXiv Detail & Related papers (2024-05-27T12:48:30Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances using generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Training Latent Variable Models with Auto-encoding Variational Bayes: A Tutorial [0.0]
Auto-encoding Variational Bayes (AEVB) is a powerful and general algorithm for fitting latent variable models.
In this tutorial, we focus on motivating AEVB from the classic Expectation Maximization (EM) algorithm.
arXiv Detail & Related papers (2022-08-16T16:07:05Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new task-oriented multi-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications.
We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS).
Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
arXiv Detail & Related papers (2020-12-25T07:08:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.