Computing in the Era of Large Generative Models: From Cloud-Native to
AI-Native
- URL: http://arxiv.org/abs/2401.12230v1
- Date: Wed, 17 Jan 2024 20:34:11 GMT
- Title: Computing in the Era of Large Generative Models: From Cloud-Native to
AI-Native
- Authors: Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz,
Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin
Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang,
Qizhen Zhang, Tianyi Zhou, Danyang Zhuo
- Abstract summary: We describe an AI-native computing paradigm that harnesses the power of both cloud-native technologies and advanced machine learning inference.
These joint efforts aim to optimize costs-of-goods-sold (COGS) and improve resource accessibility.
- Score: 46.7766555589807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the intersection of large generative AI models
and cloud-native computing architectures. Recent large models such as ChatGPT,
while revolutionary in their capabilities, face challenges like escalating
costs and demand for high-end GPUs. Drawing analogies between
large-model-as-a-service (LMaaS) and cloud database-as-a-service (DBaaS), we
describe an AI-native computing paradigm that harnesses the power of both
cloud-native technologies (e.g., multi-tenancy and serverless computing) and
advanced machine learning runtime (e.g., batched LoRA inference). These joint
efforts aim to optimize costs-of-goods-sold (COGS) and improve resource
accessibility. The journey of merging these two domains is just at the
beginning and we hope to stimulate future research and development in this
area.
Related papers
- Cloud Platforms for Developing Generative AI Solutions: A Scoping Review of Tools and Services [0.27649989102029926]
Generative AI is transforming enterprise application development by enabling machines to create content, code, and designs.
Cloud computing addresses these needs by offering infrastructure to train, deploy, and scale generative AI models.
This review examines cloud services for generative AI, focusing on key providers like Amazon Web Services (AWS), Microsoft Azure, Google Cloud, IBM Cloud, Oracle Cloud, and Alibaba Cloud.
arXiv Detail & Related papers (2024-12-08T19:49:07Z)
- Seamless Optical Cloud Computing across Edge-Metro Network for Generative AI [11.50609298355243]
We propose and experimentally demonstrate an optical cloud computing system that can be seamlessly deployed across edge-metro network.
By modulating inputs and models into light, a wide range of edge nodes can directly access the optical computing center via the edge-metro network.
The experimental validations show an energy efficiency of 118.6 mW/TOPs (tera operations per second), reducing energy consumption by two orders of magnitude compared to traditional electronic-based cloud computing solutions.
arXiv Detail & Related papers (2024-12-04T11:49:13Z)
- Two-Timescale Model Caching and Resource Allocation for Edge-Enabled AI-Generated Content Services [55.0337199834612]
Generative AI (GenAI) has emerged as a transformative technology, enabling customized and personalized AI-generated content (AIGC) services.
These services require executing GenAI models with billions of parameters, posing significant obstacles to resource-limited wireless edge.
We introduce the formulation of joint model caching and resource allocation for AIGC services to balance a trade-off between AIGC quality and latency metrics.
arXiv Detail & Related papers (2024-11-03T07:01:13Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- SoTaNa: The Open-Source Software Development Assistant [81.86136560157266]
SoTaNa is an open-source software development assistant.
It generates high-quality instruction-based data for the domain of software engineering.
It employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA.
arXiv Detail & Related papers (2023-08-25T14:56:21Z)
- An Overview on Generative AI at Scale with Edge-Cloud Computing [28.98486923400986]
Generative artificial intelligence (GenAI) generates new content that resembles what is created by humans.
The rapid development of GenAI systems has created a huge amount of new data on the Internet.
It is attractive to build GenAI systems at scale by leveraging the edge-cloud computing paradigm.
arXiv Detail & Related papers (2023-06-02T06:24:15Z)
- Scalable, Distributed AI Frameworks: Leveraging Cloud Computing for
Enhanced Deep Learning Performance and Efficiency [0.0]
In recent years, the integration of artificial intelligence (AI) and cloud computing has emerged as a promising avenue for addressing the growing computational demands of AI applications.
This paper presents a comprehensive study of scalable, distributed AI frameworks leveraging cloud computing for enhanced deep learning performance and efficiency.
arXiv Detail & Related papers (2023-04-26T15:38:00Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review for both cloud and edge AI.
We are the first to set up the collaborative learning mechanism for cloud and edge modeling.
We discuss potentials and practical experiences of some on-going advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z)