Computing in the Era of Large Generative Models: From Cloud-Native to
AI-Native
- URL: http://arxiv.org/abs/2401.12230v1
- Date: Wed, 17 Jan 2024 20:34:11 GMT
- Title: Computing in the Era of Large Generative Models: From Cloud-Native to
AI-Native
- Authors: Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz,
Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin
Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang,
Qizhen Zhang, Tianyi Zhou, Danyang Zhuo
- Abstract summary: We describe an AI-native computing paradigm that harnesses the power of both cloud-native technologies and advanced machine learning inference.
These joint efforts aim to optimize costs-of-goods-sold (COGS) and improve resource accessibility.
- Score: 46.7766555589807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the intersection of large generative AI models
and cloud-native computing architectures. Recent large models such as ChatGPT,
while revolutionary in their capabilities, face challenges like escalating
costs and demand for high-end GPUs. Drawing analogies between
large-model-as-a-service (LMaaS) and cloud database-as-a-service (DBaaS), we
describe an AI-native computing paradigm that harnesses the power of both
cloud-native technologies (e.g., multi-tenancy and serverless computing) and
advanced machine learning runtime (e.g., batched LoRA inference). These joint
efforts aim to optimize costs-of-goods-sold (COGS) and improve resource
accessibility. The journey of merging these two domains is just at the
beginning and we hope to stimulate future research and development in this
area.
Related papers
- Two-Timescale Model Caching and Resource Allocation for Edge-Enabled AI-Generated Content Services [55.0337199834612]
Generative AI (GenAI) has emerged as a transformative technology, enabling customized and personalized AI-generated content (AIGC) services.
These services require executing GenAI models with billions of parameters, posing significant obstacles to resource-limited wireless edge.
We introduce the formulation of joint model caching and resource allocation for AIGC services to balance a trade-off between AIGC quality and latency metrics.
arXiv Detail & Related papers (2024-11-03T07:01:13Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- SoTaNa: The Open-Source Software Development Assistant [81.86136560157266]
SoTaNa is an open-source software development assistant.
It generates high-quality instruction-based data for the domain of software engineering.
It employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA.
arXiv Detail & Related papers (2023-08-25T14:56:21Z)
- An Overview on Generative AI at Scale with Edge-Cloud Computing [28.98486923400986]
Generative artificial intelligence (GenAI) generates new content that resembles what is created by humans.
The rapid development of GenAI systems has created a huge amount of new data on the Internet.
It is attractive to build GenAI systems at scale by leveraging the edge-cloud computing paradigm.
arXiv Detail & Related papers (2023-06-02T06:24:15Z)
- Scalable, Distributed AI Frameworks: Leveraging Cloud Computing for
Enhanced Deep Learning Performance and Efficiency [0.0]
In recent years, the integration of artificial intelligence (AI) and cloud computing has emerged as a promising avenue for addressing the growing computational demands of AI applications.
This paper presents a comprehensive study of scalable, distributed AI frameworks leveraging cloud computing for enhanced deep learning performance and efficiency.
arXiv Detail & Related papers (2023-04-26T15:38:00Z)
- Molecular Dynamics Simulations on Cloud Computing and Machine Learning
Platforms [0.8093262393618671]
We see a paradigm shift in the computational structure, design, and requirements of scientific computing applications.
Data-driven and machine learning approaches are being used to support, speed-up, and enhance scientific computing applications.
Cloud computing platforms are increasingly appealing for scientific computing.
arXiv Detail & Related papers (2021-11-11T21:20:26Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review of both cloud and edge AI.
We are the first to set up the collaborative learning mechanism for cloud and edge modeling.
We discuss the potential of, and practical experience with, several ongoing advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z)
- The MIT Supercloud Dataset [3.375826083518709]
We introduce the MIT Supercloud dataset which aims to foster innovative AI/ML approaches to the analysis of large scale HPC and datacenter/cloud operations.
We provide detailed monitoring logs from the MIT Supercloud system, which include CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data.
This paper describes the dataset, its collection methodology, and data availability, and discusses potential challenge problems being developed using this data.
arXiv Detail & Related papers (2021-08-04T13:06:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.