AI Factories: It's time to rethink the Cloud-HPC divide
- URL: http://arxiv.org/abs/2509.12849v1
- Date: Tue, 16 Sep 2025 09:08:05 GMT
- Title: AI Factories: It's time to rethink the Cloud-HPC divide
- Authors: Pedro Garcia Lopez, Daniel Barcelona Pons, Marcin Copik, Torsten Hoefler, Eduardo Quiñones, Maciej Malawski, Peter Pietzutch, Alberto Marti, Thomas Ohlson Timoudas, Aleksander Slominski
- Abstract summary: This article advocates for a dual-stack approach within supercomputers: integrating both HPC and cloud-native technologies. Our goal is to bridge the divide between HPC and cloud computing by combining high performance and hardware acceleration with ease of use and service-oriented front-ends.
- Score: 43.712212475089224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The strategic importance of artificial intelligence is driving a global push toward Sovereign AI initiatives. Nationwide governments are increasingly developing dedicated infrastructures, called AI Factories (AIF), to achieve technological autonomy and secure the resources necessary to sustain robust local digital ecosystems. In Europe, the EuroHPC Joint Undertaking is investing hundreds of millions of euros into several AI Factories, built atop existing high-performance computing (HPC) supercomputers. However, while HPC systems excel in raw performance, they are not inherently designed for usability, accessibility, or serving as public-facing platforms for AI services such as inference or agentic applications. In contrast, AI practitioners are accustomed to cloud-native technologies like Kubernetes and object storage, tools that are often difficult to integrate within traditional HPC environments. This article advocates for a dual-stack approach within supercomputers: integrating both HPC and cloud-native technologies. Our goal is to bridge the divide between HPC and cloud computing by combining high performance and hardware acceleration with ease of use and service-oriented front-ends. This convergence allows each paradigm to amplify the other. To this end, we will study the cloud challenges of HPC (Serverless HPC) and the HPC challenges of cloud technologies (High-performance Cloud).
Related papers
- AI+HW 2035: Shaping the Next Decade [135.53570243498987]
Artificial intelligence (AI) and hardware (HW) are advancing at unprecedented rates, yet their trajectories have become inseparably intertwined. This vision paper lays out a 10-year roadmap for AI+HW co-design and co-development, spanning algorithms, architectures, systems, and sustainability. We identify key challenges and opportunities, candidly assess potential obstacles and pitfalls, and propose integrated solutions.
arXiv Detail & Related papers (2026-03-05T14:36:33Z)
- Transforming the Hybrid Cloud for Emerging AI Workloads [82.21522417363666]
This white paper envisions transforming hybrid cloud systems to meet the growing complexity of AI workloads. The proposed framework addresses critical challenges in energy efficiency, performance, and cost-effectiveness. This joint initiative aims to establish hybrid clouds as secure, efficient, and sustainable platforms.
arXiv Detail & Related papers (2024-11-20T11:57:43Z)
- AI-Driven Innovations in Modern Cloud Computing [2.3931689873603594]
This paper explores how AI and cloud computing intersect to deliver transformative capabilities for modernizing applications.
Harnessing the combined potential of both AI & Cloud technologies, technology providers can now exploit intelligent resource management, predictive analytics, automated deployment & scaling.
arXiv Detail & Related papers (2024-10-21T12:45:10Z)
- Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native [46.7766555589807]
We describe an AI-native computing paradigm that harnesses the power of both cloud-native technologies and advanced machine learning inference.
These joint efforts aim to optimize costs-of-goods-sold (COGS) and improve resource accessibility.
arXiv Detail & Related papers (2024-01-17T20:34:11Z)
- Green Edge AI: A Contemporary Survey [46.11332733210337]
The transformative power of AI is derived from the utilization of deep neural networks (DNNs).
Deep learning (DL) is increasingly being transitioned to wireless edge networks in proximity to end-user devices (EUDs).
Despite its potential, edge AI faces substantial challenges, mostly due to the dichotomy between the resource limitations of wireless edge networks and the resource-intensive nature of DL.
arXiv Detail & Related papers (2023-12-01T04:04:37Z)
- One nine availability of a Photonic Quantum Computer on the Cloud toward HPC integration [0.8961191069175432]
In November 2022, we introduced the first cloud-accessible general-purpose quantum computer based on single photons.
We describe the design and implementation of our cloud-accessible quantum computing platform, and demonstrate one nine availability (92%) for external users during a six-month period, higher than most online services.
This work lays the foundation for advancing quantum computing accessibility and usability in hybrid HPC-QC infrastructures.
arXiv Detail & Related papers (2023-08-28T13:47:39Z)
- Future Computer Systems and Networking Research in the Netherlands: A Manifesto [137.47124933818066]
We draw attention to CompSys as a vital part of ICT.
Each of the Top Sectors of the Dutch Economy, each route in the National Research Agenda, and each of the UN Sustainable Development Goals pose challenges that cannot be addressed without CompSys advances.
arXiv Detail & Related papers (2022-05-26T11:02:29Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review for both cloud and edge AI.
We are the first to set up the collaborative learning mechanism for cloud and edge modeling.
We discuss potentials and practical experiences of some on-going advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Artificial Intelligence at the Edge [25.451110446336276]
5G mobile communication networks increase communication capacity, reduce transmission latency and error, and save energy.
The envisioned future 6G technology will integrate many more technologies, including for example visible light communication.
Many applications require computations and analytics close to application end-points: that is, at the edge of the network, rather than in a centralized cloud.
arXiv Detail & Related papers (2020-12-10T02:08:47Z)