Scalable Cosmic AI Inference using Cloud Serverless Computing with FMI
- URL: http://arxiv.org/abs/2501.06249v2
- Date: Sun, 09 Feb 2025 14:54:24 GMT
- Title: Scalable Cosmic AI Inference using Cloud Serverless Computing with FMI
- Authors: Mills Staylor, Amirreza Dolatpour Fathkouhi, Md Khairul Islam, Kaleigh O'Hara, Ryan Ghiles Goudjil, Geoffrey Fox, Judy Fox,
- Abstract summary: Large-scale astronomical image data processing and prediction is essential for astronomers.<n>Modern deep learning models offer high predictive accuracy, but they often demand substantial computational resources.<n>We introduce the Cloud-based Astronomy Inference framework to address these challenges.
- Score: 0.35337216626844875
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large-scale astronomical image data processing and prediction is essential for astronomers, providing crucial insights into celestial objects, the universe's history, and its evolution. While modern deep learning models offer high predictive accuracy, they often demand substantial computational resources, making them resource-intensive and limiting accessibility. We introduce the Cloud-based Astronomy Inference (CAI) framework to address these challenges. This scalable solution integrates pre-trained foundation models with serverless cloud infrastructure through a Function-as-a-Service (FaaS) Message Interface (FMI). CAI enables efficient and scalable inference on astronomical images without extensive hardware. Using a foundation model for redshift prediction as a case study, our extensive experiments cover user devices, HPC (High-Performance Computing) servers, and Cloud. CAI's significant scalability improvement on large data sizes provides an accessible and effective tool for the astronomy community. The code is accessible at https://github.com/UVA-MLSys/AI-for-Astronomy.
Related papers
- Towards a future space-based, highly scalable AI infrastructure system design [1.6764108000740328]
This work explores a scalable compute system for machine learning in space.<n>It uses fleets of satellites equipped with solar arrays, inter-satellite links using free-space optics, and Google tensor processing unit (TPU) accelerator chips.<n>We illustrate the basic approach to formation flight via a 81-satellite cluster of 1 km radius, and describe an approach for using high-precision ML-based models to control large-scale constellations.
arXiv Detail & Related papers (2025-11-22T00:28:08Z) - AstroMAE: Redshift Prediction Using a Masked Autoencoder with a Novel Fine-Tuning Architecture [0.6906005491572401]
We introduce AstroMAE, an innovative approach that pretrains a vision transformer encoder using a masked autoencoder method.
This technique enables the encoder to capture the global patterns within the data without relying on labels.
We evaluate our model against various vision transformer architectures and CNN-based models.
arXiv Detail & Related papers (2024-09-03T12:12:37Z) - NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks.<n>NNsight is an open-source system that extends PyTorch to introduce deferred remote execution.<n>NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models.
arXiv Detail & Related papers (2024-07-18T17:59:01Z) - Distributed computing for physics-based data-driven reduced modeling at scale: Application to a rotating detonation rocket engine [0.0]
This paper contributes a distributed memory algorithm that achieves fast and scalable construction of predictive physics-based ROMs trained from sparse datasets of extremely large state dimension.<n>We focus on a real-world three-dimensional RDRE for which one millisecond of simulated physical time requires one million core hours on a supercomputer.
arXiv Detail & Related papers (2024-07-13T20:17:41Z) - Computing in the Era of Large Generative Models: From Cloud-Native to
AI-Native [46.7766555589807]
We describe an AI-native computing paradigm that harnesses the power of both cloudnative technologies and advanced machine learning inference.
These joint efforts aim to optimize costs-of-goods-sold (COGS) and improve resource accessibility.
arXiv Detail & Related papers (2024-01-17T20:34:11Z) - Self-Driving Telescopes: Autonomous Scheduling of Astronomical
Observation Campaigns with Offline Reinforcement Learning [0.6976905094072174]
We use simulated data to test and compare multiple implementations of a Deep Q-Network (DQN) for the task of optimizing the schedule of observations from the Stone Edge Observatory (SEO)
We show that DQNs can achieve an average reward of 87%+-6% of the maximum achievable reward in each state on the test set.
This is the first comparison of offline RL algorithms for a particular astronomical challenge and the first open-source framework for performing such a comparison and assessment task.
arXiv Detail & Related papers (2023-11-29T21:23:30Z) - Pushing the Limits of Pre-training for Time Series Forecasting in the
CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z) - Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems.
We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
arXiv Detail & Related papers (2023-10-04T20:27:20Z) - The Tiny Time-series Transformer: Low-latency High-throughput
Classification of Astronomical Transients using Deep Model Compression [4.960046610835999]
The upcoming Legacy Survey of Space and Time (LSST) will raise the big-data bar for time-domain astronomy.
We show how the use of modern deep compression methods can achieve a $18times$ reduction in model size.
We also show that in addition to the deep compression techniques, careful choice of file formats can improve inference latency.
arXiv Detail & Related papers (2023-03-15T21:46:35Z) - Satellite Image Time Series Analysis for Big Earth Observation Data [50.591267188664666]
This paper describes sits, an open-source R package for satellite image time series analysis using machine learning.
We show that this approach produces high accuracy for land use and land cover maps through a case study in the Cerrado biome.
arXiv Detail & Related papers (2022-04-24T15:23:25Z) - The MIT Supercloud Workload Classification Challenge [10.458111248130944]
In this paper, we present a workload classification challenge based on the MIT Supercloud dataset.
The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads.
arXiv Detail & Related papers (2022-04-12T14:28:04Z) - Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge
TPU [58.720142291102135]
In this paper we propose a pose estimation software exploiting neural network architectures.
We show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.
arXiv Detail & Related papers (2022-04-07T08:53:18Z) - The MIT Supercloud Dataset [3.375826083518709]
We introduce the MIT Supercloud dataset which aims to foster innovative AI/ML approaches to the analysis of large scale HPC and datacenter/cloud operations.
We provide detailed monitoring logs from the MIT Supercloud system, which include CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data.
This paper discusses the details of the dataset, collection methodology, data availability, and discusses potential challenge problems being developed using this data.
arXiv Detail & Related papers (2021-08-04T13:06:17Z) - First Full-Event Reconstruction from Imaging Atmospheric Cherenkov
Telescope Real Data with Deep Learning [55.41644538483948]
The Cherenkov Telescope Array is the future of ground-based gamma-ray astronomy.
Its first prototype telescope built on-site, the Large Size Telescope 1, is currently under commissioning and taking its first scientific data.
We present for the first time the development of a full-event reconstruction based on deep convolutional neural networks and its application to real data.
arXiv Detail & Related papers (2021-05-31T12:51:42Z) - Cost-effective Machine Learning Inference Offload for Edge Computing [0.3149883354098941]
This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources.
The proposed mechanism allows the edge devices to offload heavy and compute-intensive workloads to edge nodes instead of using remote cloud.
arXiv Detail & Related papers (2020-12-07T21:11:02Z) - Lightweight Single-Image Super-Resolution Network with Attentive
Auxiliary Feature Learning [73.75457731689858]
We develop a computation efficient yet accurate network based on the proposed attentive auxiliary features (A$2$F) for SISR.
Experimental results on large-scale dataset demonstrate the effectiveness of the proposed model against the state-of-the-art (SOTA) SR methods.
arXiv Detail & Related papers (2020-11-13T06:01:46Z) - Superiority of Simplicity: A Lightweight Model for Network Device
Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations.
It consists of a heterogeneous ensemble method composed of two models - a neural network and a mean predictor.
It achieves an overall $R2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z) - A Privacy-Preserving Distributed Architecture for
Deep-Learning-as-a-Service [68.84245063902908]
This paper introduces a novel distributed architecture for deep-learning-as-a-service.
It is able to preserve the user sensitive data while providing Cloud-based machine and deep learning services.
arXiv Detail & Related papers (2020-03-30T15:12:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.