Scalable Deployment of AI Time-series Models for IoT
- URL: http://arxiv.org/abs/2003.12141v1
- Date: Tue, 24 Mar 2020 14:27:25 GMT
- Title: Scalable Deployment of AI Time-series Models for IoT
- Authors: Bradley Eck, Francesco Fusco, Robert Gormally, Mark Purcell, Seshu Tirupathi
- Abstract summary: IBM Research Castor is a cloud-native system for managing and deploying time-series models in IoT applications.
Model templates can be deployed against specific instances of semantic concepts.
Results from deployments in real-world smart-grid live forecasting applications are reported.
- Score: 0.7169734491710924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: IBM Research Castor, a cloud-native system for managing and deploying large
numbers of AI time-series models in IoT applications, is described. Modelling
code templates in Python and R, following a typical machine-learning workflow,
are supported. A knowledge-based approach to managing model and time-series
data allows the use of general semantic concepts for expressing feature
engineering tasks. Model templates can be programmatically deployed against
specific instances of semantic concepts, thus supporting model reuse and
automated replication as the IoT application grows. Deployed models are
automatically executed in parallel leveraging a serverless cloud computing
framework. The complete history of trained model versions and rolling-horizon
predictions is persisted, thus enabling full model lineage and traceability.
Results from deployments in real-world smart-grid live forecasting applications
are reported. Scalability of executing up to tens of thousands of AI modelling
tasks is also evaluated.
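As an illustration of the programmatic deployment pattern described in the abstract, the sketch below deploys one model template against every instance of a semantic concept. All names here (the `CastorClient` class, its methods, and the example concepts) are hypothetical stand-ins for illustration, not the actual IBM Research Castor API.

```python
# Hypothetical sketch: deploying one model template against many instances of
# a semantic concept. All names are illustrative stand-ins, not the actual
# IBM Research Castor API.
from typing import List


class CastorClient:
    """Stand-in client for a Castor-like model-management service."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def list_entities(self, concept: str) -> List[str]:
        # A real system would query the knowledge base for every instance of
        # the semantic concept; here we return a fixed placeholder list.
        return ["substation_A", "substation_B", "substation_C"]

    def deploy_model(self, template: str, target: str, signal: str) -> str:
        # A real system would register the template (Python or R code) against
        # the target entity and schedule its serverless execution.
        return f"{template}::{target}::{signal}"


client = CastorClient("https://castor.example.invalid/api")

# Reuse the same forecasting template for every substation: model reuse and
# automated replication as the IoT application grows.
for substation in client.list_entities(concept="Substation"):
    deployment_id = client.deploy_model(
        template="load_forecast_template.py",
        target=substation,
        signal="active_power",
    )
    print("deployed", deployment_id)
```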
Related papers
- Model Callers for Transforming Predictive and Generative AI Applications [2.7195102129095003]
We introduce a novel software abstraction termed a "model caller".
Model callers act as an intermediary for AI and ML model calling.
We have released a prototype Python library for model callers, accessible for installation via pip or for download from GitHub.
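The released library's actual interface is not shown in this summary, so the following is only a hedged sketch of the general intermediary idea: a caller object sits between application code and a model, handling pre/post-processing and keeping a simple call log. The class and method names are hypothetical.

```python
# Hedged sketch of a "model caller" intermediary; names are hypothetical and
# do not reflect the released library's API.
from typing import Any, Callable, List, Tuple


class ModelCaller:
    def __init__(self,
                 model: Callable[[Any], Any],
                 preprocess: Callable[[Any], Any] = lambda x: x,
                 postprocess: Callable[[Any], Any] = lambda y: y):
        self.model = model
        self.preprocess = preprocess
        self.postprocess = postprocess
        self.history: List[Tuple[Any, Any]] = []  # simple call log

    def __call__(self, raw_input: Any) -> Any:
        x = self.preprocess(raw_input)       # adapt input for the model
        y = self.model(x)                     # delegate the actual call
        out = self.postprocess(y)             # adapt output for the caller
        self.history.append((raw_input, out))
        return out


# Example: application code calls the wrapper, never the model directly.
double = ModelCaller(model=lambda x: 2 * x, preprocess=float)
print(double("21"))          # 42.0
print(len(double.history))   # 1
```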
arXiv Detail & Related papers (2024-04-17T12:21:06Z)
- Reusable MLOps: Reusable Deployment, Reusable Infrastructure and Hot-Swappable Machine Learning models and services [0.0]
We introduce a new sustainable concept in the field of AI/ML operations called Reusable MLOps.
We reuse the existing deployment and infrastructure to serve new models by hot-swapping them without tearing down the infrastructure or the microservice.
arXiv Detail & Related papers (2024-02-19T23:40:46Z)
- Model Share AI: An Integrated Toolkit for Collaborative Machine Learning Model Development, Provenance Tracking, and Deployment in Python [0.0]
We introduce Model Share AI (AIMS), an easy-to-use MLOps platform designed to streamline collaborative model development, model provenance tracking, and model deployment.
AIMS features collaborative project spaces and a standardized model evaluation process that ranks model submissions based on their performance on unseen evaluation data.
AIMS allows users to deploy ML models built in Scikit-Learn, Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps.
arXiv Detail & Related papers (2023-09-27T15:24:39Z)
- Prompt2Model: Generating Deployable Models from Natural Language Instructions [74.19816829003729]
Large language models (LLMs) enable system builders to create competent NLP systems through prompting.
In other ways, however, LLMs are a step backward from traditional special-purpose NLP models.
We propose Prompt2Model, a general-purpose method that takes a natural language task description, like the prompts provided to LLMs, and uses it to train a special-purpose model that is conducive to deployment.
arXiv Detail & Related papers (2023-08-23T17:28:21Z)
- Predicting Resource Consumption of Kubernetes Container Systems using Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
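As a much-simplified illustration of the idea of extrapolating from small deployments (not the paper's formal resource-modelling method), the sketch below fits a linear model of CPU use versus request rate from a few measurements and predicts a higher-intensity scenario. All numbers are made up.

```python
# Simplified illustration: fit a linear resource model from measurements of a
# small deployment, then predict consumption at higher load. Numbers are
# invented; the paper itself uses formal models, not this ad-hoc fit.
import numpy as np

# Measured request rate (req/s) vs CPU usage (millicores) on a small deployment.
rates = np.array([10, 20, 40, 80], dtype=float)
cpu_millicores = np.array([120, 210, 400, 790], dtype=float)

# Least-squares fit of cpu ≈ a * rate + b.
a, b = np.polyfit(rates, cpu_millicores, deg=1)

# Extrapolate to a larger, higher-intensity deployment scenario.
target_rate = 500.0
predicted_cpu = a * target_rate + b
print(f"predicted CPU at {target_rate} req/s: {predicted_cpu:.0f} millicores")
```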
arXiv Detail & Related papers (2023-05-12T17:59:01Z)
- A new hope for network model generalization [66.5377859849467]
Generalizing machine learning models for network traffic dynamics tends to be considered a lost cause.
An ML architecture called Transformer has enabled previously unimaginable generalization in other domains.
We propose a Network Traffic Transformer (NTT) to learn network dynamics from packet traces.
arXiv Detail & Related papers (2022-07-12T21:16:38Z)
- Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
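Model reprogramming typically keeps the pre-trained model frozen and learns only a small input transformation plus a source-to-target label mapping. The PyTorch sketch below shows that generic structure; the additive input shift and linear label mapping are common illustrative choices, not this paper's exact formulation.

```python
# Generic sketch of model reprogramming: freeze a pre-trained source-domain
# model, train only an input perturbation and an output-label mapping.
import torch
import torch.nn as nn


class Reprogrammer(nn.Module):
    def __init__(self, pretrained: nn.Module, in_dim: int,
                 n_source_classes: int, n_target_classes: int):
        super().__init__()
        self.pretrained = pretrained
        for p in self.pretrained.parameters():
            p.requires_grad = False                       # source model stays frozen
        self.delta = nn.Parameter(torch.zeros(in_dim))    # trainable input shift
        self.label_map = nn.Linear(n_source_classes, n_target_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        source_logits = self.pretrained(x + self.delta)   # reprogrammed input
        return self.label_map(source_logits)              # map to target labels


# Toy usage with a stand-in "pre-trained" source model.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
model = Reprogrammer(backbone, in_dim=16, n_source_classes=10, n_target_classes=3)
logits = model(torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 3])
```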
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Current approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Merlion: A Machine Learning Library for Time Series [73.46386700728577]
Merlion is an open-source machine learning library for time series.
It features a unified interface for models and datasets for anomaly detection and forecasting.
Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production.
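For a sense of the unified interface, here is a short forecasting sketch following the usage shown in Merlion's documentation; the toy data is invented and exact module paths or defaults may differ by library version.

```python
# Hedged sketch of Merlion's default forecasting interface, based on its
# documented usage; exact module paths may differ by version.
import numpy as np
import pandas as pd
from merlion.utils import TimeSeries
from merlion.models.defaults import DefaultForecaster, DefaultForecasterConfig

# Toy hourly series split into train/test windows.
idx = pd.date_range("2021-01-01", periods=200, freq="H")
df = pd.DataFrame({"value": np.sin(np.arange(200) / 12.0)}, index=idx)
train = TimeSeries.from_pd(df.iloc[:150])
test = TimeSeries.from_pd(df.iloc[150:])

model = DefaultForecaster(DefaultForecasterConfig())
model.train(train_data=train)
forecast, stderr = model.forecast(time_stamps=test.time_stamps)
print(forecast)
```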
arXiv Detail & Related papers (2021-09-20T02:03:43Z)
- FlexServe: Deployment of PyTorch Models as Flexible REST Endpoints [6.730473762151365]
The integration of artificial intelligence capabilities into modern software systems is increasingly being simplified through the use of cloud-based services and representational state transfer architecture.
Insufficient information regarding underlying model provenance and the lack of control over model evolution serve as an impediment to the more widespread adoption of these services in many operational environments which have strict security requirements.
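FlexServe's own interface is not described in this summary, so the following is a generic illustration of the underlying pattern of exposing a PyTorch model through a REST endpoint, using Flask; the route name, payload format, and stand-in model are assumptions.

```python
# Generic illustration (not FlexServe's API): a PyTorch model behind a minimal
# Flask REST endpoint. Route and payload format are assumptions.
import torch
import torch.nn as nn
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in model; in practice a trained network would be loaded from disk.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()


@app.route("/predict", methods=["POST"])
def predict():
    # Expected payload: {"features": [[0.1, 0.2, 0.3, 0.4], ...]}
    features = request.get_json()["features"]
    with torch.no_grad():
        logits = model(torch.tensor(features, dtype=torch.float32))
    return jsonify(predictions=logits.argmax(dim=1).tolist())


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```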
arXiv Detail & Related papers (2020-02-29T18:51:09Z)
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then in the deployment phase, the relatedness of the current task and pre-trained models will be measured based on the value of the RKME specification.
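The sketch below is a much-simplified, hedged illustration of the kernel-mean-embedding idea behind such specifications: it compares a new task's sample against each stored specification via the RBF-kernel MMD and picks the closest one. The reduction and weighting steps of the actual RKME method are omitted, and all data is synthetic.

```python
# Simplified illustration of comparing a task against stored specifications
# via kernel mean embeddings (RBF-kernel MMD). The actual RKME reduction and
# weighting procedure is omitted; data is synthetic.
import numpy as np


def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    return np.exp(-gamma * d2)


def mmd2(X, Y, gamma=0.5):
    # Squared MMD between the empirical kernel mean embeddings of X and Y.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())


rng = np.random.default_rng(0)
task_data = rng.normal(0.0, 1.0, size=(100, 2))            # current task sample
specs = {"model_A": rng.normal(0.0, 1.0, size=(20, 2)),    # stored specifications
         "model_B": rng.normal(3.0, 1.0, size=(20, 2))}

# The pre-trained model whose specification is closest (smallest MMD) is taken
# as the most related to the current task.
best = min(specs, key=lambda name: mmd2(task_data, specs[name]))
print("most related model:", best)
```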
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.