Scalable Deployment of AI Time-series Models for IoT
- URL: http://arxiv.org/abs/2003.12141v1
- Date: Tue, 24 Mar 2020 14:27:25 GMT
- Title: Scalable Deployment of AI Time-series Models for IoT
- Authors: Bradley Eck, Francesco Fusco, Robert Gormally, Mark Purcell, Seshu Tirupathi
- Abstract summary: IBM Research Castor is a cloud-native system for managing and deploying time-series models in IoT applications.
Model templates can be deployed against specific instances of semantic concepts.
Results from deployments in real-world smart-grid live forecasting applications are reported.
- Score: 0.7169734491710924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: IBM Research Castor, a cloud-native system for managing and deploying large
numbers of AI time-series models in IoT applications, is described. Modelling
code templates in Python and R, following a typical machine-learning workflow,
are supported. A knowledge-based approach to managing model and time-series
data allows the use of general semantic concepts for expressing feature
engineering tasks. Model templates can be programmatically deployed against
specific instances of semantic concepts, thus supporting model reuse and
automated replication as the IoT application grows. Deployed models are
automatically executed in parallel leveraging a serverless cloud computing
framework. The complete history of trained model versions and rolling-horizon
predictions is persisted, thus enabling full model lineage and traceability.
Results from deployments in real-world smart-grid live forecasting applications
are reported. Scalability of executing up to tens of thousands of AI modelling
tasks is also evaluated.
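As an illustration of the programmatic deployment pattern described in the abstract, the sketch below deploys one model template against every instance of a semantic concept. All names here (the `CastorClient` class, its methods, and the example concepts) are hypothetical stand-ins for illustration, not the actual IBM Research Castor API.

```python
# Hypothetical sketch: deploying one model template against many instances of
# a semantic concept. All names are illustrative stand-ins, not the actual
# IBM Research Castor API.
from typing import List


class CastorClient:
    """Stand-in client for a Castor-like model-management service."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def list_entities(self, concept: str) -> List[str]:
        # A real system would query the knowledge base for every instance of
        # the semantic concept; here we return a fixed placeholder list.
        return ["substation_A", "substation_B", "substation_C"]

    def deploy_model(self, template: str, target: str, signal: str) -> str:
        # A real system would register the template (Python or R code) against
        # the target entity and schedule its serverless execution.
        return f"{template}::{target}::{signal}"


client = CastorClient("https://castor.example.invalid/api")

# Reuse the same forecasting template for every substation: model reuse and
# automated replication as the IoT application grows.
for substation in client.list_entities(concept="Substation"):
    deployment_id = client.deploy_model(
        template="load_forecast_template.py",
        target=substation,
        signal="active_power",
    )
    print("deployed", deployment_id)
```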
Related papers
- Model Callers for Transforming Predictive and Generative AI Applications [2.7195102129095003]
We introduce a novel software abstraction termed a "model caller".
Model callers act as an intermediary for AI and ML model calling.
We have released a prototype Python library for model callers, accessible for installation via pip or for download from GitHub.
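The released library's actual interface is not shown in this summary, so the following is only a hedged sketch of the general intermediary idea: a caller object sits between application code and a model, handling pre/post-processing and keeping a simple call log. The class and method names are hypothetical.

```python
# Hedged sketch of a "model caller" intermediary; names are hypothetical and
# do not reflect the released library's API.
from typing import Any, Callable, List, Tuple


class ModelCaller:
    def __init__(self,
                 model: Callable[[Any], Any],
                 preprocess: Callable[[Any], Any] = lambda x: x,
                 postprocess: Callable[[Any], Any] = lambda y: y):
        self.model = model
        self.preprocess = preprocess
        self.postprocess = postprocess
        self.history: List[Tuple[Any, Any]] = []  # simple call log

    def __call__(self, raw_input: Any) -> Any:
        x = self.preprocess(raw_input)       # adapt input for the model
        y = self.model(x)                     # delegate the actual call
        out = self.postprocess(y)             # adapt output for the caller
        self.history.append((raw_input, out))
        return out


# Example: application code calls the wrapper, never the model directly.
double = ModelCaller(model=lambda x: 2 * x, preprocess=float)
print(double("21"))          # 42.0
print(len(double.history))   # 1
```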
arXiv Detail & Related papers (2024-04-17T12:21:06Z)
- Reusable MLOps: Reusable Deployment, Reusable Infrastructure and Hot-Swappable Machine Learning models and services [0.0]
We introduce a new sustainable concept in the field of AI/ML operations called Reusable MLOps.
We reuse the existing deployment and infrastructure to serve new models by hot-swapping them without tearing down the infrastructure or the microservice.
arXiv Detail & Related papers (2024-02-19T23:40:46Z)
- Model Share AI: An Integrated Toolkit for Collaborative Machine Learning Model Development, Provenance Tracking, and Deployment in Python [0.0]
We introduce Model Share AI (AIMS), an easy-to-use MLOps platform designed to streamline collaborative model development, model provenance tracking, and model deployment.
AIMS features collaborative project spaces and a standardized model evaluation process that ranks model submissions based on their performance on unseen evaluation data.
AIMS allows users to deploy ML models built in Scikit-Learn, Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps.
arXiv Detail & Related papers (2023-09-27T15:24:39Z)
- Prompt2Model: Generating Deployable Models from Natural Language Instructions [74.19816829003729]
Large language models (LLMs) enable system builders to create competent NLP systems through prompting.
In other ways, however, LLMs are a step backward from traditional special-purpose NLP models.
We propose Prompt2Model, a general-purpose method that takes a natural language task description, like the prompts provided to LLMs, and uses it to train a special-purpose model that is conducive to deployment.
arXiv Detail & Related papers (2023-08-23T17:28:21Z)
- Predicting Resource Consumption of Kubernetes Container Systems using Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
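As a much-simplified illustration of the idea of extrapolating from small deployments (not the paper's formal resource-modelling method), the sketch below fits a linear model of CPU use versus request rate from a few measurements and predicts a higher-intensity scenario. All numbers are made up.

```python
# Simplified illustration: fit a linear resource model from measurements of a
# small deployment, then predict consumption at higher load. Numbers are
# invented; the paper itself uses formal models, not this ad-hoc fit.
import numpy as np

# Measured request rate (req/s) vs CPU usage (millicores) on a small deployment.
rates = np.array([10, 20, 40, 80], dtype=float)
cpu_millicores = np.array([120, 210, 400, 790], dtype=float)

# Least-squares fit of cpu ≈ a * rate + b.
a, b = np.polyfit(rates, cpu_millicores, deg=1)

# Extrapolate to a larger, higher-intensity deployment scenario.
target_rate = 500.0
predicted_cpu = a * target_rate + b
print(f"predicted CPU at {target_rate} req/s: {predicted_cpu:.0f} millicores")
```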
arXiv Detail & Related papers (2023-05-12T17:59:01Z)
- A new hope for network model generalization [66.5377859849467]
Generalizing machine learning models for network traffic dynamics tends to be considered a lost cause.
An ML architecture called Transformer has enabled previously unimaginable generalization in other domains.
We propose a Network Traffic Transformer (NTT) to learn network dynamics from packet traces.
arXiv Detail & Related papers (2022-07-12T21:16:38Z)
- Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
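Model reprogramming typically keeps the pre-trained model frozen and learns only a small input transformation plus a source-to-target label mapping. The PyTorch sketch below shows that generic structure; the additive input shift and linear label mapping are common illustrative choices, not this paper's exact formulation.

```python
# Generic sketch of model reprogramming: freeze a pre-trained source-domain
# model, train only an input perturbation and an output-label mapping.
import torch
import torch.nn as nn


class Reprogrammer(nn.Module):
    def __init__(self, pretrained: nn.Module, in_dim: int,
                 n_source_classes: int, n_target_classes: int):
        super().__init__()
        self.pretrained = pretrained
        for p in self.pretrained.parameters():
            p.requires_grad = False                       # source model stays frozen
        self.delta = nn.Parameter(torch.zeros(in_dim))    # trainable input shift
        self.label_map = nn.Linear(n_source_classes, n_target_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        source_logits = self.pretrained(x + self.delta)   # reprogrammed input
        return self.label_map(source_logits)              # map to target labels


# Toy usage with a stand-in "pre-trained" source model.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
model = Reprogrammer(backbone, in_dim=16, n_source_classes=10, n_target_classes=3)
logits = model(torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 3])
```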
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Current approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Merlion: A Machine Learning Library for Time Series [73.46386700728577]
Merlion is an open-source machine learning library for time series.
It features a unified interface for models and datasets for anomaly detection and forecasting.
Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production.
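For a sense of the unified interface, here is a short forecasting sketch following the usage shown in Merlion's documentation; the toy data is invented and exact module paths or defaults may differ by library version.

```python
# Hedged sketch of Merlion's default forecasting interface, based on its
# documented usage; exact module paths may differ by version.
import numpy as np
import pandas as pd
from merlion.utils import TimeSeries
from merlion.models.defaults import DefaultForecaster, DefaultForecasterConfig

# Toy hourly series split into train/test windows.
idx = pd.date_range("2021-01-01", periods=200, freq="H")
df = pd.DataFrame({"value": np.sin(np.arange(200) / 12.0)}, index=idx)
train = TimeSeries.from_pd(df.iloc[:150])
test = TimeSeries.from_pd(df.iloc[150:])

model = DefaultForecaster(DefaultForecasterConfig())
model.train(train_data=train)
forecast, stderr = model.forecast(time_stamps=test.time_stamps)
print(forecast)
```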
arXiv Detail & Related papers (2021-09-20T02:03:43Z)
- FlexServe: Deployment of PyTorch Models as Flexible REST Endpoints [6.730473762151365]
The integration of artificial intelligence capabilities into modern software systems is increasingly being simplified through the use of cloud-based services and representational state transfer architecture.
Insufficient information regarding underlying model provenance and the lack of control over model evolution serve as an impediment to the more widespread adoption of these services in many operational environments which have strict security requirements.
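FlexServe's own interface is not described in this summary, so the following is a generic illustration of the underlying pattern of exposing a PyTorch model through a REST endpoint, using Flask; the route name, payload format, and stand-in model are assumptions.

```python
# Generic illustration (not FlexServe's API): a PyTorch model behind a minimal
# Flask REST endpoint. Route and payload format are assumptions.
import torch
import torch.nn as nn
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in model; in practice a trained network would be loaded from disk.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()


@app.route("/predict", methods=["POST"])
def predict():
    # Expected payload: {"features": [[0.1, 0.2, 0.3, 0.4], ...]}
    features = request.get_json()["features"]
    with torch.no_grad():
        logits = model(torch.tensor(features, dtype=torch.float32))
    return jsonify(predictions=logits.argmax(dim=1).tolist())


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```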
arXiv Detail & Related papers (2020-02-29T18:51:09Z)
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then in the deployment phase, the relatedness of the current task and pre-trained models will be measured based on the value of the RKME specification.
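The sketch below is a much-simplified, hedged illustration of the kernel-mean-embedding idea behind such specifications: it compares a new task's sample against each stored specification via the RBF-kernel MMD and picks the closest one. The reduction and weighting steps of the actual RKME method are omitted, and all data is synthetic.

```python
# Simplified illustration of comparing a task against stored specifications
# via kernel mean embeddings (RBF-kernel MMD). The actual RKME reduction and
# weighting procedure is omitted; data is synthetic.
import numpy as np


def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    return np.exp(-gamma * d2)


def mmd2(X, Y, gamma=0.5):
    # Squared MMD between the empirical kernel mean embeddings of X and Y.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())


rng = np.random.default_rng(0)
task_data = rng.normal(0.0, 1.0, size=(100, 2))            # current task sample
specs = {"model_A": rng.normal(0.0, 1.0, size=(20, 2)),    # stored specifications
         "model_B": rng.normal(3.0, 1.0, size=(20, 2))}

# The pre-trained model whose specification is closest (smallest MMD) is taken
# as the most related to the current task.
best = min(specs, key=lambda name: mmd2(task_data, specs[name]))
print("most related model:", best)
```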
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.