SensiX++: Bringing MLOPs and Multi-tenant Model Serving to Sensory Edge
Devices
- URL: http://arxiv.org/abs/2109.03947v1
- Date: Wed, 8 Sep 2021 22:06:16 GMT
- Title: SensiX++: Bringing MLOPs and Multi-tenant Model Serving to Sensory Edge
Devices
- Authors: Chulhong Min, Akhil Mathur, Utku Gunay Acer, Alessandro Montanari,
Fahim Kawsar
- Abstract summary: We present a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT sensors.
SensiX++ operates on two fundamental principles - highly modular componentisation to externalise data operations with clear abstractions and document-centric manifestation for system-wide orchestration.
We report on the overall throughput and quantified benefits of various automation components of SensiX++ and demonstrate its efficacy to significantly reduce operational complexity and lower the effort to deploy, upgrade, reconfigure and serve embedded models on edge devices.
- Score: 69.1412199244903
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present SensiX++ - a multi-tenant runtime for adaptive model execution
with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT
sensors. SensiX++ operates on two fundamental principles - highly modular
componentisation to externalise data operations with clear abstractions and
document-centric manifestation for system-wide orchestration. First, a data
coordinator manages the lifecycle of sensors and serves models with correct
data through automated transformations. Next, a resource-aware model server
executes multiple models in isolation through model abstraction, pipeline
automation and feature sharing. An adaptive scheduler then orchestrates the
best-effort executions of multiple models across heterogeneous accelerators,
balancing latency and throughput. Finally, microservices with REST APIs serve
synthesised model predictions, system statistics, and continuous deployment.
Collectively, these components enable SensiX++ to serve multiple models
efficiently with fine-grained control on edge devices while minimising data
operation redundancy, managing data and device heterogeneity, reducing resource
contention and removing manual MLOps. We benchmark SensiX++ with ten different
vision and acoustics models across various multi-tenant configurations on
different edge accelerators (Jetson AGX and Coral TPU) designed for sensory
devices. We report on the overall throughput and quantified benefits of various
automation components of SensiX++ and demonstrate its efficacy to significantly
reduce operational complexity and lower the effort to deploy, upgrade,
reconfigure and serve embedded models on edge devices.
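The abstract describes a data coordinator that feeds several co-located models from one sensor while avoiding redundant data operations. The following is a minimal sketch of that idea, not the SensiX++ implementation: the manifest layout, the `DataCoordinator` class, the `resize_nearest` helper, and the model names are all hypothetical and chosen only for illustration. It assumes each model declares its required input size in a manifest-like document, and that the coordinator runs each distinct transformation once per frame and shares the result.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]  # toy grayscale image: rows of pixel values

# Hypothetical manifest: each tenant model declares the sensor and input size it needs.
MANIFEST = [
    {"model": "object_detector", "sensor": "camera", "resize": (300, 300)},
    {"model": "face_detector", "sensor": "camera", "resize": (300, 300)},
    {"model": "scene_classifier", "sensor": "camera", "resize": (224, 224)},
]


def resize_nearest(frame: Frame, size: Tuple[int, int]) -> Frame:
    """Nearest-neighbour resize of a toy 2-D frame to (height, width)."""
    src_h, src_w = len(frame), len(frame[0])
    dst_h, dst_w = size
    return [
        [frame[r * src_h // dst_h][c * src_w // dst_w] for c in range(dst_w)]
        for r in range(dst_h)
    ]


class DataCoordinator:
    """Illustrative coordinator: one transformation per distinct requirement."""

    def __init__(self, manifest: List[dict]) -> None:
        self.manifest = manifest
        self.transform_calls = 0  # counts how many resizes were actually run

    def dispatch(self, sensor: str, frame: Frame) -> Dict[str, Frame]:
        """Return the prepared input for every model subscribed to `sensor`."""
        cache: Dict[Tuple[int, int], Frame] = {}
        inputs: Dict[str, Frame] = {}
        for entry in self.manifest:
            if entry["sensor"] != sensor:
                continue
            size = entry["resize"]
            if size not in cache:
                # First model asking for this size pays for the transformation.
                cache[size] = resize_nearest(frame, size)
                self.transform_calls += 1
            # Later models with the same requirement reuse the cached result.
            inputs[entry["model"]] = cache[size]
        return inputs


if __name__ == "__main__":
    coordinator = DataCoordinator(MANIFEST)
    frame = [[r * 4 + c for c in range(4)] for r in range(4)]
    prepared = coordinator.dispatch("camera", frame)
    print(sorted(prepared), coordinator.transform_calls)
```

In this sketch three models are served with two resize operations instead of three, because the two detectors share one prepared input. A real runtime would also have to handle sensor lifecycle, isolation between models, and accelerator scheduling, none of which are modelled here.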
Related papers
- OminiControl: Minimal and Universal Control for Diffusion Transformer [68.3243031301164]
OminiControl is a framework that integrates image conditions into pre-trained Diffusion Transformer (DiT) models.
At its core, OminiControl leverages a parameter reuse mechanism, enabling the DiT to encode image conditions using itself as a powerful backbone.
OminiControl addresses a wide range of image conditioning tasks in a unified manner, including subject-driven generation and spatially-aligned conditions.
arXiv Detail & Related papers (2024-11-22T17:55:15Z)
- Backpropagation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration [37.456185990843515]
We introduce a Universal On-Device Multi-modal Model Adaptation Framework.
The framework features the Fast Domain Adaptor (FDA) hosted in the cloud, providing tailored parameters for the Lightweight Multi-modal Model on devices.
Our contributions represent a pioneering solution for on-Device Multi-modal Model Adaptation (DMMA).
arXiv Detail & Related papers (2024-05-21T14:42:18Z)
- MultiTASC: A Multi-Tenancy-Aware Scheduler for Cascaded DNN Inference at the Consumer Edge [4.281723404774888]
This work presents MultiTASC, a multi-tenancy-aware scheduler that adaptively controls the decision functions of devices.
By explicitly considering device forwarding, our scheduler improves the latency service-level objective (SLO) satisfaction rate by 20-25 percentage points (pp) over state-of-the-art cascade methods.
arXiv Detail & Related papers (2023-06-22T12:04:49Z)
- Asynchronous Multi-Model Dynamic Federated Learning over Wireless Networks: Theory, Modeling, and Optimization [20.741776617129208]
Federated learning (FL) has emerged as a key technique for distributed machine learning (ML).
We first formulate rectangular scheduling steps and functions to capture the impact of system parameters on learning performance.
Our analysis sheds light on the joint impact of device training variables and asynchronous scheduling decisions.
arXiv Detail & Related papers (2023-05-22T21:39:38Z)
- A Generative Approach for Production-Aware Industrial Network Traffic Modeling [70.46446906513677]
We investigate the network traffic data generated from a laser cutting machine deployed in a Trumpf factory in Germany.
We analyze the traffic statistics, capture the dependencies between the internal states of the machine, and model the network traffic as a production state dependent process.
We compare the performance of various generative models including variational autoencoder (VAE), conditional variational autoencoder (CVAE), and generative adversarial network (GAN).
arXiv Detail & Related papers (2022-11-11T09:46:58Z)
- MetaNetwork: A Task-agnostic Network Parameters Generation Framework for Improving Device Model Generalization [65.02542875281233]
We propose a novel task-agnostic framework, named MetaNetwork, for generating adaptive device model parameters from cloud without on-device training.
The MetaGenerator is designed to learn a mapping function from samples to model parameters, and it can generate and deliver the adaptive parameters to the device based on samples uploaded from the device to the cloud.
The MetaStabilizer aims to reduce the oscillation of the MetaGenerator, accelerate the convergence and improve the model performance during both training and inference.
arXiv Detail & Related papers (2022-09-12T13:26:26Z)
- Edge Federated Learning Via Unit-Modulus Over-The-Air Computation (Extended Version) [64.76619508293966]
This paper proposes a unit-modulus over-the-air computation (UM-AirComp) framework to facilitate efficient edge federated learning.
It simultaneously uploads local model parameters and updates global model parameters via analog beamforming.
We demonstrate the implementation of UM-AirComp in a vehicle-to-everything autonomous driving simulation platform.
arXiv Detail & Related papers (2021-01-28T15:10:22Z)
- SensiX: A Platform for Collaborative Machine Learning on the Edge [69.1412199244903]
We present SensiX, a personal edge platform that stays between sensor data and sensing models.
We demonstrate its efficacy in developing motion and audio-based multi-device sensing systems.
Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to 30% increase across different environment dynamics at the expense of 3 mW power overhead.
arXiv Detail & Related papers (2020-12-04T23:06:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.