TinyMLOps: Operational Challenges for Widespread Edge AI Adoption
- URL: http://arxiv.org/abs/2203.10923v1
- Date: Mon, 21 Mar 2022 12:36:12 GMT
- Title: TinyMLOps: Operational Challenges for Widespread Edge AI Adoption
- Authors: Sam Leroux, Pieter Simoens, Meelis Lootus, Kartik Thakore, Akshay
Sharma
- Abstract summary: We list several challenges that a TinyML practitioner might need to consider when operationalizing an application on edge devices.
We focus on tasks such as monitoring and managing the application, which are common functionality for an MLOps platform, and show how they are complicated by the distributed nature of edge deployment.
- Score: 4.110617007156225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying machine learning applications on edge devices can bring clear
benefits such as improved reliability, latency and privacy but it also
introduces its own set of challenges. Most works focus on the limited
computational resources of edge platforms but this is not the only bottleneck
standing in the way of widespread adoption. In this paper we list several other
challenges that a TinyML practitioner might need to consider when
operationalizing an application on edge devices. We focus on tasks such as
monitoring and managing the application, which are common functionality for an MLOps
platform, and show how they are complicated by the distributed nature of edge
deployment. We also discuss issues that are unique to edge applications such as
protecting a model's intellectual property and verifying its integrity.
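The abstract highlights verifying a deployed model's integrity as an edge-specific concern. As a rough illustration only (not taken from the paper), one common building block is to check the artifact's cryptographic digest against a value published over a trusted channel before loading it on the device; the file name and digest below are hypothetical placeholders.

```python
# Minimal sketch (not from the paper): check a model artifact's SHA-256 digest
# against a digest published out-of-band before loading it on an edge device.
import hashlib
import hmac
from pathlib import Path


def sha256_digest(path: Path, chunk_size: int = 8192) -> str:
    """Hash the file in chunks so the check also fits memory-constrained devices."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_model(path: Path, expected_digest: str) -> bool:
    """Return True only if the on-device artifact matches the published digest."""
    return hmac.compare_digest(sha256_digest(path), expected_digest.lower())


if __name__ == "__main__":
    model_path = Path("model.tflite")    # hypothetical artifact name
    published_digest = "0" * 64          # placeholder for the released digest
    if model_path.exists() and verify_model(model_path, published_digest):
        print("Integrity check passed; safe to load.")
    else:
        print("Artifact missing or integrity check failed; refusing to load.")
```

A plain digest only detects corruption or tampering of the artifact when the reference digest itself arrives over a trusted channel; a detached digital signature gives stronger provenance, and the related concern of protecting the model's intellectual property typically calls for encryption or watermarking rather than hashing.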
Related papers
- CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device [2.4100803794273005]
We introduce an on-device Small Language Model (SLM) framework designed to handle multiple user inputs and reason over personal context locally.
CAMPHOR employs a hierarchical architecture where a high-order reasoning agent decomposes complex tasks and coordinates expert agents responsible for personal context retrieval, tool interaction, and dynamic plan generation.
By implementing parameter sharing across agents and leveraging prompt compression, we significantly reduce model size, latency, and memory usage.
arXiv Detail & Related papers (2024-10-12T07:28:10Z) - Efficient Multi-Object Tracking on Edge Devices via Reconstruction-Based Channel Pruning [0.2302001830524133]
We propose a neural network pruning method specifically tailored to compress complex networks, such as those used in modern MOT systems.
We achieve model size reductions of up to 70% while maintaining a high level of accuracy and further improving performance on the Jetson Orin Nano.
arXiv Detail & Related papers (2024-10-11T12:37:42Z) - MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains [54.117238759317004]
The Massive Multitask Agent Understanding (MMAU) benchmark features comprehensive offline tasks that eliminate the need for complex environment setups.
It evaluates models across five domains: Tool-use, Directed Acyclic Graph (DAG) QA, Data Science and Machine Learning coding, Contest-level programming, and Mathematics.
With a total of 20 meticulously designed tasks encompassing over 3K distinct prompts, MMAU provides a comprehensive framework for evaluating the strengths and limitations of LLM agents.
arXiv Detail & Related papers (2024-07-18T00:58:41Z) - Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference [19.60655813679882]
Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge.
Traditional deployment approaches offload the inference workloads to a remote cloud server.
We propose Galaxy, a collaborative edge AI system that breaks the resource walls across heterogeneous edge devices.
arXiv Detail & Related papers (2024-05-27T15:01:04Z) - A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm.
Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources.
We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z) - A review of TinyML [0.0]
The TinyML concept for embedded machine learning attempts to push machine learning from the usual high-end approaches down to low-end applications.
TinyML is a rapidly expanding interdisciplinary topic at the convergence of machine learning, software, and hardware.
This paper explores how TinyML can benefit a few specific industrial fields, its obstacles, and its future scope.
arXiv Detail & Related papers (2022-11-05T06:02:08Z) - Design Automation for Fast, Lightweight, and Effective Deep Learning
Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z) - Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z) - Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics at high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices (a minimal illustration of such on-device detection follows this list).
arXiv Detail & Related papers (2021-02-12T09:33:00Z) - TinyML for Ubiquitous Edge AI [0.0]
TinyML focuses on enabling deep learning algorithms on embedded (microcontroller-powered) devices operating in the extremely low power range (mW and below).
TinyML addresses the challenges in designing power-efficient, compact deep neural network models, supporting software framework, and embedded hardware.
In this report, we discuss the major challenges and technological enablers that direct this field's expansion.
arXiv Detail & Related papers (2021-02-02T02:04:54Z) - A Unified Object Motion and Affinity Model for Online Multi-Object
Tracking [127.5229859255719]
We propose a novel MOT framework that unifies the object motion and affinity models into a single network, named UMA.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
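For the AIOps entry above, which notes that anomaly detection can run directly on edge devices alongside high-frequency metric collection, the following is a minimal sketch of one such on-device detector (a rolling z-score over recent readings). The window size, threshold, and metric values are illustrative placeholders, not taken from that paper.

```python
# Minimal sketch (not from any paper above): a rolling z-score anomaly detector
# that could run next to metric collection on an edge device.
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    def __init__(self, window: int = 100, threshold: float = 3.0):
        self.values = deque(maxlen=window)   # fixed-size window of recent readings
        self.threshold = threshold

    def observe(self, x: float) -> bool:
        """Return True if x is anomalous relative to the recent window."""
        anomalous = False
        if len(self.values) >= 10:           # simple warm-up period
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = sqrt(var)
            if std > 0 and abs(x - mean) / std > self.threshold:
                anomalous = True
        self.values.append(x)
        return anomalous


if __name__ == "__main__":
    detector = RollingZScoreDetector()
    cpu_temps = [45.0 + 0.1 * i for i in range(50)] + [80.0]   # synthetic metric
    for t in cpu_temps:
        if detector.observe(t):
            print(f"Anomaly flagged at reading {t:.1f}")
```

A detector of this kind keeps only a fixed-size window in memory, which is what makes it plausible to run on resource-constrained edge nodes rather than in the cloud.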
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.