TinyMLOps: Operational Challenges for Widespread Edge AI Adoption
- URL: http://arxiv.org/abs/2203.10923v1
- Date: Mon, 21 Mar 2022 12:36:12 GMT
- Title: TinyMLOps: Operational Challenges for Widespread Edge AI Adoption
- Authors: Sam Leroux, Pieter Simoens, Meelis Lootus, Kartik Thakore, Akshay
Sharma
- Abstract summary: We list several challenges that a TinyML practitioner might need to consider when operationalizing an application on edge devices.
We focus on tasks such as monitoring and managing the application, which are common functionality for an MLOps platform, and show how they are complicated by the distributed nature of edge deployment.
- Score: 4.110617007156225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying machine learning applications on edge devices can bring clear
benefits such as improved reliability, latency and privacy but it also
introduces its own set of challenges. Most works focus on the limited
computational resources of edge platforms but this is not the only bottleneck
standing in the way of widespread adoption. In this paper we list several other
challenges that a TinyML practitioner might need to consider when
operationalizing an application on edge devices. We focus on tasks such as
monitoring and managing the application, which are common functionality for an MLOps
platform, and show how they are complicated by the distributed nature of edge
deployment. We also discuss issues that are unique to edge applications such as
protecting a model's intellectual property and verifying its integrity.
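The abstract highlights verifying a deployed model's integrity as an edge-specific concern. As a rough illustration only (not taken from the paper), one common building block is to check the artifact's cryptographic digest against a value published over a trusted channel before loading it on the device; the file name and digest below are hypothetical placeholders.

```python
# Minimal sketch (not from the paper): check a model artifact's SHA-256 digest
# against a digest published out-of-band before loading it on an edge device.
import hashlib
import hmac
from pathlib import Path


def sha256_digest(path: Path, chunk_size: int = 8192) -> str:
    """Hash the file in chunks so the check also fits memory-constrained devices."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_model(path: Path, expected_digest: str) -> bool:
    """Return True only if the on-device artifact matches the published digest."""
    return hmac.compare_digest(sha256_digest(path), expected_digest.lower())


if __name__ == "__main__":
    model_path = Path("model.tflite")    # hypothetical artifact name
    published_digest = "0" * 64          # placeholder for the released digest
    if model_path.exists() and verify_model(model_path, published_digest):
        print("Integrity check passed; safe to load.")
    else:
        print("Artifact missing or integrity check failed; refusing to load.")
```

A plain digest only detects corruption or tampering of the artifact when the reference digest itself arrives over a trusted channel; a detached digital signature gives stronger provenance, and the related concern of protecting the model's intellectual property typically calls for encryption or watermarking rather than hashing.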
Related papers
- CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device [2.4100803794273005]
We introduce an on-device Small Language Model (SLM) framework designed to handle multiple user inputs and reason over personal context locally.
CAMPHOR employs a hierarchical architecture where a high-order reasoning agent decomposes complex tasks and coordinates expert agents responsible for personal context retrieval, tool interaction, and dynamic plan generation.
By implementing parameter sharing across agents and leveraging prompt compression, we significantly reduce model size, latency, and memory usage.
arXiv Detail & Related papers (2024-10-12T07:28:10Z) - Efficient Multi-Object Tracking on Edge Devices via Reconstruction-Based Channel Pruning [0.2302001830524133]
We propose a neural network pruning method specifically tailored to compress complex networks, such as those used in modern MOT systems.
We achieve model size reductions of up to 70% while maintaining a high level of accuracy and further improving performance on the Jetson Orin Nano.
arXiv Detail & Related papers (2024-10-11T12:37:42Z) - MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains [54.117238759317004]
The Massive Multitask Agent Understanding (MMAU) benchmark features comprehensive offline tasks that eliminate the need for complex environment setups.
It evaluates models across five domains: Tool-use, Directed Acyclic Graph (DAG) QA, Data Science and Machine Learning coding, Contest-level programming, and Mathematics.
With a total of 20 meticulously designed tasks encompassing over 3K distinct prompts, MMAU provides a comprehensive framework for evaluating the strengths and limitations of LLM agents.
arXiv Detail & Related papers (2024-07-18T00:58:41Z) - Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference [19.60655813679882]
Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge.
Traditional deployment approaches offload the inference workloads to a remote cloud server.
We propose Galaxy, a collaborative edge AI system that breaks the resource walls across heterogeneous edge devices.
arXiv Detail & Related papers (2024-05-27T15:01:04Z) - A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm.
Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources.
We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z) - A review of TinyML [0.0]
The TinyML concept for embedded machine learning attempts to push machine learning from the usual high-end approaches down to low-end applications.
TinyML is a rapidly expanding interdisciplinary topic at the convergence of machine learning, software, and hardware.
This paper explores how TinyML can benefit a few specific industrial fields, its obstacles, and its future scope.
arXiv Detail & Related papers (2022-11-05T06:02:08Z) - Design Automation for Fast, Lightweight, and Effective Deep Learning
Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z) - Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z) - Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics at high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices (a minimal illustration of such on-device detection follows this list).
arXiv Detail & Related papers (2021-02-12T09:33:00Z) - TinyML for Ubiquitous Edge AI [0.0]
TinyML focuses on enabling deep learning algorithms on embedded (microcontroller-powered) devices operating in the extremely low power range (mW and below).
TinyML addresses the challenges in designing power-efficient, compact deep neural network models, supporting software framework, and embedded hardware.
In this report, we discuss the major challenges and technological enablers that direct this field's expansion.
arXiv Detail & Related papers (2021-02-02T02:04:54Z) - A Unified Object Motion and Affinity Model for Online Multi-Object
Tracking [127.5229859255719]
We propose a novel MOT framework that unifies the object motion and affinity models into a single network, named UMA.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
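For the AIOps entry above, which notes that anomaly detection can run directly on edge devices alongside high-frequency metric collection, the following is a minimal sketch of one such on-device detector (a rolling z-score over recent readings). The window size, threshold, and metric values are illustrative placeholders, not taken from that paper.

```python
# Minimal sketch (not from any paper above): a rolling z-score anomaly detector
# that could run next to metric collection on an edge device.
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    def __init__(self, window: int = 100, threshold: float = 3.0):
        self.values = deque(maxlen=window)   # fixed-size window of recent readings
        self.threshold = threshold

    def observe(self, x: float) -> bool:
        """Return True if x is anomalous relative to the recent window."""
        anomalous = False
        if len(self.values) >= 10:           # simple warm-up period
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = sqrt(var)
            if std > 0 and abs(x - mean) / std > self.threshold:
                anomalous = True
        self.values.append(x)
        return anomalous


if __name__ == "__main__":
    detector = RollingZScoreDetector()
    cpu_temps = [45.0 + 0.1 * i for i in range(50)] + [80.0]   # synthetic metric
    for t in cpu_temps:
        if detector.observe(t):
            print(f"Anomaly flagged at reading {t:.1f}")
```

A detector of this kind keeps only a fixed-size window in memory, which is what makes it plausible to run on resource-constrained edge nodes rather than in the cloud.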
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.