TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems
- URL: http://arxiv.org/abs/2010.08678v3
- Date: Sat, 13 Mar 2021 13:41:01 GMT
- Title: TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems
- Authors: Robert David, Jared Duke, Advait Jain, Vijay Janapa Reddi, Nat
Jeffries, Jian Li, Nick Kreeger, Ian Nappier, Meghna Natraj, Shlomi Regev,
Rocky Rhodes, Tiezhen Wang, Pete Warden
- Abstract summary: Deep learning inference on embedded devices is a burgeoning field with myriad applications because tiny embedded devices are omnipresent.
We introduce TensorFlow Lite Micro (TF Micro), an open-source ML inference framework for running deep-learning models on embedded systems.
- Score: 5.188829601887422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning inference on embedded devices is a burgeoning field with myriad
applications because tiny embedded devices are omnipresent. But we must
overcome major challenges before we can benefit from this opportunity. Embedded
processors are severely resource constrained. Their nearest mobile counterparts
exhibit at least a 100-1,000x difference in compute capability, memory
availability, and power consumption. As a result, the machine-learning (ML)
models and associated ML inference framework must not only execute efficiently
but also operate in a few kilobytes of memory. Also, the embedded devices'
ecosystem is heavily fragmented. To maximize efficiency, system vendors often
omit many features that are common in mainstream systems and that enable
cross-platform interoperability, such as dynamic memory allocation and virtual
memory. The hardware also comes in many flavors (e.g., instruction-set
architecture and FPU support, or lack thereof). We introduce TensorFlow Lite
Micro (TF Micro), an open-source ML inference framework for running
deep-learning models on embedded systems. TF Micro tackles the efficiency
requirements imposed by embedded-system resource constraints and the
fragmentation challenges that make cross-platform interoperability nearly
impossible. The framework adopts a unique interpreter-based approach that
provides flexibility while overcoming these challenges. This paper explains the
design decisions behind TF Micro and describes its implementation details.
Also, we present an evaluation to demonstrate its low resource requirements and
minimal run-time performance overhead.
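The interpreter-based design shows up directly in the public TF Micro C++ API: the application links the model in as a byte array, hands the interpreter a caller-owned, statically allocated tensor arena, and registers only the kernels the model needs, so nothing is heap-allocated at runtime. The sketch below is a minimal illustration; the model array, the 8 KB arena size, and the two-op float model are assumptions, and constructor details have varied across TF Micro releases (older versions also took an ErrorReporter).

```cpp
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Assumed: a flatbuffer produced by the TFLite converter and compiled into
// the binary as a C array (e.g., via xxd).
extern const unsigned char g_model_data[];

// All tensor memory comes from one static arena; TF Micro performs no
// dynamic allocation at runtime, which is why it runs on platforms that
// lack malloc and virtual memory. Size is an illustrative guess.
constexpr int kTensorArenaSize = 8 * 1024;
alignas(16) static uint8_t tensor_arena[kTensorArenaSize];

int RunInference(const float* features, int num_features) {
  const tflite::Model* model = tflite::GetModel(g_model_data);

  // Register only the kernels this model needs to keep the binary small.
  static tflite::MicroMutableOpResolver<2> resolver;
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interpreter(model, resolver, tensor_arena,
                                              kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  // Assumes a float input tensor of at least num_features elements.
  TfLiteTensor* input = interpreter.input(0);
  for (int i = 0; i < num_features; ++i) input->data.f[i] = features[i];

  if (interpreter.Invoke() != kTfLiteOk) return -1;

  // Assumes a [1, num_classes] float output; return the best class index.
  TfLiteTensor* output = interpreter.output(0);
  int best = 0;
  for (int i = 1; i < output->dims->data[1]; ++i) {
    if (output->data.f[i] > output->data.f[best]) best = i;
  }
  return best;
}
```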
Related papers
- MicroFlow: An Efficient Rust-Based Inference Engine for TinyML [1.8902208722501446]
MicroFlow is an open-source framework for the deployment of Neural Networks (NNs) on embedded systems using the Rust programming language.
It uses less Flash and RAM than other state-of-the-art solutions for deploying NN reference models.
It also achieves faster inference than existing engines on medium-sized NNs, and similar performance on bigger ones.
arXiv Detail & Related papers (2024-09-28T18:34:27Z)
- Designing and Implementing a Generator Framework for a SIMD Abstraction Library [53.84310825081338]
We present TSLGen, a novel end-to-end framework for generating a SIMD abstraction library.
We show that our framework is comparable to existing libraries and achieves the same performance.
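The summary does not include TSLGen's generated code; purely as a hedged illustration of what a SIMD abstraction library abstracts, the sketch below shows one portable entry point that hides platform intrinsics (here SSE2, with a scalar fallback) behind a uniform signature. A generator framework emits many such wrappers from a single specification.

```cpp
#include <cstddef>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Generic elementwise add: same interface on every target, with the
// vectorized path selected at compile time. Not TSLGen's actual output.
void add_f32(const float* a, const float* b, float* out, std::size_t n) {
  std::size_t i = 0;
#if defined(__SSE2__)
  // Process four floats per iteration using 128-bit SSE registers.
  for (; i + 4 <= n; i += 4) {
    _mm_storeu_ps(out + i,
                  _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
  }
#endif
  // Scalar tail, and the full fallback on targets without SSE2.
  for (; i < n; ++i) out[i] = a[i] + b[i];
}
```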
arXiv Detail & Related papers (2024-07-26T13:25:38Z)
- Distributed Inference and Fine-tuning of Large Language Models Over The Internet [91.00270820533272]
Large language models (LLMs) are useful in many NLP tasks and grow more capable with size.
These models require high-end hardware, making them inaccessible to most researchers.
We develop fault-tolerant inference algorithms and load-balancing protocols that automatically assign devices to maximize the total system throughput.
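The paper's actual protocol is not given in this summary; as a hedged illustration of throughput-driven device assignment, the greedy sketch below places each model shard on the server with the most spare capacity. The Server struct and AssignShards are invented names for illustration only.

```cpp
#include <algorithm>
#include <vector>

// Each participating device advertises how many shards per second it can
// serve; load tracks what has been assigned to it so far.
struct Server {
  double capacity;
  double load = 0.0;
};

// Greedy placement: every shard goes to the server with the largest spare
// capacity at that moment. A toy stand-in for a real balancing protocol.
std::vector<int> AssignShards(int num_shards, std::vector<Server>& servers) {
  std::vector<int> placement(num_shards);
  for (int shard = 0; shard < num_shards; ++shard) {
    auto best = std::max_element(
        servers.begin(), servers.end(),
        [](const Server& a, const Server& b) {
          return a.capacity - a.load < b.capacity - b.load;
        });
    best->load += 1.0;  // one more shard hosted on this server
    placement[shard] = static_cast<int>(best - servers.begin());
  }
  return placement;
}
```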
arXiv Detail & Related papers (2023-12-13T18:52:49Z)
- Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems.
We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
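Model FLOP utilization (MFU) is the standard ratio behind such comparisons: achieved arithmetic throughput as a fraction of the hardware's peak. A small helper with illustrative parameter names, not the paper's code:

```cpp
// MFU = (work per token * tokens processed per second) / peak throughput.
// A value near 1.0 means the hardware's arithmetic units are kept busy.
double ModelFlopUtilization(double flops_per_token, double tokens_per_second,
                            double peak_flops_per_second) {
  return (flops_per_token * tokens_per_second) / peak_flops_per_second;
}
```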
arXiv Detail & Related papers (2023-10-04T20:27:20Z)
- FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system that unlocks the potential of vast untapped consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, variable peer reliability, and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
- FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems [61.335229621081346]
Federated Learning (FL) has become a viable technique for realizing privacy-enhancing distributed deep learning on the network edge.
In this paper, we propose FLEdge, which complements existing FL benchmarks by enabling a systematic evaluation of client capabilities.
arXiv Detail & Related papers (2023-06-08T13:11:20Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Virtualization of Tiny Embedded Systems with a robust real-time capable and extensible Stack Virtual Machine REXAVM supporting Material-integrated Intelligent Systems and Tiny Machine Learning [0.0]
This paper shows and evaluates the suitability of the proposed VM architecture for operationally equivalent software and hardware (FPGA) implementations.
In a holistic architecture approach, the VM specifically addresses digital signal processing and tiny machine learning.
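To make the stack-VM idea concrete, here is a toy dispatch loop; the opcodes and Run function are invented for illustration and are not REXAVM's instruction set.

```cpp
#include <cstddef>
#include <cstdint>

// A minimal stack machine: bytecode is fetched sequentially and every
// instruction operates on an operand stack rather than named registers.
enum Op : uint8_t { PUSH, ADD, MUL, HALT };

int32_t Run(const uint8_t* code) {
  int32_t stack[64];
  std::size_t sp = 0;  // next free stack slot
  for (std::size_t pc = 0;;) {
    switch (code[pc++]) {
      case PUSH: stack[sp++] = static_cast<int8_t>(code[pc++]); break;
      case ADD:  --sp; stack[sp - 1] += stack[sp]; break;
      case MUL:  --sp; stack[sp - 1] *= stack[sp]; break;
      case HALT: return stack[sp - 1];
    }
  }
}

// Usage: (2 + 3) * 4 evaluates to 20.
// const uint8_t prog[] = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
```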
arXiv Detail & Related papers (2023-02-17T17:13:35Z)
- Experimenting with Emerging RISC-V Systems for Decentralised Machine Learning [12.18598759507803]
Decentralised Machine Learning (DML) enables collaborative machine learning without centralised input data.
We map DML schemes to an underlying parallel programming library.
We experiment with it by generating different working DML schemes on x86-64 and ARM platforms and an emerging RISC-V one.
As a byproduct, we introduce a RISC-V porting of the PyTorch framework, the first publicly available to our knowledge.
arXiv Detail & Related papers (2023-02-15T20:57:42Z)
- A review of TinyML [0.0]
The TinyML concept for embedded machine learning attempts to push machine learning beyond the usual high-end approaches to low-end embedded applications.
TinyML is a rapidly expanding interdisciplinary topic at the convergence of machine learning, software, and hardware.
This paper explores how TinyML can benefit a few specific industrial fields, its obstacles, and its future scope.
arXiv Detail & Related papers (2022-11-05T06:02:08Z)
- TinyML for Ubiquitous Edge AI [0.0]
TinyML focuses on enabling deep learning algorithms on embedded (microcontroller-powered) devices operating at an extremely low power range (mW and below).
TinyML addresses the challenges in designing power-efficient, compact deep neural network models, supporting software framework, and embedded hardware.
In this report, we discuss the major challenges and technological enablers that direct this field's expansion.
arXiv Detail & Related papers (2021-02-02T02:04:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.