Related papers: MLonMCU: TinyML Benchmarking with Fast Retargeting

URL: http://arxiv.org/abs/2306.08951v1
Date: Thu, 15 Jun 2023 08:44:35 GMT
Title: MLonMCU: TinyML Benchmarking with Fast Retargeting
Authors: Philipp van Kempen, Rafael Stahl, Daniel Mueller-Gritschneder, Ulf Schlichtmann
Abstract summary: It is non-trivial to choose the optimal combination of frameworks and targets for a given application. A tool called MLonMCU is proposed in this paper and demonstrated by benchmarking the state-of-the-art TinyML frameworks TFLite for Microcontrollers and TVM effortlessly.
Score: 1.4319942396517
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While there exist many ways to deploy machine learning models on microcontrollers, it is non-trivial to choose the optimal combination of frameworks and targets for a given application. Thus, automating the end-to-end benchmarking flow is of high relevance nowadays. A tool called MLonMCU is proposed in this paper and demonstrated by benchmarking the state-of-the-art TinyML frameworks TFLite for Microcontrollers and TVM effortlessly with a large number of configurations in a low amount of time.

Related papers

AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning [70.95645743670062]
AtomThink is a framework for constructing long chains of thought (CoT) in a step-by-step manner, guiding MLLMs to perform complex reasoning. AtomMATH is a large-scale multimodal dataset of long CoTs, and an atomic capability evaluation metric for mathematical tasks. AtomThink significantly improves the performance of baseline MLLMs, achieving approximately 50% relative accuracy gains on MathVista and 120% on MathVerse.
arXiv Detail & Related papers (2024-11-18T11:54:58Z)
Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation [56.75665429851673]
This paper introduces a novel instruction curation algorithm, derived from two unique perspectives, human and LLM preference alignment. Experiments demonstrate that we can maintain or even improve model performance by compressing synthetic multimodal instructions by up to 90%.
arXiv Detail & Related papers (2024-09-27T08:20:59Z)
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases [81.70591346986582]
We introduce MobileAIBench, a benchmarking framework for evaluating Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices. MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices.
arXiv Detail & Related papers (2024-06-12T22:58:12Z)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate the computational and memory demands of large language models by compressing and accelerating them. We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization. Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z)
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations. We study the impact of labeled data through in-context learning and finetuning. We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models [73.86954509967416]
A Multimodal Large Language Model (MLLM) relies on a powerful LLM to perform multimodal tasks. This paper presents the first comprehensive MLLM Evaluation benchmark MME. It measures both perception and cognition abilities on a total of 14 subtasks.
arXiv Detail & Related papers (2023-06-23T09:22:36Z)
MEMA Runtime Framework: Minimizing External Memory Accesses for TinyML on Microcontrollers [3.1823074562424756]
We present the MEMA framework for efficient inference runtimes that minimize external memory accesses for matrix multiplication on TinyML systems. We compare the performance of runtimes derived from MEMA to existing state-of-the-art libraries on ARM-based TinyML systems.
arXiv Detail & Related papers (2023-04-12T00:27:11Z)
MinUn: Accurate ML Inference on Microcontrollers [2.2638536653874195]
Running machine learning inference on tiny devices, known as TinyML, is an emerging research area. We describe MinUn, the first TinyML framework that holistically addresses these issues to generate efficient code for ARM microcontrollers.
arXiv Detail & Related papers (2022-10-29T10:16:12Z)
TinyML Platforms Benchmarking [0.0]
Recent advances in ultra-low power embedded devices for machine learning (ML) have permitted a new class of products. TinyML provides a unique solution by aggregating and analyzing data at the edge on low-power embedded devices. Many TinyML frameworks have been developed for different platforms to facilitate the deployment of ML models.
arXiv Detail & Related papers (2021-11-30T15:26:26Z)
Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area. Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration. This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z)
MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers [18.662026553041937]
Machine learning on resource-constrained microcontrollers (MCUs) promises to drastically expand the application space of the Internet of Things (IoT). TinyML presents severe technical challenges, as deep neural network inference demands a large compute and memory budget. Neural architecture search (NAS) promises to help design accurate ML models that meet the tight MCU memory, latency and energy constraints.
arXiv Detail & Related papers (2020-10-21T19:39:39Z)
Benchmarking TinyML Systems: Challenges and Direction [10.193715318589812]
We present the current landscape of TinyML and discuss the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads. Our viewpoints reflect the collective thoughts of the TinyMLPerf working group that is comprised of over 30 organizations.
arXiv Detail & Related papers (2020-03-10T15:58:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.