MLPerf Tiny Benchmark
- URL: http://arxiv.org/abs/2106.07597v1
- Date: Mon, 14 Jun 2021 17:05:17 GMT
- Title: MLPerf Tiny Benchmark
- Authors: Colby Banbury, Vijay Janapa Reddi, Peter Torelli, Jeremy Holleman, Nat
Jeffries, Csaba Kiraly, Pietro Montino, David Kanter, Sebastian Ahmed, Danilo
Pau, Urmish Thakker, Antonio Torrini, Peter Warden, Jay Cordaro, Giuseppe Di
Guglielmo, Javier Duarte, Stephen Gibellini, Videet Parekh, Honson Tran, Nhan
Tran, Niu Wenxu, Xu Xuesong
- Abstract summary: We present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems.
MLPerf Tiny measures the accuracy, latency, and energy of machine learning inference to properly evaluate the tradeoffs between systems.
- Score: 1.1178096184080788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advancements in ultra-low-power tiny machine learning (TinyML) systems
promise to unlock an entirely new class of smart applications. However,
continued progress is limited by the lack of a widely accepted and easily
reproducible benchmark for these systems. To meet this need, we present MLPerf
Tiny, the first industry-standard benchmark suite for ultra-low-power tiny
machine learning systems. The benchmark suite is the collaborative effort of
more than 50 organizations from industry and academia and reflects the needs of
the community. MLPerf Tiny measures the accuracy, latency, and energy of
machine learning inference to properly evaluate the tradeoffs between systems.
Additionally, MLPerf Tiny implements a modular design that enables benchmark
submitters to show the benefits of their product, regardless of where it falls
on the ML deployment stack, in a fair and reproducible manner. The suite
features four benchmarks: keyword spotting, visual wake words, image
classification, and anomaly detection.
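For illustration only, the sketch below mimics the kind of measurement such a suite performs: running inference over a labeled dataset and reporting top-1 accuracy and median latency. It is not the official MLPerf Tiny harness; the `invoke_model` stub and the synthetic data are hypothetical placeholders for a real TinyML runtime and benchmark dataset, and energy measurement (which requires external instrumentation) is omitted.

```python
# Illustrative sketch only: NOT the official MLPerf Tiny benchmark harness.
# It mimics the kind of metrics the suite reports (accuracy and latency).
import time
import numpy as np

rng = np.random.default_rng(0)

def invoke_model(sample: np.ndarray) -> np.ndarray:
    """Placeholder for a real TinyML runtime call (e.g. an interpreter invoke)."""
    return rng.normal(size=10)  # hypothetical 10-class logits

def run_benchmark(samples: np.ndarray, labels: np.ndarray) -> dict:
    latencies, correct = [], 0
    for sample, label in zip(samples, labels):
        start = time.perf_counter()
        logits = invoke_model(sample)
        latencies.append(time.perf_counter() - start)
        correct += int(np.argmax(logits) == label)
    return {
        "top1_accuracy": correct / len(labels),
        "median_latency_ms": 1e3 * float(np.median(latencies)),
    }

if __name__ == "__main__":
    samples = rng.normal(size=(100, 32, 32, 3)).astype(np.float32)  # fake image inputs
    labels = rng.integers(0, 10, size=100)                          # fake labels
    print(run_benchmark(samples, labels))
```

Swapping `invoke_model` for a real interpreter call on the device under test would turn the loop into an actual measurement; in practice, energy numbers come from external power instrumentation rather than software timers.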
Related papers
- DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution [114.61347672265076]
Development of MLLMs for real-world robots is challenging due to the typically limited computation and memory capacities available on robotic platforms.
We propose a Dynamic Early-Exit Framework for Robotic Vision-Language-Action Model (DeeR) that automatically adjusts the size of the activated MLLM.
DeeR cuts the LLM's computational cost by 5.2-6.5x and its GPU memory usage by 2-6x without compromising performance.
arXiv Detail & Related papers (2024-11-04T18:26:08Z)
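As a rough illustration of the early-exit idea behind DeeR (not the paper's implementation), the sketch below stops running further layers once an intermediate prediction head is confident enough; the layer stack, heads, and threshold are all hypothetical stand-ins.

```python
# Generic early-exit sketch (not the DeeR implementation): stop running
# further layers once an intermediate head is confident enough, saving
# compute on easy inputs.
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical stand-ins for transformer blocks and per-block prediction heads.
NUM_LAYERS, HIDDEN, NUM_ACTIONS = 8, 16, 4
layers = [rng.normal(scale=0.3, size=(HIDDEN, HIDDEN)) for _ in range(NUM_LAYERS)]
heads = [rng.normal(scale=0.3, size=(HIDDEN, NUM_ACTIONS)) for _ in range(NUM_LAYERS)]

def early_exit_forward(x: np.ndarray, threshold: float = 0.9):
    """Run layers until an intermediate head's max probability exceeds threshold."""
    h = x
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        h = np.tanh(h @ layer)
        probs = softmax(h @ head)
        if probs.max() >= threshold:        # confident enough: exit early
            return probs, depth
    return probs, NUM_LAYERS                # fell through: used the full model

probs, depth_used = early_exit_forward(rng.normal(size=HIDDEN))
print(f"exited after {depth_used}/{NUM_LAYERS} layers")
```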
- On-device Online Learning and Semantic Management of TinyML Systems [8.183732025472766]
This study aims to bridge the gap between prototyping single TinyML models and developing reliable TinyML systems in production.
We propose online learning to enable training on constrained devices, adapting local models towards the latest field conditions.
We present semantic management for the joint management of models and devices at scale.
arXiv Detail & Related papers (2024-05-13T10:03:34Z)
- SEED-Bench-2: Benchmarking Multimodal Large Language Models [67.28089415198338]
Multimodal large language models (MLLMs) have recently demonstrated exceptional capabilities in generating not only texts but also images given interleaved multimodal inputs.
SEED-Bench-2 comprises 24K multiple-choice questions with accurate human annotations, spanning 27 dimensions.
We evaluate the performance of 23 prominent open-source MLLMs and summarize valuable observations.
arXiv Detail & Related papers (2023-11-28T05:53:55Z)
- ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z)
- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents [56.104476412839944]
Large Language Models (LLMs) have demonstrated remarkable zero-shot generalization across various language-related tasks.
This paper investigates generative LLMs for relevance ranking in Information Retrieval (IR).
To address concerns about data contamination of LLMs, we collect a new test set called NovelEval.
To improve efficiency in real-world applications, we delve into the potential for distilling the ranking capabilities of ChatGPT into small specialized models.
arXiv Detail & Related papers (2023-04-19T10:16:03Z)
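A minimal sketch of the listwise re-ranking idea investigated above (not the paper's exact prompt or pipeline): list candidate passages, ask a generative LLM for an ordering, and parse the returned permutation. The `call_llm` stub and the prompt wording are hypothetical.

```python
# Hedged sketch of listwise re-ranking with a generative LLM; call_llm is a
# placeholder so the example runs offline.
from typing import List

def build_ranking_prompt(query: str, passages: List[str]) -> str:
    lines = [f"Rank the following passages by relevance to the query: {query}"]
    for i, passage in enumerate(passages, start=1):
        lines.append(f"[{i}] {passage}")
    lines.append("Answer with the passage numbers in descending relevance, e.g. 2 > 1 > 3.")
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return "2 > 3 > 1"  # canned response for the offline sketch

def rerank(query: str, passages: List[str]) -> List[str]:
    reply = call_llm(build_ranking_prompt(query, passages))
    order = [int(tok) - 1 for tok in reply.replace(">", " ").split() if tok.isdigit()]
    return [passages[i] for i in order if 0 <= i < len(passages)]

docs = ["pasta recipe", "tiny ML benchmark results", "benchmarking inference on MCUs"]
print(rerank("TinyML benchmarking", docs))
```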
- TinyReptile: TinyML with Federated Meta-Learning [9.618821589196624]
We propose TinyReptile, a simple but efficient algorithm inspired by meta-learning and online learning.
We demonstrate TinyReptile on a Raspberry Pi 4 and a Cortex-M4 MCU with only 256 KB of RAM.
arXiv Detail & Related papers (2023-04-11T13:11:10Z)
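TinyReptile builds on the Reptile meta-learning rule, so a toy version of that rule may help: run a few SGD steps on one sampled task, then move the meta-parameters a small step toward the task-adapted weights. The linear model and sine-regression tasks below are hypothetical stand-ins, not the paper's federated, on-device setup.

```python
# Sketch of the standard Reptile meta-update that TinyReptile builds on
# (not the paper's federated, device-side implementation).
import numpy as np

rng = np.random.default_rng(2)

def sample_sine_task():
    """Toy task family: regress y = a * sin(x + b) with random a, b."""
    a, b = rng.uniform(0.5, 2.0), rng.uniform(0.0, np.pi)
    x = rng.uniform(-3, 3, size=(20, 1))
    return x, a * np.sin(x + b)

def sgd_on_task(theta, x, y, steps=5, lr=0.01):
    """Inner loop: a few gradient steps on a linear model (illustrative only)."""
    w = theta.copy()
    for _ in range(steps):
        grad = 2 * x.T @ (x @ w - y) / len(x)
        w -= lr * grad
    return w

theta = np.zeros((1, 1))                 # meta-parameters
meta_lr = 0.1
for _ in range(100):                     # outer loop over sampled tasks
    x, y = sample_sine_task()
    phi = sgd_on_task(theta, x, y)       # task-adapted parameters
    theta += meta_lr * (phi - theta)     # Reptile update: interpolate toward phi
print("meta-parameter after training:", theta.ravel())
```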
- Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
- TinyML Platforms Benchmarking [0.0]
Recent advances in ultra-low power embedded devices for machine learning (ML) have permitted a new class of products.
TinyML provides a unique solution by aggregating and analyzing data at the edge on low-power embedded devices.
Many TinyML frameworks have been developed for different platforms to facilitate the deployment of ML models.
arXiv Detail & Related papers (2021-11-30T15:26:26Z)
- MLHarness: A Scalable Benchmarking System for MLCommons [16.490366217665205]
We propose a scalable benchmarking harness system for MLCommons Inference.
It codifies the standard benchmark process as defined by MLCommons Inference.
It provides an easy and declarative approach for model developers to contribute their models and datasets to MLCommons Inference.
arXiv Detail & Related papers (2021-11-09T16:11:49Z)
- The Benchmark Lottery [114.43978017484893]
"A benchmark lottery" describes the overall fragility of the machine learning benchmarking process.
We show that the relative performance of algorithms may be altered significantly simply by choosing different benchmark tasks.
arXiv Detail & Related papers (2021-07-14T21:08:30Z)
- Benchmarking TinyML Systems: Challenges and Direction [10.193715318589812]
We present the current landscape of TinyML and discuss the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads.
Our viewpoints reflect the collective thoughts of the TinyMLPerf working group that is comprised of over 30 organizations.
arXiv Detail & Related papers (2020-03-10T15:58:12Z)