Related papers: Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems

URL: http://arxiv.org/abs/2602.17508v2
Date: Fri, 20 Feb 2026 09:34:51 GMT
Title: Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems
Authors: Pranay Jain, Maximilian Kasper, Göran Köber, Oliver Amft, Axel Plinge, Dominik Seuß
Abstract summary: The research highlights a near-linear correlation between floating-point operations (FLOPs) and inference time, offering a reliable metric for estimating computational demands. We show how to balance trade-offs between energy consumption and model accuracy, ensuring that AI applications meet performance requirements without compromising sustainability. This work provides insights for developers, guiding them to design energy-efficient AI systems that deliver high performance in real-world applications.
Score: 2.584048323685663
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This work presents a practical benchmarking framework for optimizing artificial intelligence (AI) models on ARM Cortex processors (M0+, M4, M7), focusing on energy efficiency, accuracy, and resource utilization in embedded systems. Through the design of an automated test bench, we provide a systematic approach to evaluate across key performance indicators (KPIs) and identify optimal combinations of processor and AI model. The research highlights a near-linear correlation between floating-point operations (FLOPs) and inference time, offering a reliable metric for estimating computational demands. Using Pareto analysis, we demonstrate how to balance trade-offs between energy consumption and model accuracy, ensuring that AI applications meet performance requirements without compromising sustainability. Key findings indicate that the M7 processor is ideal for short inference cycles, while the M4 processor offers better energy efficiency for longer inference tasks. The M0+ processor, while less efficient for complex AI models, remains suitable for simpler tasks. This work provides insights for developers, guiding them to design energy-efficient AI systems that deliver high performance in real-world applications.
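
The Pareto analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' test bench or code; it only shows the general idea of keeping the configurations that no other configuration beats on both energy and accuracy. The `Candidate` type, the configuration labels, and all numbers below are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str          # hypothetical processor/model configuration label
    energy_mj: float   # energy per inference in millijoules (lower is better)
    accuracy: float    # accuracy in [0, 1] (higher is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.energy_mj <= b.energy_mj and a.accuracy >= b.accuracy
    strictly_better = a.energy_mj < b.energy_mj or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep every candidate that no other candidate dominates, sorted by energy."""
    front = [c for c in candidates
             if not any(dominates(other, c) for other in candidates)]
    return sorted(front, key=lambda c: c.energy_mj)


# Made-up values for illustration only; NOT results from the paper.
candidates = [
    Candidate("config_a", 0.8, 0.71),
    Candidate("config_b", 1.5, 0.86),
    Candidate("config_c", 1.0, 0.70),  # dominated by config_a
    Candidate("config_d", 2.4, 0.86),  # dominated by config_b
    Candidate("config_e", 4.0, 0.93),
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.name}: {c.energy_mj} mJ, accuracy {c.accuracy}")
```

The pairwise comparison is quadratic in the number of candidates, which is acceptable for the handful of processor and model combinations a benchmark of this kind would typically compare.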

Related papers

AI-NativeBench: An Open-Source White-Box Agentic Benchmark Suite for AI-Native Systems [52.65695508605237]
We introduce AI-NativeBench, the first application-centric and white-box AI-Native benchmark suite grounded in Model Context Protocol (MCP) and Agent-to-Agent (A2A) standards. By treating agentic spans as first-class citizens within distributed traces, our methodology enables granular analysis of engineering characteristics beyond simple capabilities. This work provides the first systematic evidence to guide the transition from measuring model capability to engineering reliable AI-Native systems.
arXiv Detail & Related papers (2026-01-14T11:32:07Z)
What Artificial Intelligence can do for High-Performance Computing systems? [0.0]
This review assesses how artificial intelligence (AI), including machine learning (ML) and optimization, improves the efficiency of operational HPC systems. Approximately 1,800 publications from 2019 to 2025 were manually screened using predefined inclusion/exclusion criteria. 74 "AI for HPC" papers were retained and grouped into six application areas: performance estimation, performance optimization, scheduling, surrogate modeling, fault detection, and language-model-based automation.
arXiv Detail & Related papers (2026-01-03T19:25:23Z)
Smart but Costly? Benchmarking LLMs on Functional Accuracy and Energy Efficiency [5.771786260272727]
We present a framework, BRACE, to benchmark Code Language Models on a unified scale of energy efficiency and functional correctness. We propose two rating methods: Concentric Incremental Rating Circles (CIRC) and Observation to Expectation Rating (OTER). Our analysis reveals that models generally perform better in code summarization tasks, as they are not forced to generate grammar-based, syntactically correct output.
arXiv Detail & Related papers (2025-11-10T23:44:48Z)
Metrics and evaluations for computational and sustainable AI efficiency [26.52588349722099]
Current approaches fail to provide a holistic view, making it difficult to compare and optimise systems. We propose a unified and reproducible methodology for AI model inference that integrates computational and environmental metrics. Our framework provides pragmatic, carbon-aware evaluation by systematically measuring latency and throughput distributions, energy consumption, and location-adjusted carbon emissions.
arXiv Detail & Related papers (2025-10-18T03:30:15Z)
EfficientLLM: Efficiency in Large Language Models [64.3537131208038]
Large Language Models (LLMs) have driven significant progress, yet their growing parameter counts and context windows incur prohibitive compute, energy, and monetary costs. We introduce EfficientLLM, a novel benchmark and the first comprehensive empirical study evaluating efficiency techniques for LLMs at scale.
arXiv Detail & Related papers (2025-05-20T02:27:08Z)
Edge-Cloud Collaborative Computing on Distributed Intelligence and Model Optimization: A Survey [58.50944604905037]
Edge-cloud collaborative computing (ECCC) has emerged as a pivotal paradigm for addressing the computational demands of modern intelligent applications. Recent advancements in AI, particularly deep learning and large language models (LLMs), have dramatically enhanced the capabilities of these distributed systems. This survey provides a structured tutorial on fundamental architectures, enabling technologies, and emerging applications.
arXiv Detail & Related papers (2025-05-03T13:55:38Z)
A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation. Deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency. This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs. We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections. Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z)
An Energy-Aware Approach to Design Self-Adaptive AI-based Applications on the Edge [42.462246527457594]
We present an energy-aware approach for the design and deployment of self-adaptive AI-based applications. We address the problem of determining the set of configurations that can be used to self-adapt the system with a meta-heuristic search procedure. Results show that our self-adaptive application can outperform non-adaptive baseline configurations by saving up to 81% of energy while losing only between 2% and 6% in accuracy.
arXiv Detail & Related papers (2023-08-31T09:33:44Z)
A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to address the demands of high-performance machine learning models with environmental sustainability. Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance. However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
AIPerf: Automated machine learning as an AI-HPC benchmark [17.57686674304368]
We propose an end-to-end benchmark suite utilizing automated machine learning (AutoML). We implement the algorithms in a highly parallel and flexible way to ensure the efficiency and optimization potential on diverse systems. With a flexible workload and a single metric, our benchmark can scale and rank AI-HPC systems easily.
arXiv Detail & Related papers (2020-08-17T08:06:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.