Ecomap: Sustainability-Driven Optimization of Multi-Tenant DNN Execution on Edge Servers
- URL: http://arxiv.org/abs/2503.04148v1
- Date: Thu, 06 Mar 2025 06:56:51 GMT
- Title: Ecomap: Sustainability-Driven Optimization of Multi-Tenant DNN Execution on Edge Servers
- Authors: Varatheepan Paramanayakam, Andreas Karatzas, Dimitrios Stamoulis, Iraklis Anagnostopoulos
- Abstract summary: This paper introduces Ecomap, a framework that adjusts the maximum power threshold of edge devices based on real-time carbon intensity. Experimental results using NVIDIA Jetson AGX Xavier demonstrate that Ecomap reduces carbon emissions by an average of 30%.
- Score: 0.44784055850794474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Edge computing systems struggle to efficiently manage multiple concurrent deep neural network (DNN) workloads while meeting strict latency requirements, minimizing power consumption, and maintaining environmental sustainability. This paper introduces Ecomap, a sustainability-driven framework that dynamically adjusts the maximum power threshold of edge devices based on real-time carbon intensity. Ecomap incorporates the innovative use of mixed-quality models, allowing it to dynamically replace computationally heavy DNNs with lighter alternatives when latency constraints are violated, ensuring service responsiveness with minimal accuracy loss. Additionally, it employs a transformer-based estimator to guide efficient workload mappings. Experimental results using NVIDIA Jetson AGX Xavier demonstrate that Ecomap reduces carbon emissions by an average of 30% and achieves a 25% lower carbon delay product (CDP) compared to state-of-the-art methods, while maintaining comparable or better latency and power efficiency.
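The mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the linear power-cap policy, the threshold values, the `ModelVariant` type, and the definition of CDP as emitted carbon multiplied by delay (by analogy with the energy-delay product) are all assumptions made here for illustration. The transformer-based estimator is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ModelVariant:
    """Hypothetical profile of one DNN variant (names and fields are assumptions)."""
    name: str
    latency_ms: float
    accuracy: float


def power_cap_watts(carbon_intensity, ci_low=100.0, ci_high=500.0,
                    p_min=10.0, p_max=30.0):
    """Assumed policy: lower the power cap linearly as carbon intensity
    (gCO2/kWh) rises from ci_low to ci_high; clamp outside that range."""
    frac = (carbon_intensity - ci_low) / (ci_high - ci_low)
    frac = min(1.0, max(0.0, frac))
    return p_max - frac * (p_max - p_min)


def pick_variant(variants, latency_budget_ms):
    """Pick the most accurate variant that meets the latency budget;
    fall back to the fastest variant if none does."""
    feasible = [v for v in variants if v.latency_ms <= latency_budget_ms]
    if feasible:
        return max(feasible, key=lambda v: v.accuracy)
    return min(variants, key=lambda v: v.latency_ms)


def carbon_grams(avg_power_w, duration_s, carbon_intensity):
    """Emissions in gCO2: energy in kWh times carbon intensity in gCO2/kWh."""
    energy_kwh = avg_power_w * duration_s / 3_600_000
    return energy_kwh * carbon_intensity


def carbon_delay_product(carbon_g, delay_s):
    """Assumed CDP definition: emitted carbon multiplied by delay."""
    return carbon_g * delay_s


if __name__ == "__main__":
    variants = [
        ModelVariant("heavy", latency_ms=80.0, accuracy=0.80),
        ModelVariant("light", latency_ms=30.0, accuracy=0.76),
    ]
    cap = power_cap_watts(carbon_intensity=300.0)
    chosen = pick_variant(variants, latency_budget_ms=50.0)
    grams = carbon_grams(cap, duration_s=60.0, carbon_intensity=300.0)
    print(cap, chosen.name, grams, carbon_delay_product(grams, 0.03))
```

In this sketch the heavy model is swapped for the lighter one only when the latency budget cannot be met, which mirrors the mixed-quality idea in the abstract at a very coarse level.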
Related papers
- CarbonCall: Sustainability-Aware Function Calling for Large Language Models on Edge Devices [0.44784055850794474]
Large Language Models (LLMs) enable real-time function calling in edge AI systems but introduce significant computational overhead, leading to high power consumption and carbon emissions.
We introduce CarbonCall, a sustainability-aware function-calling framework that integrates dynamic tool selection, carbon-aware execution, and quantized adaptation.
Experiments on an NVIDIA Jetson AGX Orin show that CarbonCall reduces carbon emissions by up to 52%, power consumption by 30%, and execution time by 30%, while maintaining high efficiency.
arXiv Detail & Related papers (2025-04-29T01:37:08Z)
- EPSILON: Adaptive Fault Mitigation in Approximate Deep Neural Network using Statistical Signatures [0.0]
We introduce EPSILON, a lightweight framework for efficient fault detection and mitigation in approximate deep neural networks (AxDNNs).
Our framework introduces a novel non-parametric pattern-matching algorithm that enables constant-time fault detection without interrupting normal execution.
EPSILON maintains model accuracy by intelligently adjusting mitigation strategies based on a statistical analysis of weight distribution and layer criticality.
arXiv Detail & Related papers (2025-04-24T20:37:37Z)
- Task Delay and Energy Consumption Minimization for Low-altitude MEC via Evolutionary Multi-objective Deep Reinforcement Learning [52.64813150003228]
The low-altitude economy (LAE), driven by unmanned aerial vehicles (UAVs) and other aircraft, has revolutionized fields such as transportation, agriculture, and environmental monitoring.
In the upcoming sixth-generation (6G) era, UAV-assisted mobile edge computing (MEC) is particularly crucial in challenging environments such as mountainous or disaster-stricken areas.
The task offloading problem is one of the key issues in UAV-assisted MEC, primarily addressing the trade-off between minimizing the task delay and the energy consumption of the UAV.
arXiv Detail & Related papers (2025-01-11T02:32:42Z)
- Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs.
We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections.
Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to address the demands of high-performance machine learning models with environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
- Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference [22.9834921448069]
We propose ODiMO, a hardware-aware tool that performs a fine-grain mapping across different accelerators on-chip.
We show that ODiMO reduces energy/latency by up to 33%/31% with limited accuracy drop (-0.53%/-0.32%) compared to manual mappings.
arXiv Detail & Related papers (2023-06-08T09:23:46Z)
- DVFO: Learning-Based DVFS for Energy-Efficient Edge-Cloud Collaborative Inference [12.095934624748686]
We propose DVFO, a novel DVFS-enabled edge-cloud collaborative inference framework.
It automatically co-optimizes the CPU, GPU and memory frequencies of edge devices, and the feature maps to be offloaded to cloud servers.
It significantly reduces the energy consumption by 33% on average, compared to state-of-the-art schemes.
arXiv Detail & Related papers (2023-06-02T07:00:42Z)
- Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO_2$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
- EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimization for multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)
- DRACO: Co-Optimizing Hardware Utilization, and Performance of DNNs on Systolic Accelerator [5.65116500037191]
We propose data reuse computation aware co-optimization (DRACO).
DRACO improves the PE utilization of memory-bound DNNs without any additional need for dataflow/micro-architecture modifications.
Unlike the previous co-optimization methods, DRACO not only maximizes performance and energy efficiency but also improves the predictive performance of DNNs.
arXiv Detail & Related papers (2020-06-26T17:06:41Z)