Big Data Workload Profiling for Energy-Aware Cloud Resource Management
- URL: http://arxiv.org/abs/2601.11935v1
- Date: Sat, 17 Jan 2026 06:50:51 GMT
- Title: Big Data Workload Profiling for Energy-Aware Cloud Resource Management
- Authors: Milan Parikh, Aniket Abhishek Soni, Sneja Mitinbhai Shah, Ayush Raj Jha,
- Abstract summary: This paper presents a workload-aware and energy-efficient scheduling framework. It profiles CPU utilization, memory demand, and storage I/O behavior to guide virtual machine placement decisions. Results demonstrate consistent energy savings of 15 to 20 percent compared to a baseline scheduler.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud data centers face increasing pressure to reduce operational energy consumption as big data workloads continue to grow in scale and complexity. This paper presents a workload-aware and energy-efficient scheduling framework that profiles CPU utilization, memory demand, and storage I/O behavior to guide virtual machine placement decisions. By combining historical execution logs with real-time telemetry, the proposed system predicts the energy and performance impact of candidate placements and enables adaptive consolidation while preserving service-level agreement (SLA) compliance. The framework is evaluated using representative Hadoop MapReduce, Spark MLlib, and ETL workloads deployed on a multi-node cloud testbed. Experimental results demonstrate consistent energy savings of 15 to 20 percent compared to a baseline scheduler, with negligible performance degradation. These findings highlight workload profiling as a practical and scalable strategy for improving the sustainability of cloud-based big data processing environments.
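The abstract describes profile-guided placement only at a high level, so the following is a minimal Python sketch of the general idea, not the paper's actual algorithm. It assumes a linear host power model (idle power plus a term proportional to CPU utilization), assumes that an empty host can be powered down, and uses a fixed CPU utilization cap as a crude stand-in for SLA headroom. All names (`VmProfile`, `Host`, `place`, `cpu_cap`) and all numbers are hypothetical, and storage I/O is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class VmProfile:
    """Profiled demand of one VM (hypothetical fields)."""
    name: str
    cpu: float      # average CPU demand, in cores
    mem_gb: float   # peak memory demand, in GB


@dataclass
class Host:
    """A physical host with an assumed linear power model."""
    name: str
    cores: float
    mem_gb: float
    idle_watts: float
    max_watts: float
    vms: list = field(default_factory=list)

    def cpu_used(self) -> float:
        return sum(v.cpu for v in self.vms)

    def mem_used(self) -> float:
        return sum(v.mem_gb for v in self.vms)

    def power(self, extra_cpu: float = 0.0) -> float:
        """Estimated power draw in watts with `extra_cpu` cores added.

        Assumption: a host with no load is powered down and draws 0 W.
        """
        used = self.cpu_used() + extra_cpu
        if used == 0:
            return 0.0
        util = used / self.cores
        return self.idle_watts + (self.max_watts - self.idle_watts) * util


def place(vm: VmProfile, hosts: list, cpu_cap: float = 0.8):
    """Put `vm` on the feasible host with the smallest power increase.

    A host is feasible if the VM fits in memory and CPU utilization
    stays at or below `cpu_cap`. Returns the chosen host, or None if
    no host is feasible (the VM is then not placed).
    """
    best, best_delta = None, None
    for host in hosts:
        if host.cpu_used() + vm.cpu > cpu_cap * host.cores:
            continue
        if host.mem_used() + vm.mem_gb > host.mem_gb:
            continue
        delta = host.power(vm.cpu) - host.power()
        if best is None or delta < best_delta:
            best, best_delta = host, delta
    if best is not None:
        best.vms.append(vm)
    return best


if __name__ == "__main__":
    hosts = [Host("h1", 16, 64, 100.0, 250.0),
             Host("h2", 16, 64, 100.0, 250.0)]
    for vm in [VmProfile("spark", 6, 24),
               VmProfile("etl", 4, 16),
               VmProfile("hadoop", 5, 20)]:
        chosen = place(vm, hosts)
        print(vm.name, "->", chosen.name if chosen else "unplaced")
```

Because waking an empty host costs its full idle power under this model, the greedy rule tends to consolidate VMs onto hosts that are already running until the utilization cap is reached, which mirrors the consolidation behavior the abstract mentions. A real scheduler would also need measured power curves, I/O contention, and migration cost.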
Related papers
- Towards Carbon-Aware Container Orchestration: Predicting Workload Energy Consumption with Federated Learning [8.968986043976532]
We propose a federated learning approach for energy consumption prediction that preserves data privacy by keeping sensitive operational data within individual enterprises.
Our framework trains XGBoost models collaboratively across distributed clients using Flower's FedXgbBagging aggregation.
This work addresses the unresolved trade-off between data privacy and energy prediction efficiency in prior systems such as Kepler and CASPER.
arXiv Detail & Related papers (2025-10-04T23:01:59Z)
- CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload [9.00677252346245]
Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads.
Existing management techniques, including VM scheduling and resource provisioning, require accurate performance prediction.
We propose CloudFormer, a dual-branch Transformer-based model designed to predict VM performance degradation in black-box environments.
arXiv Detail & Related papers (2025-09-03T15:15:44Z)
- Energy-Efficient Federated Learning for Edge Real-Time Vision via Joint Data, Computation, and Communication Design [43.89869891417806]
Real-time computer vision (CV) applications on wireless edge devices demand energy-efficient and privacy-preserving learning.
We propose FedDPQ, an ultra energy-efficient FL framework for real-time CV over unreliable wireless networks.
arXiv Detail & Related papers (2025-08-03T13:05:11Z)
- Benchmarking of CPU-intensive Stream Data Processing in The Edge Computing Systems [41.19058376513831]
This paper evaluates the power consumption and performance characteristics of a single processing node within an edge cluster using a synthetic microbenchmark.
Results show how an optimal measure can lead to optimized usage of edge resources, given both performance and power consumption.
arXiv Detail & Related papers (2025-05-12T17:02:02Z)
- Optimized Cloud Resource Allocation Using Genetic Algorithms for Energy Efficiency and QoS Assurance [0.0]
This paper presents a Genetic Algorithm (GA)-based approach for Virtual Machine placement and consolidation.
The proposed method dynamically adjusts VM allocation based on real-time workload variations.
Experimental results show notable reductions in energy consumption, VM migrations, SLA violation rates, and execution time.
arXiv Detail & Related papers (2025-04-24T15:45:40Z)
- Scalable Federated Unlearning via Isolated and Coded Sharding [76.12847512410767]
Federated unlearning has emerged as a promising paradigm to erase the client-level data effect.
This paper proposes a scalable federated unlearning framework based on isolated sharding and coded computing.
arXiv Detail & Related papers (2024-01-29T08:41:45Z)
- Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach [48.18355658448509]
Recent breakthroughs in generative artificial intelligence have triggered a surge in demand for machine learning training, which poses significant cost burdens and environmental challenges due to its substantial energy consumption.
Scheduling training jobs among geographically distributed cloud data centers unveils the opportunity to optimize the usage of computing capacity powered by inexpensive and low-carbon energy.
We propose an algorithm based on multi-agent reinforcement learning and actor-critic methods to learn the optimal collaborative scheduling strategy through interacting with a cloud system built with real-life workload patterns, energy prices, and carbon intensities.
arXiv Detail & Related papers (2023-04-17T02:12:30Z)
- Balancing Performance and Energy Consumption of Bagging Ensembles for the Classification of Data Streams in Edge Computing [9.801387036837871]
Edge Computing (EC) has emerged as an enabling factor for developing technologies like the Internet of Things (IoT) and 5G networks.
This work investigates strategies for optimizing the performance and energy consumption of bagging ensembles to classify data streams.
arXiv Detail & Related papers (2022-01-17T04:12:18Z)
- Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum [55.6313942302582]
We propose a methodology to support the optimization of real-life applications on the Edge-to-Cloud Continuum.
Our approach relies on a rigorous analysis of possible configurations in a controlled testbed environment to understand their behaviour.
Our methodology can be generalized to other applications in the Edge-to-Cloud Continuum.
arXiv Detail & Related papers (2021-08-04T07:35:14Z)
- Performance and Energy-Aware Bi-objective Tasks Scheduling for Cloud Data Centers [0.0]
Cloud computing enables remote execution of users' tasks.
The pervasive adoption of cloud computing in smart city services and applications requires timely execution of tasks adhering to Quality of Service (QoS).
The increasing use of computing servers exacerbates the issues of high energy consumption, operating costs, and environmental pollution.
We propose a performance and energy optimization bi-objective algorithm to trade off the conflicting performance and energy objectives.
arXiv Detail & Related papers (2021-04-25T08:55:57Z)
- A Framework for Energy and Carbon Footprint Analysis of Distributed and Federated Edge Learning [48.63610479916003]
This article breaks down and analyzes the main factors that influence the environmental footprint of distributed learning policies.
It models both vanilla and decentralized FL policies driven by consensus.
Results show that FL allows remarkable end-to-end energy savings (30%-40%) for wireless systems characterized by low bit/Joule efficiency.
arXiv Detail & Related papers (2021-03-18T16:04:42Z)