A Makespan and Energy-Aware Scheduling Algorithm for Workflows under
Reliability Constraint on a Multiprocessor Platform
- URL: http://arxiv.org/abs/2212.09274v1
- Date: Mon, 19 Dec 2022 07:03:04 GMT
- Title: A Makespan and Energy-Aware Scheduling Algorithm for Workflows under
Reliability Constraint on a Multiprocessor Platform
- Authors: Atharva Tekawade and Suman Banerjee
- Abstract summary: We propose a workflow scheduling algorithm to minimize the makespan and energy for a given reliability constraint.
We show that our algorithms, MERT and EAFTS, outperform the state-of-the-art approaches.
- Score: 11.427019313284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many scientific workflows can be modeled as a Directed Acyclic Graph
(henceforth mentioned as DAG) where the nodes represent individual tasks, and
the directed edges represent data and control flow dependency between two
tasks. Due to the large volume of data, multiprocessor systems are often used
to execute these workflows. Hence, scheduling the tasks of a workflow to
achieve certain goals (such as minimizing the makespan, energy, or maximizing
reliability, processor utilization, etc.) remains an active area of research in
embedded systems. In this paper, we propose a workflow scheduling algorithm to
minimize the makespan and energy for a given reliability constraint. If the
reliability constraint is higher, we further propose Energy Aware Fault
Tolerant Scheduling (henceforth mentioned as EAFTS) based on active
replication. Additionally, given that the allocation of task nodes to
processors is known, we develop a frequency allocation algorithm that assigns
frequencies to the processors. Mathematically we show that our algorithms can
work for any satisfiable reliability constraint. We analyze the proposed
solution approaches to understand their time requirements. Experiments with
real-world workflows show that our algorithms, MERT and EAFTS, outperform the
state-of-the-art approaches. In particular, we observe that MERT gives 3.12% lower
energy consumption and 14.14% lower makespan on average. In the fault-tolerant
setting, our method EAFTS gives 11.11% lower energy consumption on average
when compared with the state-of-the-art approaches.
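The three quantities the abstract trades off (makespan, energy, and reliability) can be made concrete with a small sketch. This is not the paper's MERT or EAFTS algorithm. It only evaluates a given task-to-processor and frequency assignment under a commonly used DVFS model, all of which is assumed here: execution time is work / f, power is a static term plus f^3, and the transient fault rate grows exponentially as the frequency drops. The workflow `TASKS`, the function `evaluate`, and every parameter value are hypothetical, and communication costs are ignored.

```python
import math
from collections import defaultdict

# Hypothetical four-task workflow: task -> (work at normalised speed, predecessors).
TASKS = {
    "a": (4.0, []),
    "b": (2.0, ["a"]),
    "c": (3.0, ["a"]),
    "d": (1.0, ["b", "c"]),
}


def evaluate(order, proc_of, freq_of, lam0=1e-5, d=3.0,
             f_min=0.5, f_max=1.0, p_static=0.1):
    """Return (makespan, energy, reliability) for a fixed schedule.

    order   -- tasks in a topological order of the DAG
    proc_of -- task -> processor id
    freq_of -- processor id -> normalised frequency in [f_min, f_max]
    All model constants are illustrative assumptions, not values from the paper.
    """
    finish = {}                      # finish time of each task
    proc_free = defaultdict(float)   # time at which each processor becomes idle
    energy = 0.0
    log_reliability = 0.0
    for task in order:
        work, preds = TASKS[task]
        proc = proc_of[task]
        f = freq_of[proc]
        # A task starts once its processor is idle and all predecessors are done.
        start = max([proc_free[proc]] + [finish[p] for p in preds])
        duration = work / f
        finish[task] = start + duration
        proc_free[proc] = finish[task]
        # Assumed power model: static power plus a cubic dynamic term.
        energy += (p_static + f ** 3) * duration
        # Assumed fault model: the rate rises exponentially as f is lowered.
        lam = lam0 * 10 ** (d * (f_max - f) / (f_max - f_min))
        log_reliability -= lam * duration
    return max(finish.values()), energy, math.exp(log_reliability)


if __name__ == "__main__":
    order = ["a", "b", "c", "d"]
    proc_of = {"a": 0, "b": 0, "c": 1, "d": 0}
    for f in (1.0, 0.5):
        makespan, energy, rel = evaluate(order, proc_of, {0: f, 1: f})
        print(f"f={f}: makespan={makespan:.2f} energy={energy:.2f} reliability={rel:.4f}")
```

Under these assumptions, lowering the frequency reduces energy but lengthens the makespan and lowers the reliability, which is the tension a reliability-constrained scheduler has to resolve.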
Related papers
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z) - Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z) - Distributed Deep Learning Inference Acceleration using Seamless
Collaboration in Edge Computing [93.67044879636093]
This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing.
We design a novel task collaboration scheme, named HALP, in which the overlapping zone of the sub-tasks on secondary edge servers (ESs) is executed on the host ES.
Experimental results show that HALP can accelerate CNN inference in VGG-16 by 1.7-2.0x for a single task and 1.7-1.8x for 4 tasks per batch on GTX 1080TI and JETSON AGX Xavier.
arXiv Detail & Related papers (2022-07-22T18:39:09Z) - A heuristic method for data allocation and task scheduling on
heterogeneous multiprocessor systems under memory constraints [14.681986126866452]
This paper focuses on the data allocation and task scheduling problem under memory constraints.
We propose a tabu search algorithm (TS) which combines several distinguished features.
Experimental results show that the proposed TS algorithm can obtain relatively high-quality solutions in a reasonable computational time.
arXiv Detail & Related papers (2022-05-09T10:46:08Z) - Over-the-Air Federated Multi-Task Learning via Model Sparsification and
Turbo Compressed Sensing [48.19771515107681]
We propose an over-the-air FMTL framework, where multiple learning tasks deployed on edge devices share a non-orthogonal fading channel under the coordination of an edge server.
In OA-FMTL, the local updates of edge devices are sparsified, compressed, and then sent over the uplink channel in a superimposed fashion.
We analyze the performance of the proposed OA-FMTL framework together with the M-Turbo-CS algorithm.
arXiv Detail & Related papers (2022-05-08T08:03:52Z) - Energy-Efficient Model Compression and Splitting for Collaborative
Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO_2$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z) - Energy Efficient Edge Computing: When Lyapunov Meets Distributed
Reinforcement Learning [12.845204986571053]
In this work, we study the problem of energy-efficient offloading enabled by edge computing.
In the considered scenario, multiple users simultaneously compete for radio and edge computing resources.
The proposed solution also increases the network's energy efficiency compared to a benchmark approach.
arXiv Detail & Related papers (2021-03-31T11:02:29Z) - Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in
Space-Air-Ground Integrated Network [24.022108191145527]
We investigate a computing task scheduling problem in space-air-ground integrated network (SAGIN) for delay-oriented Internet of Things (IoT) services.
In the considered scenario, an unmanned aerial vehicle (UAV) collects computing tasks from IoT devices and then makes online offloading decisions.
Our objective is to design a task scheduling policy that minimizes offloading and computing delay of all tasks given the UAV energy capacity constraint.
arXiv Detail & Related papers (2020-10-04T02:58:03Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z) - Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing
System [12.813275501138193]
Taskflow aims to streamline the building of parallel and heterogeneous applications using a lightweight task graph-based approach.
Our programming model distinguishes itself as a very general class of task graph parallelism with in-graph control flow.
We have demonstrated the promising performance of Taskflow in real-world applications.
arXiv Detail & Related papers (2020-04-23T00:21:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.