Related papers: A Constraint Programming-based Job Dispatcher for Modern HPC Systems and Applications

URL: http://arxiv.org/abs/2009.10348v2
Date: Mon, 28 Sep 2020 20:28:03 GMT
Title: A Constraint Programming-based Job Dispatcher for Modern HPC Systems and Applications
Authors: Cristian Galleguillos, Zeynep Kiziltan, Ricardo Soto
Abstract summary: We present a new CP-based on-line job dispatcher for modern HPC systems and applications. Unlike its predecessors, our new dispatcher tackles the entire problem in CP and its model size is independent of the system size.
Score: 2.022078407932399
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Constraint Programming (CP) is a well-established area in AI as a programming paradigm for modelling and solving discrete optimization problems, and it has been successfully applied to tackle the on-line job dispatching problem in HPC systems including those running modern applications. The limitations of the available CP-based job dispatchers may hinder their practical use in today's systems that are becoming larger in size and more demanding in resource allocation. In an attempt to bring basic AI research closer to a deployed application, we present a new CP-based on-line job dispatcher for modern HPC systems and applications. Unlike its predecessors, our new dispatcher tackles the entire problem in CP and its model size is independent of the system size. Experimental results based on a simulation study show that with our approach dispatching performance increases significantly in a large system and in a system where allocation is nontrivial.

Related papers

Efficient Domain Adaptation of Multimodal Embeddings using Constrastive Learning [0.08192907805418582]
Current approaches either yield subpar results when using pretrained models without task-specific adaptation, or require substantial computational resources for fine-tuning. We propose a novel method for adapting foundational, multimodal embeddings to downstream tasks, without the need of expensive fine-tuning processes.
arXiv Detail & Related papers (2025-02-04T06:30:12Z)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks. We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model. Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
Towards Single-System Illusion in Software-Defined Vehicles -- Automated, AI-Powered Workflow [3.2821049498759094]
We propose a novel model- and feature-based approach to development of vehicle software systems. One of the key points of the presented approach is the inclusion of modern generative AI, specifically Large Language Models (LLMs). The resulting pipeline is automated to a large extent, with feedback being generated at each step.
arXiv Detail & Related papers (2024-03-21T15:07:57Z)
Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach [58.911515417156174]
We propose a new definition of Age of Information (AoI) and, based on the redefined AoI, we formulate an online AoI problem for MEC systems. We introduce Post-Decision States (PDSs) to exploit the partial knowledge of the system's dynamics. We also combine PDSs with deep RL to further improve the algorithm's applicability, scalability, and robustness.
arXiv Detail & Related papers (2023-12-01T01:30:49Z)
An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling Problems Based on Constraint Programming [5.070542698701157]
This paper proposes a novel end-to-end approach to solving scheduling problems by means of CP and Reinforcement Learning (RL). Our approach leverages existing CP solvers to train an agent learning a Priority Dispatching Rule (PDR) that generalizes well to large instances, even from separate datasets.
arXiv Detail & Related papers (2023-06-09T08:24:56Z)
ALT: An Automatic System for Long Tail Scenario Modeling [15.76033166478158]
We present an automatic system named ALT to deal with this problem. Several efforts are taken to improve the algorithms used in our system, such as employing various automatic machine learning related techniques. To build the system, many optimizations are performed from a systematic perspective, and essential modules are armed.
arXiv Detail & Related papers (2023-05-19T02:35:39Z)
MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms. Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return. We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
Human-in-the-Loop Large-Scale Predictive Maintenance of Workstations [89.51621054382878]
Predictive maintenance (PdM) is the task of scheduling maintenance operations based on a statistical analysis of the system's condition. We propose a human-in-the-loop PdM approach in which a machine learning system predicts future problems in sets of workstations.
arXiv Detail & Related papers (2022-06-23T09:40:46Z)
MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems [12.215537834860699]
Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize application Quality of Service (QoS). We propose MCDS: Monte Carlo Learning using Deep Surrogate Models to efficiently schedule workflow applications in mobile edge-cloud computing systems.
arXiv Detail & Related papers (2021-12-14T10:00:01Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC. We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
A Novel Multi-Agent System for Complex Scheduling Problems [2.294014185517203]
This paper presents the conception and implementation of a multi-agent system that is applicable in various problem domains. We simulate an NP-hard scheduling problem to demonstrate the validity of our approach. This paper highlights the advantages of the agent-based approach, like the reduction in layout complexity, improved control of complicated systems, and extendability.
arXiv Detail & Related papers (2020-04-20T14:04:58Z)
Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL. We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.