Feature Selection for Learning to Predict Outcomes of Compute Cluster
Jobs with Application to Decision Support
- URL: http://arxiv.org/abs/2012.07982v1
- Date: Mon, 14 Dec 2020 22:35:02 GMT
- Title: Feature Selection for Learning to Predict Outcomes of Compute Cluster
Jobs with Application to Decision Support
- Authors: Adedolapo Okanlawon, Huichen Yang, Avishek Bose, William Hsu, Dan
Andresen, Mohammed Tanash
- Abstract summary: We present a machine learning framework and a new test bed for data mining from the Slurm Workload Manager for high-performance computing clusters.
The focus was to find a method for selecting features to support decisions: helping users decide whether to resubmit failed jobs with boosted CPU and memory allocations or migrate them to a computing cloud.
We present a supervised learning model trained on a Simple Linux Utility for Resource Management (Slurm) data set of HPC jobs using three different techniques for selecting features.
- Score: 7.55043162959755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a machine learning framework and a new test bed for data mining
from the Slurm Workload Manager for high-performance computing (HPC) clusters.
The focus was to find a method for selecting features to support decisions:
helping users decide whether to resubmit failed jobs with boosted CPU and
memory allocations or migrate them to a computing cloud. This task was cast as
both supervised classification and regression learning, specifically,
sequential problem solving suitable for reinforcement learning. Selecting
relevant features can improve training accuracy, reduce training time, and
produce a more comprehensible model, with an intelligent system that can
explain predictions and inferences. We present a supervised learning model
trained on a Simple Linux Utility for Resource Management (Slurm) data set of
HPC jobs using three different techniques for selecting features: linear
regression, lasso, and ridge regression. Our data set represented both HPC jobs
that failed and those that succeeded, so our model was reliable, less likely to
overfit, and generalizable. Our model achieved an R^2 of 95% with 99%
accuracy. We identified five predictors for both CPU and memory properties.
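The abstract names linear regression, lasso, and ridge regression as the three feature-selection techniques but gives no implementation detail. The following is a minimal illustrative sketch, not the authors' pipeline: it fits all three with a small coordinate-descent solver on synthetic data and shows that only the lasso (L1) penalty drives irrelevant coefficients to exactly zero. The feature names (`feat_a` to `feat_d`), the synthetic target, the penalty strengths, and the helper `fit_coordinate_descent` are all hypothetical and are not taken from the paper or its Slurm data set.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_coordinate_descent(X, y, l1=0.0, l2=0.0, sweeps=200):
    """Minimise (1/2n)||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2.

    l1 = l2 = 0 gives ordinary least squares, l1 > 0 gives lasso,
    l2 > 0 gives ridge. No intercept is fitted (data assumed centred).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Correlation of feature j with the residual that excludes j.
            rho = sum(v * (r + v * w[j]) for v, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, l1) / (z + l2)
            delta = new_wj - w[j]
            if delta != 0.0:
                residual = [r - v * delta for v, r in zip(col, residual)]
                w[j] = new_wj
    return w


# Hypothetical synthetic data: only feat_a and feat_c influence the target.
rng = random.Random(0)
names = ["feat_a", "feat_b", "feat_c", "feat_d"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[2] + rng.gauss(0, 0.1) for row in X]

ols = fit_coordinate_descent(X, y)             # plain linear regression
lasso = fit_coordinate_descent(X, y, l1=0.1)   # L1 penalty (assumed strength)
ridge = fit_coordinate_descent(X, y, l2=0.1)   # L2 penalty (assumed strength)

# Lasso selects features by keeping only the nonzero coefficients.
selected = [name for name, coef in zip(names, lasso) if coef != 0.0]
print("lasso-selected features:", selected)
```

With linear regression and ridge, every coefficient generally stays nonzero, so selection would need a magnitude cutoff or a significance test; the abstract does not say which criterion the authors applied.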
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets.
arXiv Detail & Related papers (2024-05-23T08:43:09Z) - Cross-Silo Prototypical Calibration for Federated Learning with Non-IID
Data [24.3384892417653]
Federated Learning aims to learn a global model on the server side that generalizes to all clients in a privacy-preserving manner.
To address this issue, this paper presents a cross-silo prototypical calibration method (FedCSPC).
FedCSPC takes additional prototype information from the clients to learn a unified feature space on the server side.
arXiv Detail & Related papers (2023-08-07T10:25:54Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - TIDo: Source-free Task Incremental Learning in Non-stationary
Environments [0.0]
Updating a model-based agent to learn new target tasks requires us to store past training data.
Few-shot task incremental learning methods overcome the limitation of labeled target datasets.
We propose a one-shot task incremental learning approach that can adapt to non-stationary source and target tasks.
arXiv Detail & Related papers (2023-01-28T02:19:45Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Deep Regression Unlearning [6.884272840652062]
We introduce deep regression unlearning methods that generalize well and are robust to privacy attacks.
We conduct regression unlearning experiments for computer vision, natural language processing and forecasting applications.
arXiv Detail & Related papers (2022-10-15T05:00:20Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)