Utilizing Ensemble Learning for Performance and Power Modeling and
Improvement of Parallel Cancer Deep Learning CANDLE Benchmarks
- URL: http://arxiv.org/abs/2011.06654v1
- Date: Thu, 12 Nov 2020 21:18:20 GMT
- Title: Utilizing Ensemble Learning for Performance and Power Modeling and
Improvement of Parallel Cancer Deep Learning CANDLE Benchmarks
- Authors: Xingfu Wu and Valerie Taylor
- Abstract summary: In this paper, we utilize ensemble learning to combine linear, nonlinear, and tree-/rule-based machine learning methods.
We use the datasets collected for two parallel cancer deep learning CANDLE benchmarks, NT3 and P1B2.
We achieve up to 61.15% performance improvement and up to 62.58% energy saving for P1B2 and up to 55.81% performance improvement and up to 52.60% energy saving for NT3 on up to 24,576 cores.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning (ML) continues to grow in importance across nearly all
domains and is a natural tool in modeling to learn from data. Often a tradeoff
exists between a model's ability to minimize bias and variance. In this paper,
we utilize ensemble learning to combine linear, nonlinear, and tree-/rule-based
ML methods to cope with the bias-variance tradeoff, yielding more accurate
models. Hardware performance counter values are correlated with properties of
applications that impact performance and power on the underlying system. We use
the datasets collected for two parallel cancer deep learning CANDLE benchmarks,
NT3 (weak scaling) and P1B2 (strong scaling), to build performance and power
models based on hardware performance counters using single-object and
multiple-objects ensemble learning to identify the most important counters for
improvement. Based on the insights from these models, we improve the
performance and energy of P1B2 and NT3 by optimizing the deep learning
environments TensorFlow, Keras, Horovod, and Python under the huge page size of
8 MB on the Cray XC40 Theta at Argonne National Laboratory. Experimental
results show that ensemble learning not only produces more accurate models but
also provides more robust performance counter ranking. We achieve up to 61.15%
performance improvement and up to 62.58% energy saving for P1B2 and up to
55.81% performance improvement and up to 52.60% energy saving for NT3 on up to
24,576 cores.
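To make the modeling approach concrete, here is a minimal sketch (assuming scikit-learn, with hypothetical counter names such as PAPI_TOT_INS and synthetic data, not the authors' datasets or exact pipeline): it stacks a linear, a nonlinear, and a tree-based regressor into one ensemble that predicts runtime from hardware performance counters, then ranks the counters by permutation importance.

    # Sketch: ensemble (stacked) regression of runtime on hardware counters,
    # with permutation importance as a proxy for counter ranking.
    # Counter names and data here are hypothetical placeholders.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor, StackingRegressor
    from sklearn.inspection import permutation_importance
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    counters = ["PAPI_TOT_INS", "PAPI_L2_TCM", "PAPI_RES_STL", "PAPI_BR_MSP"]
    rng = np.random.default_rng(0)
    X = pd.DataFrame(rng.random((200, len(counters))), columns=counters)
    # Synthetic "runtime" target for illustration only.
    y = 2.0 * X["PAPI_L2_TCM"] + 0.5 * X["PAPI_TOT_INS"] + rng.normal(0, 0.05, 200)

    # Combine a linear, a nonlinear, and a tree-based learner, as in the abstract.
    ensemble = StackingRegressor(
        estimators=[
            ("linear", LinearRegression()),
            ("mlp", make_pipeline(StandardScaler(),
                                  MLPRegressor(hidden_layer_sizes=(32,),
                                               max_iter=2000, random_state=0))),
            ("forest", RandomForestRegressor(n_estimators=200, random_state=0)),
        ],
        final_estimator=LinearRegression(),
    )

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    ensemble.fit(X_tr, y_tr)
    print("held-out R^2:", ensemble.score(X_te, y_te))

    # Rank counters by how much shuffling each one degrades the ensemble's accuracy.
    imp = permutation_importance(ensemble, X_te, y_te, n_repeats=20, random_state=0)
    for name, score in sorted(zip(counters, imp.importances_mean), key=lambda t: -t[1]):
        print(f"{name}: {score:.3f}")

For joint performance-and-power modeling, the same ensemble could be wrapped in sklearn.multioutput.MultiOutputRegressor so both targets are predicted from the same set of counters.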
Related papers
- The More the Merrier? Navigating Accuracy vs. Energy Efficiency Design Trade-Offs in Ensemble Learning Systems [13.462625365349663]
We analyzed three types of design decisions for ensemble learning regarding a potential trade-off between accuracy and energy consumption.
We measured accuracy (F1-score) and energy consumption in joules (for both training and inference).
We recommend designing ensembles of small size (2 or maximum 3 models), using subset-based training, majority voting, and energy-efficient ML algorithms like decision trees, Naive Bayes, or KNN.
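A hedged sketch of that recommendation (assuming scikit-learn; the dataset and hyperparameters are placeholders): a three-member ensemble of a decision tree, Naive Bayes, and KNN, each trained on its own subset of the data and combined by hard majority voting.

    # Sketch: small (3-model) hard-voting ensemble of lightweight classifiers,
    # each base model fit on its own data subset (a rough stand-in for
    # subset-based training). Dataset is a synthetic placeholder.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    models = [DecisionTreeClassifier(max_depth=8, random_state=0),
              GaussianNB(),
              KNeighborsClassifier(n_neighbors=5)]

    # Train each member on a disjoint third of the training data.
    subsets = np.array_split(np.random.default_rng(0).permutation(len(X_tr)), len(models))
    for model, idx in zip(models, subsets):
        model.fit(X_tr[idx], y_tr[idx])

    # Hard majority voting over the members' predictions.
    votes = np.stack([m.predict(X_te) for m in models])
    majority = (votes.sum(axis=0) >= 2).astype(int)
    print("ensemble F1:", f1_score(y_te, majority))

When all members share the full training set, sklearn.ensemble.VotingClassifier(voting="hard") expresses the same majority-voting step directly.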
arXiv Detail & Related papers (2024-07-03T08:45:17Z)
- Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures [56.200335252600354]
It is common practice to deploy pre-trained models in environments distinct from their native development settings.
This led to the introduction of interchange formats such as ONNX which, together with their runtime infrastructures, serve as standard deployment formats.
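As an illustration of such interchange formats (a sketch under assumptions: the toy model, tensor shapes, and file name are placeholders, and ONNX Runtime is used as one example runtime), a PyTorch model can be exported to ONNX and executed by a separate runtime infrastructure:

    # Sketch: export a toy PyTorch model to the ONNX interchange format and run it
    # with ONNX Runtime. Model, shapes, and file name are illustrative placeholders.
    import numpy as np
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).eval()
    dummy = torch.randn(1, 16)

    # Serialize the model into a runtime-independent .onnx file.
    torch.onnx.export(model, dummy, "toy_model.onnx",
                      input_names=["input"], output_names=["logits"])

    # Load and execute the exported graph in a different runtime infrastructure.
    session = ort.InferenceSession("toy_model.onnx", providers=["CPUExecutionProvider"])
    outputs = session.run(None, {"input": np.random.rand(1, 16).astype(np.float32)})
    print(outputs[0].shape)  # (1, 4)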
arXiv Detail & Related papers (2024-02-21T09:18:44Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and classify them from the predicted parts.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
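A rough PyTorch sketch of that kind of architecture, not the paper's implementation: a small fully convolutional segmenter predicts K part masks, a tiny linear classifier maps pooled part evidence to class logits, and both are trained end-to-end from one loss (a part-segmentation loss can be added when part annotations exist).

    # Sketch: a part-based classifier in the spirit described above. A small
    # segmentation network predicts K part masks; a tiny classifier maps pooled
    # part evidence to class logits. All components are illustrative placeholders.
    import torch
    import torch.nn as nn

    class PartBasedClassifier(nn.Module):
        def __init__(self, num_parts=8, num_classes=10):
            super().__init__()
            # Toy fully convolutional "part segmenter": image -> per-pixel part scores.
            self.segmenter = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, num_parts, 1),
            )
            # Tiny classifier on top of globally pooled part scores.
            self.classifier = nn.Linear(num_parts, num_classes)

        def forward(self, images):
            part_logits = self.segmenter(images)      # (B, K, H, W)
            pooled = part_logits.mean(dim=(2, 3))     # (B, K) global part evidence
            class_logits = self.classifier(pooled)    # (B, num_classes)
            return class_logits, part_logits

    # End-to-end training step driven by the classification loss.
    model = PartBasedClassifier()
    images = torch.randn(4, 3, 64, 64)
    labels = torch.randint(0, 10, (4,))
    class_logits, part_logits = model(images)
    loss = nn.functional.cross_entropy(class_logits, labels)
    loss.backward()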
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Complementary Ensemble Learning [1.90365714903665]
We derive a technique to improve the performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models that complement the uncertainty of a state-of-the-art model.
arXiv Detail & Related papers (2021-11-09T03:23:05Z)
- DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models [152.29364079385635]
As pre-trained models grow bigger, the fine-tuning process can be time-consuming and computationally expensive.
We propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
Our proposed framework, dubbed Dually Sparsity-Embedded Efficient Tuning (DSEE), aims to achieve two key objectives: (i) parameter-efficient fine-tuning and (ii) resource-efficient inference.
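A loose PyTorch illustration of sparsity in the weight updates (explicitly not the DSEE algorithm; layer sizes, density, and the random mask are assumptions): the pre-trained weights stay frozen and only a sparse additive delta, restricted to a fixed mask, is fine-tuned.

    # Sketch: sparse weight *updates* during fine-tuning, loosely in the spirit of
    # the entry above (not the DSEE algorithm). A frozen dense weight is fine-tuned
    # through a sparse additive delta restricted to a fixed random mask of entries.
    import torch
    import torch.nn as nn

    class SparseDeltaLinear(nn.Module):
        def __init__(self, base: nn.Linear, density: float = 0.05):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad_(False)              # pre-trained weights stay frozen
            mask = (torch.rand_like(base.weight) < density).float()
            self.register_buffer("mask", mask)       # fixed sparsity pattern
            self.delta = nn.Parameter(torch.zeros_like(base.weight))

        def forward(self, x):
            # Only the masked entries of delta contribute (and receive gradient).
            return nn.functional.linear(x, self.base.weight + self.delta * self.mask,
                                        self.base.bias)

    layer = SparseDeltaLinear(nn.Linear(128, 128))
    opt = torch.optim.Adam([layer.delta], lr=1e-3)
    loss = layer(torch.randn(8, 128)).sum()
    loss.backward()
    opt.step()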
arXiv Detail & Related papers (2021-10-30T03:29:47Z)
- Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E^3), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
arXiv Detail & Related papers (2021-10-07T11:58:35Z)
- Learning to Fairly Classify the Quality of Wireless Links [0.5352699766206808]
We propose a new tree-based link quality classifier that achieves high performance and fairly classifies the minority class.
We compare the tree-based model to a multilayer perceptron (MLP) non-linear model and two linear models, namely logistic regression (LR) and SVM, on a selected imbalanced dataset.
Our study shows that 1) non-linear models perform slightly better than linear models in general, 2) the proposed non-linear tree-based model yields the best performance trade-off considering F1, training time, and fairness, and 3) single-metric aggregated evaluations based only on accuracy can hide poor performance on the minority class.
arXiv Detail & Related papers (2021-02-23T12:23:27Z)
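The kind of comparison described in that last entry can be outlined with scikit-learn (a sketch on synthetic imbalanced data; a random forest stands in for the paper's tree-based classifier, and none of the settings are the paper's): it contrasts a tree-based model with MLP, LR, and SVM and reports minority-class F1 alongside accuracy so that accuracy alone cannot hide poor minority-class performance.

    # Sketch: comparing a tree-based classifier against MLP, LR, and SVM on an
    # imbalanced binary dataset, reporting minority-class F1 so that poor
    # minority-class performance is not hidden by aggregate accuracy.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=5000, n_features=20,
                               weights=[0.9, 0.1], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    models = {
        "tree-based (RF)": RandomForestClassifier(class_weight="balanced", random_state=0),
        "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0),
        "LR": LogisticRegression(max_iter=1000),
        "SVM": SVC(),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        print(f"{name:16s} acc={accuracy_score(y_te, pred):.3f} "
              f"F1(minority)={f1_score(y_te, pred, pos_label=1):.3f}")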
This list is automatically generated from the titles and abstracts of the papers on this site.