A Generic Performance Model for Deep Learning in a Distributed
Environment
- URL: http://arxiv.org/abs/2305.11665v1
- Date: Fri, 19 May 2023 13:30:34 GMT
- Title: A Generic Performance Model for Deep Learning in a Distributed
Environment
- Authors: Tulasi Kavarakuntla, Liangxiu Han, Huw Lloyd, Annabel Latham, Anthony
Kleerekoper, Samson B. Akintoye
- Abstract summary: We propose a generic performance model of an application in a distributed environment with a generic expression of the application execution time.
We have evaluated the proposed model on three deep learning frameworks (i.e., MXnet, and Pytorch)
- Score: 0.7829352305480285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Performance modelling of a deep learning application is essential to improve
and quantify the efficiency of the model framework. However, existing
performance models are mostly case-specific, with limited capability for the
new deep learning frameworks/applications. In this paper, we propose a generic
performance model of an application in a distributed environment with a generic
expression of the application execution time that considers the influence of
both intrinsic factors/operations (e.g. algorithmic parameters/internal
operations) and extrinsic scaling factors (e.g. the number of processors, data
chunks and batch size). We formulate it as a global optimization problem and
solve it using regularization on a cost function and differential evolution
algorithm to find the best-fit values of the constants in the generic
expression to match the experimentally determined computation time. We have
evaluated the proposed model on three deep learning frameworks (i.e.,
TensorFlow, MXnet, and Pytorch). The experimental results show that the
proposed model can provide accurate performance predictions and
interpretability. In addition, the proposed work can be applied to any
distributed deep neural network without instrumenting the code and provides
insight into the factors affecting performance and scalability.
Related papers
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z) - In2Core: Leveraging Influence Functions for Coreset Selection in Instruction Finetuning of Large Language Models [37.45103473809928]
We propose the In2Core algorithm, which selects a coreset by analyzing the correlation between training and evaluation samples with a trained model.
By applying our algorithm to instruction fine-tuning data of LLMs, we can achieve similar performance with just 50% of the training data.
arXiv Detail & Related papers (2024-08-07T05:48:05Z) - Learning Generalizable Program and Architecture Representations for Performance Modeling [0.3277163122167434]
PerfVec is a novel deep learning-based performance modeling framework.
It learns high-dimensional and independent/orthogonal program and microarchitecture representations.
PerfVec yields a foundation model that captures the performance essence of instructions.
arXiv Detail & Related papers (2023-10-25T17:24:01Z) - Towards Compute-Optimal Transfer Learning [82.88829463290041]
We argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance.
Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
arXiv Detail & Related papers (2023-04-25T21:49:09Z) - Analyzing the Performance of Deep Encoder-Decoder Networks as Surrogates
for a Diffusion Equation [0.0]
We study the use of encoder-decoder convolutional neural network (CNN) as surrogates for steady-state diffusion solvers.
Our results indicate that increasing the size of the training set has a substantial effect on reducing performance fluctuations and overall error.
arXiv Detail & Related papers (2023-02-07T22:53:19Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Using Graph Neural Networks to model the performance of Deep Neural
Networks [2.1151356984322307]
We develop a novel performance model that adopts a graph representation.
Experimental evaluation shows a 7:75x and 12x reduction in prediction error compared to the Halide and TVM models, respectively.
arXiv Detail & Related papers (2021-08-27T20:20:17Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual
Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z) - Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $mathcalO(D3)$ to $mathcalO(D2)$.
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts.
arXiv Detail & Related papers (2020-11-14T09:51:51Z) - An Advance on Variable Elimination with Applications to Tensor-Based
Computation [11.358487655918676]
We present new results on the classical algorithm of variable elimination, which underlies many algorithms including for probabilistic inference.
The results relate to exploiting functional dependencies, allowing one to perform inference and learning efficiently on models that have very large treewidth.
arXiv Detail & Related papers (2020-02-21T14:17:44Z) - Rethinking Generalization of Neural Models: A Named Entity Recognition
Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.