Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
- URL: http://arxiv.org/abs/2202.08679v1
- Date: Thu, 17 Feb 2022 14:31:58 GMT
- Title: Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
- Authors: Alexander Isenko, Ruben Mayer, Jeffrey Jedele, Hans-Arno Jacobsen
- Abstract summary: Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy.
We introduce a new perspective on efficiently preparing datasets for end-to-end deep learning pipelines.
We obtain an increased throughput of 3x to 13x compared to an untuned system.
- Score: 77.45213180689952
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Preprocessing pipelines in deep learning aim to provide sufficient data
throughput to keep the training processes busy. Maximizing resource utilization
is becoming more challenging as the throughput of training processes increases
with hardware innovations (e.g., faster GPUs, TPUs, and interconnects) and
advanced parallelization techniques that yield better scalability. At the same
time, the amount of training data needed in order to train increasingly complex
models is growing. As a consequence of this development, data preprocessing and
provisioning are becoming a severe bottleneck in end-to-end deep learning
pipelines.
In this paper, we provide an in-depth analysis of data preprocessing
pipelines from four different machine learning domains. We introduce a new
perspective on efficiently preparing datasets for end-to-end deep learning
pipelines and extract individual trade-offs to optimize throughput,
preprocessing time, and storage consumption. Additionally, we provide an
open-source profiling library that can automatically decide on a suitable
preprocessing strategy to maximize throughput. By applying our generated
insights to real-world use-cases, we obtain an increased throughput of 3x to
13x compared to an untuned system while keeping the pipeline functionally
identical. These findings show the enormous potential of data pipeline tuning.
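The abstract frames dataset preparation as a trade-off between training throughput, one-off preprocessing time, and storage consumption. The paper's open-source profiling library is not reproduced here; the following is a minimal, hypothetical sketch of how such a trade-off can be measured on synthetic data, comparing a fully online strategy (preprocess every epoch) with a fully offline one (materialize a preprocessed copy once). The sample shapes, batch size, and preprocess function are illustrative assumptions.

```python
# Hypothetical trade-off measurement: online vs. offline preprocessing.
import os
import tempfile
import time

import numpy as np

rng = np.random.default_rng(0)
raw_samples = rng.integers(0, 255, size=(2048, 64, 64, 3), dtype=np.uint8)  # fake "raw" images


def preprocess(batch: np.ndarray) -> np.ndarray:
    """Stand-in for per-sample decode/augment/normalize work."""
    return (batch.astype(np.float32) / 255.0 - 0.5) * 2.0


def online_epoch() -> int:
    """Strategy A: preprocess on the fly every epoch (no extra storage)."""
    n = 0
    for i in range(0, len(raw_samples), 256):
        _ = preprocess(raw_samples[i:i + 256])
        n += len(raw_samples[i:i + 256])
    return n


# Strategy B: preprocess once, materialize to disk, then epochs only read.
path = os.path.join(tempfile.mkdtemp(), "preprocessed.npy")
t0 = time.perf_counter()
np.save(path, preprocess(raw_samples))
offline_prep_time = time.perf_counter() - t0
offline_storage = os.path.getsize(path)


def offline_epoch() -> int:
    """Strategy B per-epoch cost: read back the materialized samples."""
    data = np.load(path, mmap_mode="r")
    n = 0
    for i in range(0, len(data), 256):
        _ = np.asarray(data[i:i + 256])  # read only, no recompute
        n += len(data[i:i + 256])
    return n


for name, epoch_fn in [("online", online_epoch), ("offline", offline_epoch)]:
    t0 = time.perf_counter()
    n = epoch_fn()
    print(f"{name:>8}: {n / (time.perf_counter() - t0):,.0f} samples/s per epoch")

print(f"offline one-off preprocessing: {offline_prep_time:.2f} s, "
      f"extra storage: {offline_storage / 1e6:.1f} MB")
```

Under these assumptions, the online strategy pays the preprocessing cost every epoch but needs no extra storage, while the offline strategy trades a one-off preprocessing pass and additional disk space for higher per-epoch throughput; intermediate strategies (materializing only part of the pipeline) sit between these extremes.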
Related papers
- InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models [3.7414278978078204]
Deep learning-based recommender models (DLRMs) have become an essential component of many modern recommender systems.
The systems challenges faced in this setting are unique; while typical deep learning training jobs are dominated by model execution, the most important factor in DLRM training performance is often online data ingestion.
arXiv Detail & Related papers (2023-08-13T18:28:56Z)
- Deep Pipeline Embeddings for AutoML [11.168121941015015]
AutoML is a promising direction for democratizing AI by automatically deploying Machine Learning systems with minimal human expertise.
Existing Pipeline Optimization techniques fail to explore deep interactions between pipeline stages/components.
This paper proposes a novel neural architecture that captures the deep interaction between the components of a Machine Learning pipeline.
arXiv Detail & Related papers (2023-05-23T12:40:38Z)
- Understand Data Preprocessing for Effective End-to-End Training of Deep Neural Networks [8.977436072381973]
We run experiments to test the performance implications of the two major data preprocessing methods, which use either raw data or record files.
We identify the potential causes, exercise a variety of optimization methods, and present their pros and cons.
arXiv Detail & Related papers (2023-04-18T11:57:38Z)
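The raw-data vs. record-file question above is as much about storage layout as about compute. Below is a hypothetical, framework-free sketch (not the paper's actual TFRecord-based setup) that contrasts reading many per-sample files with reading one packed, length-prefixed record file; the 32 KiB blob size and the framing format are illustrative assumptions.

```python
# Hypothetical comparison: per-sample files ("raw data") vs. one packed record file.
import os
import tempfile
import time

import numpy as np

rng = np.random.default_rng(0)
samples = [rng.bytes(32 * 1024) for _ in range(1000)]  # fake encoded samples, 32 KiB each
root = tempfile.mkdtemp()

# Layout A: one file per sample.
raw_dir = os.path.join(root, "raw")
os.makedirs(raw_dir)
for i, blob in enumerate(samples):
    with open(os.path.join(raw_dir, f"{i:05d}.bin"), "wb") as f:
        f.write(blob)

# Layout B: a single record file with simple length-prefixed framing.
record_path = os.path.join(root, "data.rec")
with open(record_path, "wb") as f:
    for blob in samples:
        f.write(len(blob).to_bytes(8, "little"))
        f.write(blob)


def read_raw():
    """Many small opens and reads."""
    for name in sorted(os.listdir(raw_dir)):
        with open(os.path.join(raw_dir, name), "rb") as f:
            yield f.read()


def read_record():
    """One sequential scan over the packed file."""
    with open(record_path, "rb") as f:
        while header := f.read(8):
            yield f.read(int.from_bytes(header, "little"))


for name, reader in [("per-sample files", read_raw), ("packed record", read_record)]:
    t0 = time.perf_counter()
    n = sum(1 for _ in reader())
    print(f"{name}: read {n} samples in {time.perf_counter() - t0:.3f} s")
```

The packed layout avoids per-file metadata overhead and favors sequential I/O, at the cost of an extra materialization step and a second copy of the data; which side wins depends on the storage system and on how much decoding still happens online.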
- Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference [74.80730361332711]
Few-shot learning is an important and topical problem in computer vision.
We show that a simple transformer-based pipeline yields surprisingly good performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-15T02:55:58Z)
- Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably with supervised pre-training counterparts in 3 downstream tasks and 9 downstream datasets requiring 10x less data and 5x less pre-training time.
arXiv Detail & Related papers (2022-03-10T06:23:41Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Existing approaches, however, do not supply the procedures and pipelines needed to actually deploy machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Understanding and Co-designing the Data Ingestion Pipeline for Industry-Scale RecSys Training [5.058493679956239]
We present an extensive characterization of the data ingestion challenges for industry-scale recommendation model training.
First, dataset storage requirements are massive and variable, exceeding local storage capacities.
Second, reading and preprocessing data is computationally expensive, requiring substantially more compute, memory, and network resources than are available on the trainers themselves.
We introduce Data PreProcessing Service (DPP), a fully disaggregated preprocessing service that scales to hundreds of nodes, eliminating data stalls that can reduce training throughput by 56%.
arXiv Detail & Related papers (2021-08-20T21:09:34Z)
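DPP itself is an internal system and no code for it appears here; purely to illustrate the disaggregated-preprocessing idea, the sketch below uses TensorFlow's tf.data service, a different system that likewise moves the input pipeline onto separate dispatcher and worker processes. Running the servers in-process and the trivial map function are simplifications for demonstration; in a real deployment the dispatcher and workers would run on dedicated preprocessing nodes.

```python
# Illustrative only: disaggregated preprocessing with the tf.data service (not DPP).
import tensorflow as tf

# Start a dispatcher and one worker in-process; in production these would be
# separate preprocessing nodes reachable over the network.
dispatcher = tf.data.experimental.service.DispatchServer()
worker = tf.data.experimental.service.WorkerServer(
    tf.data.experimental.service.WorkerConfig(
        dispatcher_address=dispatcher.target.split("://")[1]))

dataset = tf.data.Dataset.range(1000)
dataset = dataset.map(lambda x: x * 2)  # stand-in for expensive preprocessing
dataset = dataset.apply(tf.data.experimental.service.distribute(
    processing_mode="parallel_epochs",  # each consumer receives a full epoch
    service=dispatcher.target))         # preprocessing now executes on the worker

# The trainer only consumes already-preprocessed elements.
print(sum(1 for _ in dataset))
```

The point of the disaggregation is that preprocessing capacity can be scaled independently of the number of trainers.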
- tf.data: A Machine Learning Data Processing Framework [0.4588028371034406]
Training machine learning models requires feeding input data for models to ingest.
We present tf.data, a framework for building and executing efficient input pipelines for machine learning jobs.
We demonstrate that input pipeline performance is critical to the end-to-end training time of state-of-the-art machine learning models.
arXiv Detail & Related papers (2021-01-28T17:16:46Z)
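To make the input-pipeline idea concrete, here is a minimal tf.data sketch using the standard parallel map, batch, and prefetch pattern; the dataset contents and the preprocess function are placeholders rather than anything taken from the paper.

```python
# Minimal tf.data input pipeline: parallel map, batching, and prefetching.
import tensorflow as tf


def preprocess(x):
    # Stand-in for decode/augment work; real pipelines would parse files here.
    return tf.cast(x, tf.float32) / 255.0


dataset = (
    tf.data.Dataset.range(10_000)
    .shuffle(buffer_size=1_000)
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # parallel preprocessing
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)  # overlap input processing with model execution
)

for batch in dataset.take(2):
    print(batch.shape)
```

Prefetching and autotuned parallelism are what keep the accelerator fed; when they are insufficient, the input pipeline becomes the training bottleneck that the surveyed papers measure.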
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.