Related papers: Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines

Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines

URL: http://arxiv.org/abs/2111.04131v1
Date: Sun, 7 Nov 2021 17:15:57 GMT
Title: Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines
Authors: Michael Kuchnik and Ana Klimovic and Jiri Simsa and George Amvrosiadis and Virginia Smith
Abstract summary: We propose Plumber, a tool for finding bottlenecks in Machine Learning (ML) input pipelines. Across five representative ML pipelines, Plumber obtains speedups of up to 46x for pipelines. By automating caching, Plumber obtains end-to-end speedups of over 40% compared to state-of-the-art tuners.
Score: 7.022239953701528
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Input pipelines, which ingest and transform input data, are an essential part of training Machine Learning (ML) models. However, it is challenging to implement efficient input pipelines, as it requires reasoning about parallelism, asynchrony, and variability in fine-grained profiling information. Our analysis of over 2 million ML jobs in Google datacenters reveals that a significant fraction of model training jobs could benefit from faster input data pipelines. At the same time, our analysis reveals that most jobs do not saturate host hardware, pointing in the direction of software-based bottlenecks. Motivated by these findings, we propose Plumber, a tool for finding bottlenecks in ML input pipelines. Plumber uses an extensible and interprettable operational analysis analytical model to automatically tune parallelism, prefetching, and caching under host resource constraints. Across five representative ML pipelines, Plumber obtains speedups of up to 46x for misconfigured pipelines. By automating caching, Plumber obtains end-to-end speedups of over 40% compared to state-of-the-art tuners.

Related papers

Instrumentation and Analysis of Native ML Pipelines via Logical Query Plans [3.2362171533623054]
We envision highly-automated software platforms to assist data scientists with developing, validating, monitoring, and analysing their Machine Learning pipelines. We extract "logical query plans" from ML pipeline code relying on popular libraries. Based on these plans, we automatically infer pipeline semantics and instrument and rewrite the ML pipelines to enable diverse use cases without requiring data scientists to manually annotate or rewrite their code.
arXiv Detail & Related papers (2024-07-10T11:35:02Z)
PARTIME: Scalable and Parallel Processing Over Time with Deep Neural Networks [68.96484488899901]
We present PARTIME, a library designed to speed up neural networks whenever data is continuously streamed over time. PARTIME starts processing each data sample at the time in which it becomes available from the stream. Experiments are performed in order to empirically compare PARTIME with classic non-parallel neural computations in online learning.
arXiv Detail & Related papers (2022-10-17T14:49:14Z)
Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference [74.80730361332711]
Few-shot learning is an important and topical problem in computer vision. We show that a simple transformer-based pipeline yields surprisingly good performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-15T02:55:58Z)
Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines [77.45213180689952]
Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy. We introduce a new perspective on efficiently preparing datasets for end-to-end deep learning pipelines. We obtain an increased throughput of 3x to 13x compared to an untuned system.
arXiv Detail & Related papers (2022-02-17T14:31:58Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
FENXI: Deep-learning Traffic Analytics at the Edge [69.34903175081284]
We present FENXI, a system to run complex analytics by leveraging TPU. FENXI decouples operations and traffic analytics which operates at different granularities. Our analysis shows that FENXI can sustain forwarding line rate traffic processing requiring only limited resources.
arXiv Detail & Related papers (2021-05-25T08:02:44Z)
Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities [5.510431861706128]
We analyze provenance graphs of 3000 production ML pipelines at Google, comprising over 450,000 models trained, spanning a period of over four months. Our analysis reveals the characteristics, components, and topologies of typical industry-strength ML pipelines at various granularities. We identify several rich opportunities for optimization, leveraging traditional data management ideas.
arXiv Detail & Related papers (2021-03-30T00:46:29Z)
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers [47.194426122333205]
PipeTransformer is a distributed training algorithm for Transformer models. It automatically adjusts the pipelining and data parallelism by identifying and freezing some layers during the training. We evaluate PipeTransformer using Vision Transformer (ViT) on ImageNet and BERT on GLUE and SQuAD datasets.
arXiv Detail & Related papers (2021-02-05T13:39:31Z)
tf.data: A Machine Learning Data Processing Framework [0.4588028371034406]
Training machine learning models requires feeding input data for models to ingest. We present tf.data, a framework for building and executing efficient input pipelines for machine learning jobs. We demonstrate that input pipeline performance is critical to the end-to-end training time of state-of-the-art machine learning models.
arXiv Detail & Related papers (2021-01-28T17:16:46Z)
AutoWeka4MCPS-AVATAR: Accelerating Automated Machine Learning Pipeline Composition and Optimisation [13.116806430326513]
We propose a novel method to evaluate the validity of ML pipelines, without their execution, using a surrogate model (AVATAR) The AVATAR generates a knowledge base by automatically learning the capabilities and effects of ML algorithms on datasets' characteristics. Instead of executing the original ML pipeline to evaluate its validity, the AVATAR evaluates its surrogate model constructed by capabilities and effects of the ML pipeline components.
arXiv Detail & Related papers (2020-11-21T14:05:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.