Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR
- URL: http://arxiv.org/abs/2210.06352v1
- Date: Fri, 7 Oct 2022 17:46:46 GMT
- Title: Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR
- Authors: Sami Alabed, Dominik Grewe, Juliana Franco, Bart Chrzaszcz, Tom Natan,
Tamara Norman, Norman A. Rink, Dimitrios Vytiniotis, Michael Schaarschmidt
- Abstract summary: We present an automatic partitioner that identifies efficient combinations for many model architectures and accelerator systems.
Our key findings are that a Monte Carlo Tree Search-based partitioner leveraging partition-specific compiler analysis directly into the search and guided goals matches expert-level strategies for various models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large neural network models are commonly trained through a combination of
advanced parallelism strategies in a single program, multiple data (SPMD)
paradigm. For example, training large transformer models requires combining
data, model, and pipeline partitioning; and optimizer sharding techniques.
However, identifying efficient combinations for many model architectures and
accelerator systems requires significant manual analysis. In this work, we
present an automatic partitioner that identifies these combinations through a
goal-oriented search. Our key findings are that a Monte Carlo Tree Search-based
partitioner leveraging partition-specific compiler analysis directly into the
search and guided goals matches expert-level strategies for various models.
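To make the idea of a search-based partitioner concrete, the following is a minimal sketch of Monte Carlo Tree Search over per-tensor sharding decisions. It is not PartIR's implementation: the tensor sizes, the `cost` function, the memory budget, and all names here are hypothetical stand-ins for the compiler analysis and goals the abstract refers to.

```python
import math
import random

# Hypothetical toy problem: for each weight tensor, either replicate it on
# every device or shard it across NUM_DEVICES. All constants are made up.
PARAM_SIZES = [64, 4, 128, 2]   # tensor sizes, arbitrary units
ACTIONS = ("replicate", "shard")
NUM_DEVICES = 4
MEMORY_BUDGET = 60              # per-device budget, arbitrary units
COMM_COST = 10                  # flat communication penalty per sharded tensor


def cost(strategy):
    """Toy cost model (an assumption): communication plus an over-budget penalty."""
    memory = sum(size / NUM_DEVICES if act == "shard" else size
                 for size, act in zip(PARAM_SIZES, strategy))
    comm = COMM_COST * sum(act == "shard" for act in strategy)
    return comm + 100.0 * max(0.0, memory - MEMORY_BUDGET)


class Node:
    def __init__(self, prefix):
        self.prefix = prefix    # decisions made so far, one per tensor
        self.children = {}      # action -> Node
        self.visits = 0
        self.total_reward = 0.0


def mcts(iterations=500, seed=0, c=1.4):
    """Return (best_cost, best_strategy) found by a plain UCT search."""
    rng = random.Random(seed)
    root = Node(())
    best = (math.inf, None)
    for _ in range(iterations):
        node, path = root, [root]
        # Selection and expansion: descend by UCT until a node has an untried action.
        while len(node.prefix) < len(PARAM_SIZES):
            untried = [a for a in ACTIONS if a not in node.children]
            if untried:
                action = rng.choice(untried)
                node.children[action] = Node(node.prefix + (action,))
                node = node.children[action]
                path.append(node)
                break
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: ch.total_reward / ch.visits
                       + c * math.sqrt(math.log(parent.visits) / ch.visits))
            path.append(node)
        # Rollout: complete the partial strategy with random decisions.
        strategy = list(node.prefix)
        while len(strategy) < len(PARAM_SIZES):
            strategy.append(rng.choice(ACTIONS))
        strategy = tuple(strategy)
        value = cost(strategy)
        if value < best[0]:
            best = (value, strategy)
        # Backpropagation: lower cost means higher reward.
        reward = 1.0 / (1.0 + value)
        for visited in path:
            visited.visits += 1
            visited.total_reward += reward
    return best


if __name__ == "__main__":
    print(mcts())
```

In this toy setting the search only has 16 complete strategies to consider, so it mainly illustrates the select, expand, rollout, and backpropagate loop; a real partitioner would replace `cost` with compiler-derived estimates.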
Related papers
- Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA).
We propose a novel adaptive QA framework that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z)
- PartIR: Composing SPMD Partitioning Strategies for Machine Learning [1.1250231074374903]
We present PartIR, our design for a NN partitioning system.
PartIR is focused on an incremental approach to rewriting and is hardware-and-runtime agnostic.
We evaluate PartIR on several different models to demonstrate its predictability, expressibility, and ability to reach peak performance.
arXiv Detail & Related papers (2024-01-20T10:30:31Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We employ a DETR-based encoder-decoder with conditional queries to significantly reduce the entity label space as well.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods, it also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel
Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- Automap: Towards Ergonomic Automated Parallelism for ML Models [2.469997094590327]
We present the prototype of an automated partitioner that seamlessly integrates into existing compilers and existing user workflows.
Our partitioner enables SPMD-style parallelism that encompasses data parallelism and parameter/activation sharding.
Through a combination of inductive tactics and search in a platform-independent partitioning IR, automap can recover expert partitioning strategies such as Megatron sharding for transformer layers.
arXiv Detail & Related papers (2021-12-06T12:09:38Z)
- DistIR: An Intermediate Representation and Simulator for Efficient
Neural Network Distribution [15.086401550425125]
DistIR is a representation for distributed computation that is tailored for efficient analyses.
We show how DistIR and its simulator enable fast grid searches over complex distribution spaces spanning up to 1000+ configurations.
arXiv Detail & Related papers (2021-11-09T21:32:51Z)
- DHA: End-to-End Joint Optimization of Data Augmentation Policy,
Hyper-parameter and Architecture [81.82173855071312]
We propose an end-to-end solution that integrates the AutoML components and returns a ready-to-use model at the end of the search.
DHA achieves state-of-the-art (SOTA) results on various datasets, notably 77.4% accuracy on ImageNet with a cell-based search space.
arXiv Detail & Related papers (2021-09-13T08:12:50Z)
- Redefining Neural Architecture Search of Heterogeneous Multi-Network
Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize both the variation operators, according to their effect on the complexity and performance of the model; and the models, relying on diverse metrics which estimate the quality of the different parts composing it.
arXiv Detail & Related papers (2021-06-16T17:12:26Z)
- Efficient Data-specific Model Search for Collaborative Filtering [56.60519991956558]
Collaborative filtering (CF) is a fundamental approach for recommender systems.
In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model.
Key here is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages of input encoding, embedding function, interaction and prediction function.
arXiv Detail & Related papers (2021-06-14T14:30:32Z)
- Joint Search of Data Augmentation Policies and Network Architectures [4.887917220146243]
The proposed method combines differentiable methods for augmentation policy search and network architecture search to jointly optimize them in an end-to-end manner.
Experimental results show that our method achieves competitive or superior performance compared to the independently searched results.
arXiv Detail & Related papers (2020-12-17T06:09:44Z)
- Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep
Learning [3.479254848034425]
We present Deep-n-Cheap -- an open-source AutoML framework to search for deep learning models.
Our framework is targeted for deployment on both benchmark and custom datasets.
Deep-n-Cheap includes a user-customizable complexity penalty which trades off performance with training time or number of parameters.
arXiv Detail & Related papers (2020-03-27T13:00:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.