Learning Intermediate Representations using Graph Neural Networks for
NUMA and Prefetchers Optimization
- URL: http://arxiv.org/abs/2203.00611v1
- Date: Tue, 1 Mar 2022 16:51:30 GMT
- Title: Learning Intermediate Representations using Graph Neural Networks for
NUMA and Prefetchers Optimization
- Authors: Ali TehraniJamsaz, Mihail Popov, Akash Dutta, Emmanuelle Saillard, Ali
Jannesari
- Abstract summary: This paper demonstrates how the static Intermediate Representation (IR) of the code can guide NUMA/prefetcher optimizations without the prohibitive cost of performance profiling.
We show that our static intermediate-representation-based model achieves 80% of the performance gains provided by expensive dynamic performance-profiling-based strategies.
- Score: 1.3999481573773074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a large space of NUMA and hardware prefetcher configurations that
can significantly impact the performance of an application. Previous studies
have demonstrated how a model can automatically select configurations based on
the dynamic properties of the code to achieve speedups. This paper demonstrates
how the static Intermediate Representation (IR) of the code can guide
NUMA/prefetcher optimizations without the prohibitive cost of performance
profiling. We propose a method to create a comprehensive dataset that includes
a diverse set of intermediate representations along with optimum
configurations. We then apply a graph neural network model in order to validate
this dataset. We show that our static intermediate-representation-based model
achieves 80% of the performance gains provided by expensive dynamic
performance-profiling-based strategies. We further develop a hybrid model that uses both
static and dynamic information. Our hybrid model achieves the same gains as the
dynamic models but at a reduced cost by only profiling 30% of the programs.
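The listing does not include source code; as a rough illustration of the kind of pipeline the abstract describes, the sketch below classifies a program's IR-derived graph into one of several NUMA/prefetcher configurations with a graph neural network. It assumes PyTorch Geometric; the node features, graph construction, and number of candidate configurations are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): predict a NUMA/prefetcher
# configuration for a program from a graph built over its intermediate
# representation (nodes = IR instructions, edges = control/data flow -- an
# assumed encoding). Requires torch and torch_geometric.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

NUM_NODE_FEATURES = 64   # e.g., embedded IR opcode/type features (assumption)
NUM_CONFIGS = 16         # number of candidate NUMA/prefetcher settings (assumption)

class IRConfigClassifier(torch.nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.conv1 = GCNConv(NUM_NODE_FEATURES, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, NUM_CONFIGS)

    def forward(self, x, edge_index, batch):
        # Two rounds of message passing over the IR graph.
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        # Pool node embeddings into one graph-level embedding per program.
        g = global_mean_pool(h, batch)
        return self.head(g)  # logits over candidate configurations

# Toy usage: a single graph with 5 IR nodes and 4 edges.
x = torch.randn(5, NUM_NODE_FEATURES)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]], dtype=torch.long)
graph = Data(x=x, edge_index=edge_index)
model = IRConfigClassifier()
logits = model(graph.x, graph.edge_index, torch.zeros(5, dtype=torch.long))
predicted_config = logits.argmax(dim=-1)  # index of the chosen configuration
```

In the hybrid setting described above, such static predictions would be combined with dynamic profiling for a selected subset of programs (the abstract reports profiling roughly 30% of them); the selection policy itself is not sketched here.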
Related papers
- Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws [59.03420759554073]
We introduce Adaptive Data Optimization (ADO), an algorithm that optimizes data distributions in an online fashion, concurrently with model training.
ADO does not require external knowledge, proxy models, or modifications to the model update.
ADO uses per-domain scaling laws to estimate the learning potential of each domain during training and adjusts the data mixture accordingly.
arXiv Detail & Related papers (2024-10-15T17:47:44Z) - Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models.
Existing methods for model adaptation usually update all model parameters, which is inefficient because it incurs high computational costs.
In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z) - Simulated Overparameterization [35.12611686956487]
We introduce a novel paradigm called Simulated Overparameterization (SOP).
SOP proposes a unique approach to model training and inference, where a model with a significantly larger number of parameters is trained in such a way that a smaller, efficient subset of these parameters is used for the actual computation during inference.
We present a novel, architecture-agnostic algorithm called "majority kernels", which seamlessly integrates with predominant architectures, including Transformer models.
arXiv Detail & Related papers (2024-02-07T17:07:41Z) - Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical
Volumetric Segmentation [29.082411035685773]
A dynamic architecture network for medical segmentation (i.e., Med-DANet) has achieved a favorable trade-off between accuracy and efficiency.
This paper explores a unified formulation of the dynamic inference framework from the perspective of both the data itself and the model structure.
Our framework improves model efficiency by up to nearly 4.1x and 17.3x with comparable segmentation results on BraTS 2019.
arXiv Detail & Related papers (2023-10-28T09:57:28Z) - Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How [62.467716468917224]
We propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it.
Our method transfers knowledge about the performance of many pretrained models on a series of datasets.
We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset.
arXiv Detail & Related papers (2023-06-06T16:15:26Z) - MILO: Model-Agnostic Subset Selection Framework for Efficient Model
Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z) - Building Resilience to Out-of-Distribution Visual Data via Input
Optimization and Model Finetuning [13.804184845195296]
We propose a preprocessing model that learns to optimise input data for a specific target vision model.
We investigate several out-of-distribution scenarios in the context of semantic segmentation for autonomous vehicles.
We demonstrate that our approach can enable performance on such data comparable to that of a finetuned model.
arXiv Detail & Related papers (2022-11-29T14:06:35Z) - Prototypical Fine-tuning: Towards Robust Performance Under Varying Data
Sizes [47.880781811936345]
We propose a novel framework for fine-tuning pretrained language models (LMs).
Our prototypical fine-tuning approach can automatically adjust the model capacity according to the number of data points and the model's inherent attributes.
arXiv Detail & Related papers (2022-11-24T14:38:08Z) - Meta-Ensemble Parameter Learning [35.6391802164328]
In this paper, we study whether we can utilize a meta-learning strategy to directly predict the parameters of a single model with performance comparable to that of an ensemble.
We introduce WeightFormer, a Transformer-based model that can predict student network weights layer by layer in a forward pass.
arXiv Detail & Related papers (2022-10-05T00:47:24Z) - Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z) - Dynamic Memory Induction Networks for Few-Shot Text Classification [84.88381813651971]
This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification.
The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets, improving the best performance (accuracy) by 2~4%.
arXiv Detail & Related papers (2020-05-12T12:41:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.