Learning Intermediate Representations using Graph Neural Networks for
NUMA and Prefetchers Optimization
- URL: http://arxiv.org/abs/2203.00611v1
- Date: Tue, 1 Mar 2022 16:51:30 GMT
- Title: Learning Intermediate Representations using Graph Neural Networks for
NUMA and Prefetchers Optimization
- Authors: Ali TehraniJamsaz, Mihail Popov, Akash Dutta, Emmanuelle Saillard, Ali
Jannesari
- Abstract summary: This paper demonstrates how the static Intermediate Representation (IR) of the code can guide NUMA/prefetcher optimizations without the prohibitive cost of performance profiling.
We show that our static intermediate-representation-based model achieves 80% of the performance gains provided by expensive dynamic performance-profiling-based strategies.
- Score: 1.3999481573773074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a large space of NUMA and hardware prefetcher configurations that
can significantly impact the performance of an application. Previous studies
have demonstrated how a model can automatically select configurations based on
the dynamic properties of the code to achieve speedups. This paper demonstrates
how the static Intermediate Representation (IR) of the code can guide
NUMA/prefetcher optimizations without the prohibitive cost of performance
profiling. We propose a method to create a comprehensive dataset that includes
a diverse set of intermediate representations along with optimum
configurations. We then apply a graph neural network model in order to validate
this dataset. We show that our static intermediate-representation-based model
achieves 80% of the performance gains provided by expensive dynamic
performance-profiling-based strategies. We further develop a hybrid model that uses both
static and dynamic information. Our hybrid model achieves the same gains as the
dynamic models but at a reduced cost by only profiling 30% of the programs.
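The listing does not include source code; as a rough illustration of the kind of pipeline the abstract describes, the sketch below classifies a program's IR-derived graph into one of several NUMA/prefetcher configurations with a graph neural network. It assumes PyTorch Geometric; the node features, graph construction, and number of candidate configurations are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): predict a NUMA/prefetcher
# configuration for a program from a graph built over its intermediate
# representation (nodes = IR instructions, edges = control/data flow -- an
# assumed encoding). Requires torch and torch_geometric.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

NUM_NODE_FEATURES = 64   # e.g., embedded IR opcode/type features (assumption)
NUM_CONFIGS = 16         # number of candidate NUMA/prefetcher settings (assumption)

class IRConfigClassifier(torch.nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.conv1 = GCNConv(NUM_NODE_FEATURES, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, NUM_CONFIGS)

    def forward(self, x, edge_index, batch):
        # Two rounds of message passing over the IR graph.
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        # Pool node embeddings into one graph-level embedding per program.
        g = global_mean_pool(h, batch)
        return self.head(g)  # logits over candidate configurations

# Toy usage: a single graph with 5 IR nodes and 4 edges.
x = torch.randn(5, NUM_NODE_FEATURES)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]], dtype=torch.long)
graph = Data(x=x, edge_index=edge_index)
model = IRConfigClassifier()
logits = model(graph.x, graph.edge_index, torch.zeros(5, dtype=torch.long))
predicted_config = logits.argmax(dim=-1)  # index of the chosen configuration
```

In the hybrid setting described above, such static predictions would be combined with dynamic profiling for a selected subset of programs (the abstract reports profiling roughly 30% of them); the selection policy itself is not sketched here.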
Related papers
- Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws [59.03420759554073]
We introduce Adaptive Data Optimization (ADO), an algorithm that optimizes data distributions in an online fashion, concurrently with model training.
ADO does not require external knowledge, proxy models, or modifications to the model update.
ADO uses per-domain scaling laws to estimate the learning potential of each domain during training and adjusts the data mixture accordingly.
arXiv Detail & Related papers (2024-10-15T17:47:44Z) - Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models.
Existing methods for model adaptation usually update all model parameters, which is inefficient because it incurs high computational costs.
In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z) - Simulated Overparameterization [35.12611686956487]
We introduce a novel paradigm called Simulated Overparameterization (SOP).
SOP proposes a unique approach to model training and inference, where a model with a significantly larger number of parameters is trained in such a way that a smaller, efficient subset of these parameters is used for the actual computation during inference.
We present a novel, architecture-agnostic algorithm called "majority kernels", which seamlessly integrates with predominant architectures, including Transformer models.
arXiv Detail & Related papers (2024-02-07T17:07:41Z) - Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical
Volumetric Segmentation [29.082411035685773]
A dynamic architecture network for medical segmentation (i.e., Med-DANet) has achieved a favorable trade-off between accuracy and efficiency.
This paper explores a unified formulation of the dynamic inference framework from the perspective of both the data itself and the model structure.
Our framework improves model efficiency by up to nearly 4.1x and 17.3x with comparable segmentation results on BraTS 2019.
arXiv Detail & Related papers (2023-10-28T09:57:28Z) - Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How [62.467716468917224]
We propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it.
Our method transfers knowledge about the performance of many pretrained models on a series of datasets.
We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset.
arXiv Detail & Related papers (2023-06-06T16:15:26Z) - MILO: Model-Agnostic Subset Selection Framework for Efficient Model
Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z) - Building Resilience to Out-of-Distribution Visual Data via Input
Optimization and Model Finetuning [13.804184845195296]
We propose a preprocessing model that learns to optimise input data for a specific target vision model.
We investigate several out-of-distribution scenarios in the context of semantic segmentation for autonomous vehicles.
We demonstrate that our approach can enable performance on such data comparable to that of a finetuned model.
arXiv Detail & Related papers (2022-11-29T14:06:35Z) - Prototypical Fine-tuning: Towards Robust Performance Under Varying Data
Sizes [47.880781811936345]
We propose a novel framework for fine-tuning pretrained language models (LMs).
Our prototypical fine-tuning approach can automatically adjust the model capacity according to the number of data points and the model's inherent attributes.
arXiv Detail & Related papers (2022-11-24T14:38:08Z) - Meta-Ensemble Parameter Learning [35.6391802164328]
In this paper, we study whether we can utilize a meta-learning strategy to directly predict the parameters of a single model with performance comparable to that of an ensemble.
We introduce WeightFormer, a Transformer-based model that can predict student network weights layer by layer in a forward pass.
arXiv Detail & Related papers (2022-10-05T00:47:24Z) - Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z) - Dynamic Memory Induction Networks for Few-Shot Text Classification [84.88381813651971]
This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification.
The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets, improving the best performance (accuracy) by 2~4%.
arXiv Detail & Related papers (2020-05-12T12:41:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.