GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation
- URL: http://arxiv.org/abs/2310.10211v1
- Date: Mon, 16 Oct 2023 09:24:20 GMT
- Title: GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation
- Authors: Jhe-Yu Liou, Stephanie Forrest, Carole-Jean Wu
- Abstract summary: GEVO-ML is a tool for discovering optimization opportunities and tuning the performance of Machine Learning kernels.
We demonstrate GEVO-ML on two different ML workloads for both model training and prediction.
GEVO-ML finds significant improvements for these models, achieving a 90.43% performance improvement when model accuracy is relaxed by 2%.
- Score: 6.525197444717069
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parallel accelerators, such as GPUs, are key enablers for large-scale Machine
Learning (ML) applications. However, ML model developers often lack detailed
knowledge of the underlying system architectures, while system programmers
usually do not have a high-level understanding of the ML model that runs on the
specific system. To mitigate this gap between two relevant aspects of domain
knowledge, this paper proposes GEVO-ML, a tool for automatically discovering
optimization opportunities and tuning the performance of ML kernels, where the
model and training/prediction processes are uniformly represented in a single
intermediate language, the Multiple-Layer Intermediate Representation (MLIR).
GEVO-ML uses multi-objective evolutionary search to find edits (mutations) to
MLIR code that ultimately runs on GPUs, improving performance on desired
criteria while retaining required functionality.
We demonstrate GEVO-ML on two different ML workloads for both model training
and prediction. GEVO-ML finds significant Pareto improvements for these models,
achieving a 90.43% performance improvement when model accuracy is relaxed by 2%,
from 91.2% to 89.3%. For the training workloads, GEVO-ML finds a 4.88%
improvement in model accuracy, from 91% to 96%, without sacrificing training or
testing speed. Our analysis of key GEVO-ML mutations reveals diverse code
modifications that, while they might look foreign to human developers, achieve
effects similar to those of common human model-design improvements, for
example, changing learning rates or pruning non-essential layer parameters.
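To make the search strategy concrete, here is a minimal Python sketch of a two-objective (Pareto-based) evolutionary loop of the kind the abstract describes. It is illustrative only: the candidate encoding and the mutate and evaluate callbacks are hypothetical placeholders, not GEVO-ML's actual MLIR mutation operators or GPU measurement harness.

    # Minimal sketch of a two-objective (runtime, accuracy) Pareto search.
    # The mutate/evaluate callbacks are hypothetical stand-ins for real
    # MLIR edits and GPU measurements; they are not part of GEVO-ML.
    import random
    from typing import Callable, List, Tuple

    Candidate = List[str]          # stand-in for a sequence of IR-level edits
    Fitness = Tuple[float, float]  # (runtime in ms, accuracy in [0, 1])

    def dominates(a: Fitness, b: Fitness) -> bool:
        # a Pareto-dominates b: no worse in either objective, better in at least one
        return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

    def pareto_front(pop: List[Tuple[Candidate, Fitness]]) -> List[Tuple[Candidate, Fitness]]:
        return [p for p in pop
                if not any(dominates(q[1], p[1]) for q in pop if q is not p)]

    def evolve(seed: Candidate,
               mutate: Callable[[Candidate], Candidate],
               evaluate: Callable[[Candidate], Fitness],
               generations: int = 50,
               pop_size: int = 32) -> List[Tuple[Candidate, Fitness]]:
        population = [(seed, evaluate(seed))]
        for _ in range(generations):
            survivors = pareto_front(population)           # keep non-dominated edits
            children = list(survivors)
            while len(children) < pop_size:
                parent, _ = random.choice(survivors)
                child = mutate(parent)                     # apply a random code edit
                children.append((child, evaluate(child)))  # measure runtime and accuracy
            population = children
        return pareto_front(population)

In GEVO-ML's setting the two objectives correspond to kernel runtime and model accuracy, which is why the tool can trade a small accuracy relaxation for the large speedups reported in the abstract.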
Related papers
- Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance [78.48606021719206]
Mini-InternVL is a series of MLLMs with parameters ranging from 1B to 4B, which achieves 90% of the performance with only 5% of the parameters.
We develop a unified adaptation framework for Mini-InternVL, which enables our models to transfer and outperform specialized models in downstream tasks.
arXiv Detail & Related papers (2024-10-21T17:58:20Z)
- EMMA: Efficient Visual Alignment in Multi-Modal LLMs [56.03417732498859]
EMMA is a lightweight cross-modality module designed to efficiently fuse visual and textual encodings.
EMMA boosts performance across multiple tasks by up to 9.3% while significantly improving robustness against hallucinations.
arXiv Detail & Related papers (2024-10-02T23:00:31Z)
- CubicML: Automated ML for Large ML Systems Co-design with ML Prediction of Performance [7.425372356516303]
Scaling up deep learning models has proven effective at improving the intelligence of machine learning (ML) models.
In this paper, we propose CubicML which uses ML to automatically optimize training performance of large distributed ML systems.
We show that CubicML can effectively optimize the training speed of in-house ads recommendation models with 73 billion parameters and of large language models with up to 405 billion parameters at Meta.
arXiv Detail & Related papers (2024-09-06T19:55:21Z)
- LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation [41.05687297326706]
LLaVA-MoD is a framework designed to enable the efficient training of small-scale Multimodal Language Models.
We optimize the network structure of s-MLLM by integrating a sparse Mixture of Experts architecture into the language model.
We also propose a progressive knowledge transfer strategy to ensure comprehensive knowledge migration.
arXiv Detail & Related papers (2024-08-28T15:52:23Z)
- Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML).
VML constrains the parameter space to be human-interpretable natural language.
We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM [62.30753425449056]
We propose a novel closed-loop system that bridges data generation, model training, and evaluation.
Within each loop, the MLLM-DataEngine first analyzes the weaknesses of the model based on the evaluation results.
For targeting, we propose an Adaptive Bad-case Sampling module, which adjusts the ratio of different types of data.
For quality, we resort to GPT-4 to generate high-quality data with each given data type.
arXiv Detail & Related papers (2023-08-25T01:41:04Z)
- An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study toward a novel goal: merging the vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis [84.12658971655253]
We propose Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks.
The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations.
In our ablations, we see that this approach leads to efficient models that can outperform their fine-tuned counterparts and are robust to input noise.
arXiv Detail & Related papers (2022-12-01T17:31:42Z)
- MLGOPerf: An ML Guided Inliner to Optimize Performance [7.314201117946244]
This paper presents the first end-to-end framework capable of optimizing performance using LLVM's ML-Inliner.
It employs a secondary ML model to generate rewards used for training a retargeted reinforcement learning agent.
It does so by predicting the post-inlining speedup of a function under analysis, which enables a fast training framework for the primary model.
arXiv Detail & Related papers (2022-07-18T05:47:29Z)
- Complementary Ensemble Learning [1.90365714903665]
We derive a technique to improve the performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models which are able to complement state-of-the-art model uncertainty.
arXiv Detail & Related papers (2021-11-09T03:23:05Z)
- Robusta: Robust AutoML for Feature Selection via Reinforcement Learning [24.24652530951966]
We propose the first robust AutoML framework, Robusta, based on reinforcement learning (RL).
We show that the framework is able to improve the model robustness by up to 22% while maintaining competitive accuracy on benign samples.
arXiv Detail & Related papers (2021-01-15T03:12:29Z)