Designing Machine Learning Pipeline Toolkit for AutoML Surrogate
Modeling Optimization
- URL: http://arxiv.org/abs/2107.01253v1
- Date: Fri, 2 Jul 2021 20:06:40 GMT
- Title: Designing Machine Learning Pipeline Toolkit for AutoML Surrogate
Modeling Optimization
- Authors: Paulito P. Palmes, Akihiro Kishimoto, Radu Marinescu, Parikshit Ram,
Elizabeth Daly
- Abstract summary: We create the AMLP toolkit which facilitates the creation and evaluation of complex machine learning pipeline structures.
We use AMLP to find optimal pipeline signatures, datamine them, and use these datamined features to speed-up learning and prediction.
We formulated a two-stage pipeline optimization with surrogate modeling in AMLP which outperforms other AutoML approaches with a 4-hour time budget in less than 5 minutes of AMLP computation time.
- Score: 18.82755278152806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The pipeline optimization problem in machine learning requires simultaneous
optimization of pipeline structures and parameter adaptation of their elements.
Having an elegant way to express these structures can help lessen the
complexity in the management and analysis of their performances together with
the different choices of optimization strategies. With these issues in mind, we
created the AMLP toolkit which facilitates the creation and evaluation of
complex machine learning pipeline structures using simple expressions. We use
AMLP to find optimal pipeline signatures, datamine them, and use these
datamined features to speed-up learning and prediction. We formulated a
two-stage pipeline optimization with surrogate modeling in AMLP which
outperforms other AutoML approaches with a 4-hour time budget in less than 5
minutes of AMLP computation time.
Related papers
- LLMs for Cold-Start Cutting Plane Separator Configuration [19.931643536607737]
Mixed integer linear programming solvers ship with a staggering number of parameters that are challenging to select a priori for all but expert optimization users.
Existing machine learning approaches to configure solvers require training ML models by solving thousands of related MILP instances, generalize poorly to new problem sizes, and often require implementing complex ML pipelines and custom solver interfaces.
We present a new LLM-based framework to configure which cutting plane separators to use for a given MILP problem with little to no training data based on characteristics of the instance.
arXiv Detail & Related papers (2024-12-16T18:03:57Z) - LLM-based Optimization of Compound AI Systems: A Survey [64.39860384538338]
In a compound AI system, components such as an LLM call, a retriever, a code interpreter, or tools are interconnected.
Recent advancements enable end-to-end optimization of these parameters using an LLM.
This paper presents a survey of the principles and emerging trends in LLM-based optimization of compound AI systems.
arXiv Detail & Related papers (2024-10-21T18:06:25Z) - AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based and high-level LLMs can effectively collaborate a combined optimization framework.
In this paper, we show that these complementary to each other and can effectively collaborate a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z) - eTOP: Early Termination of Pipelines for Faster Training of AutoML
Systems [12.933957727351666]
Finding the right AI/ML model is a complex and costly process.
We propose eTOP Framework which works on top of any AutoML system.
arXiv Detail & Related papers (2023-04-17T20:22:30Z) - AutoEn: An AutoML method based on ensembles of predefined Machine
Learning pipelines for supervised Traffic Forecasting [1.6242924916178283]
Traffic Forecasting (TF) is gaining relevance due to its ability to mitigate traffic congestion by forecasting future traffic states.
TF poses one big challenge to the Machine Learning paradigm, known as the Model Selection Problem (MSP)
We introduce AutoEn, which is a simple and efficient method for automatically generating multi-classifier ensembles from a predefined set of ML pipelines.
arXiv Detail & Related papers (2023-03-19T18:37:18Z) - VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatiles.
We train an ingest for deep learning which is itself a small neural network that ingests and outputs parameter updates.
We open source our learned, meta-training code, the associated train test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z) - Automated Evolutionary Approach for the Design of Composite Machine
Learning Pipelines [48.7576911714538]
The proposed approach is aimed to automate the design of composite machine learning pipelines.
It designs the pipelines with a customizable graph-based structure, analyzes the obtained results, and reproduces them.
The software implementation on this approach is presented as an open-source framework.
arXiv Detail & Related papers (2021-06-26T23:19:06Z) - Incremental Search Space Construction for Machine Learning Pipeline
Synthesis [4.060731229044571]
Automated machine learning (AutoML) aims for constructing machine learning (ML) pipelines automatically.
We propose a data-centric approach based on meta-features for pipeline construction.
We prove the effectiveness and competitiveness of our approach on 28 data sets used in well-established AutoML benchmarks.
arXiv Detail & Related papers (2021-01-26T17:17:49Z) - AutoWeka4MCPS-AVATAR: Accelerating Automated Machine Learning Pipeline
Composition and Optimisation [13.116806430326513]
We propose a novel method to evaluate the validity of ML pipelines, without their execution, using a surrogate model (AVATAR)
The AVATAR generates a knowledge base by automatically learning the capabilities and effects of ML algorithms on datasets' characteristics.
Instead of executing the original ML pipeline to evaluate its validity, the AVATAR evaluates its surrogate model constructed by capabilities and effects of the ML pipeline components.
arXiv Detail & Related papers (2020-11-21T14:05:49Z) - Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and
Robust AutoDL [53.40030379661183]
Auto-PyTorch is a framework to enable fully automated deep learning (AutoDL)
It combines multi-fidelity optimization with portfolio construction for warmstarting and ensembling of deep neural networks (DNNs)
We show that Auto-PyTorch performs better than several state-of-the-art competitors on average.
arXiv Detail & Related papers (2020-06-24T15:15:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.