Benchmarking AutoML algorithms on a collection of binary problems
- URL: http://arxiv.org/abs/2212.02704v1
- Date: Tue, 6 Dec 2022 01:53:50 GMT
- Title: Benchmarking AutoML algorithms on a collection of binary problems
- Authors: Pedro Henrique Ribeiro, Patryk Orzechowski, Joost Wagenaar, and Jason H. Moore
- Abstract summary: This paper compares the performance of four different AutoML algorithms: Tree-based Pipeline Optimization Tool (TPOT), Auto-Sklearn, Auto-Sklearn 2, and H2O AutoML.
We confirm that AutoML can identify pipelines that perform well on all included datasets.
- Score: 3.3793659640122717
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automated machine learning (AutoML) algorithms have grown in popularity due
to their high performance and flexibility to adapt to different problems and
data sets. With the increasing number of AutoML algorithms, deciding which
would best suit a given problem becomes increasingly more work. Therefore, it
is essential to use complex and challenging benchmarks which would be able to
differentiate the AutoML algorithms from each other. This paper compares the
performance of four different AutoML algorithms: Tree-based Pipeline
Optimization Tool (TPOT), Auto-Sklearn, Auto-Sklearn 2, and H2O AutoML. We use
the Diverse and Generative ML benchmark (DIGEN), a diverse set of synthetic
datasets derived from generative functions designed to highlight the strengths
and weaknesses of the performance of common machine learning algorithms. We
confirm that AutoML can identify pipelines that perform well on all included
datasets. Most AutoML algorithms performed similarly without much room for
improvement; however, some were more consistent than others at finding
high-performing solutions for some datasets.
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems [59.72548591120689]
We introduce a new benchmark, SearchBench, containing 11 unique search problem types.
We show that even the most advanced LLMs fail to solve these problems end-to-end in text.
Instructing LLMs to generate code that solves the problem helps, but only slightly, e.g., GPT4's performance rises to 11.7%.
arXiv Detail & Related papers (2024-06-18T00:44:58Z)
- Automated Machine Learning for Multi-Label Classification [3.2634122554914002]
We devise a novel AutoML approach for single-label classification tasks, consisting of two algorithms at most.
We investigate how well AutoML approaches that form the state of the art for single-label classification tasks scale with the increased problem complexity of AutoML for multi-label classification.
arXiv Detail & Related papers (2024-02-28T09:40:36Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML [2.06188179769701]
We present Sequential Hyper-Parameter Space Reduction (SHSR), an algorithm that reduces the hyper-parameter space for an AutoML tool with a negligible drop in its predictive performance.
SHSR is evaluated on 284 classification and 375 regression problems, showing an approximate 30% reduction in execution time with a performance drop of less than 0.1%.
arXiv Detail & Related papers (2023-12-11T11:26:43Z)
- Naive Automated Machine Learning -- A Late Baseline for AutoML [0.0]
Automated Machine Learning (AutoML) is the problem of automatically finding the pipeline with the best generalization performance on some given dataset.
We present Naive AutoML, a very simple solution to AutoML that exploits important meta-knowledge about machine learning problems.
arXiv Detail & Related papers (2021-03-18T19:52:12Z)
- An Extensive Experimental Evaluation of Automated Machine Learning Methods for Recommending Classification Algorithms (Extended Version) [4.400989370979334]
Of the four AutoML methods evaluated, three are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method.
We performed controlled experiments where these four AutoML methods were given the same runtime limit for different values of this limit.
In general, the difference in predictive accuracy of the three best AutoML methods was not statistically significant.
arXiv Detail & Related papers (2020-09-16T02:36:43Z)
- Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning [45.643809726832764]
We introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge.
We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits.
We also propose a solution towards truly hands-free AutoML.
arXiv Detail & Related papers (2020-07-08T12:41:03Z)
- Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL [53.40030379661183]
Auto-PyTorch is a framework to enable fully automated deep learning (AutoDL).
It combines multi-fidelity optimization with portfolio construction for warmstarting and ensembling of deep neural networks (DNNs).
We show that Auto-PyTorch performs better than several state-of-the-art competitors on average.
arXiv Detail & Related papers (2020-06-24T15:15:17Z)
- Is deep learning necessary for simple classification tasks? [3.3793659640122717]
Automated machine learning (AutoML) and deep learning (DL) are two cutting-edge paradigms used to solve inductive learning tasks.
We compare AutoML and DL in the context of binary classification on 6 well-characterized public datasets.
We also evaluate a new tool for genetic programming-based AutoML that incorporates deep estimators.
arXiv Detail & Related papers (2020-06-11T18:41:47Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)