AMLB: an AutoML Benchmark
- URL: http://arxiv.org/abs/2207.12560v2
- Date: Thu, 16 Nov 2023 14:12:10 GMT
- Title: AMLB: an AutoML Benchmark
- Authors: Pieter Gijsbers, Marcos L. P. Bueno, Stefan Coors, Erin LeDell,
Sébastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren
- Abstract summary: We conduct a thorough comparison of 9 well-known AutoML frameworks across 71 classification and 33 regression tasks.
The benchmark comes with an open-source tool that integrates with many AutoML frameworks and automates the empirical evaluation process end-to-end.
- Score: 9.642136611591578
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Comparing different AutoML frameworks is notoriously challenging and often
done incorrectly. We introduce an open and extensible benchmark that follows
best practices and avoids common mistakes when comparing AutoML frameworks. We
conduct a thorough comparison of 9 well-known AutoML frameworks across 71
classification and 33 regression tasks. The differences between the AutoML
frameworks are explored with a multi-faceted analysis, evaluating model
accuracy, its trade-offs with inference time, and framework failures. We also
use Bradley-Terry trees to discover subsets of tasks where the relative AutoML
framework rankings differ. The benchmark comes with an open-source tool that
integrates with many AutoML frameworks and automates the empirical evaluation
process end-to-end: from framework installation and resource allocation to
in-depth evaluation. The benchmark uses public data sets, can be easily
extended with other AutoML frameworks and tasks, and has a website with
up-to-date results.
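The abstract mentions Bradley-Terry trees for finding task subsets where framework rankings differ. At the core of that analysis is the Bradley-Terry model, which turns pairwise win counts into per-item strengths. A minimal sketch of that fitting step using the classic minorization-maximization (MM) update; the frameworks and win counts below are made up for illustration, not taken from the paper:

```python
def bradley_terry(wins, iters=200):
    """Estimate Bradley-Terry strengths from a pairwise win-count matrix.

    wins[i][j] = number of comparisons (e.g. per-task rankings) in which
    item i beat item j. Uses the MM update: each item's strength is its
    total wins divided by a sum weighted by current strengths.
    """
    n = len(wins)
    games = [[wins[i][j] + wins[j][i] for j in range(n)] for i in range(n)]
    total_wins = [sum(wins[i]) for i in range(n)]
    p = [1.0] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            denom = sum(games[i][j] / (p[i] + p[j])
                        for j in range(n) if j != i)
            new_p.append(total_wins[i] / denom if denom else p[i])
        s = sum(new_p)
        p = [x / s for x in new_p]  # fix the scale: strengths sum to 1
    return p

# Hypothetical head-to-head wins among three frameworks A, B, C
wins = [
    [0, 8, 9],  # A beat B 8 times, C 9 times
    [2, 0, 7],  # B beat A 2 times, C 7 times
    [1, 3, 0],  # C beat A once, B 3 times
]
strengths = bradley_terry(wins)
```

A Bradley-Terry *tree* then recursively splits tasks on task features (dataset size, class imbalance, etc.) wherever the fitted strengths differ significantly between the two sides of the split.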
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - Task Me Anything [72.810309406219]
This paper presents a benchmark-generation engine that produces a benchmark tailored to a user's needs.
It contains 113K images, 10K videos, 2K 3D object assets, over 365 object categories, 655 attributes, and 335 relationships.
It can generate 750M image/video question-answering pairs, which focus on evaluating perceptual capabilities.
arXiv Detail & Related papers (2024-06-17T17:32:42Z) - Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z) - Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation [51.99752147380505]
This paper presents a benchmark self-evolving framework to dynamically evaluate Large Language Models (LLMs).
We utilize a multi-agent system to manipulate the context or question of original instances, reframing new evolving instances with high confidence.
Our framework widens performance discrepancies both between different models and within the same model across various tasks.
arXiv Detail & Related papers (2024-02-18T03:40:06Z) - InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models [50.03163753638256]
Multi-modal Large Language Models (MLLMs) are increasingly prominent in the field of artificial intelligence.
Our benchmark comprises three key reasoning categories: deductive, abductive, and analogical reasoning.
We evaluate a selection of representative MLLMs using this rigorously developed open-ended multi-step elaborate reasoning benchmark.
arXiv Detail & Related papers (2023-11-20T07:06:31Z) - Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms [1.4469725791865982]
This work describes the selection approach and analysis of existing AutoML frameworks regarding their capability of incorporating Quantum Machine Learning (QML) algorithms.
For that, available open-source tools are condensed into a market overview and suitable frameworks are systematically selected using a multi-phase, multi-criteria approach.
We build an extended Automated Quantum Machine Learning (AutoQML) framework with QC-specific pipeline steps and decision characteristics for hardware and software constraints.
arXiv Detail & Related papers (2023-10-06T13:21:16Z) - Automatic Componentwise Boosting: An Interpretable AutoML System [1.1709030738577393]
We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm.
Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions.
Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets.
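The componentwise boosting idea behind that system can be sketched compactly: each round fits a one-feature least-squares base learner to the current residuals and adds only the best one, shrunk by a learning rate, so the final model stays an interpretable additive sum of per-feature effects. A minimal illustration (not the paper's actual implementation; the toy data is invented):

```python
def componentwise_boost(X, y, rounds=2000, lr=0.1):
    """Componentwise L2 boosting on centered features.

    Each round: fit a simple through-origin least-squares slope for every
    feature against the residuals, keep the single best-fitting feature,
    and add lr * slope to that feature's coefficient.
    """
    n, d = len(X), len(X[0])
    mx = [sum(row[j] for row in X) / n for j in range(d)]  # feature means
    my = sum(y) / n                                        # intercept
    Xc = [[row[j] - mx[j] for j in range(d)] for row in X]
    coef = [0.0] * d
    resid = [yi - my for yi in y]
    for _ in range(rounds):
        best = None
        for j in range(d):
            xj = [row[j] for row in Xc]
            sxx = sum(x * x for x in xj)
            if sxx == 0:
                continue
            b = sum(x * r for x, r in zip(xj, resid)) / sxx  # LS slope
            sse = sum((r - b * x) ** 2 for r, x in zip(resid, xj))
            if best is None or sse < best[0]:
                best = (sse, j, b)
        _, j, b = best
        coef[j] += lr * b
        resid = [r - lr * b * row[j] for r, row in zip(resid, Xc)]
    return my, mx, coef

def predict(model, row):
    my, mx, coef = model
    return my + sum(c * (x - m) for c, x, m in zip(coef, row, mx))

# Toy data: y depends only on the first two of three features
X = [[1, 0, 1], [0, 1, 1], [2, 1, 0], [1, 2, 1],
     [3, 0, 0], [0, 3, 1], [2, 2, 0], [1, 1, 1]]
y = [2 * a - b for a, b, _ in X]
model = componentwise_boost(X, y)
```

Because only one coefficient moves per round, features that never win a round keep a coefficient of exactly zero, which gives the built-in variable selection the paper's interpretability tooling relies on.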
arXiv Detail & Related papers (2021-09-12T18:34:33Z) - Comparison of Automated Machine Learning Tools for SMS Spam Message Filtering [0.0]
Short Message Service (SMS) is a popular service used for communication by mobile users.
This work compares the classification performance of three automated machine learning (AutoML) tools for SMS spam message filtering.
Experimental results showed that ensemble models achieved the best classification performance.
arXiv Detail & Related papers (2021-06-16T10:16:07Z) - Can AutoML outperform humans? An evaluation on popular OpenML datasets using AutoML Benchmark [0.05156484100374058]
This paper compares four AutoML frameworks on 12 different popular datasets from OpenML.
Results show that the automated frameworks perform better than or on par with the machine learning community in 7 out of 12 OpenML tasks.
arXiv Detail & Related papers (2020-09-03T10:25:34Z) - Is deep learning necessary for simple classification tasks? [3.3793659640122717]
Automated machine learning (AutoML) and deep learning (DL) are two cutting-edge paradigms used to solve inductive learning tasks.
We compare AutoML and DL in the context of binary classification on 6 well-characterized public datasets.
We also evaluate a new tool for genetic programming-based AutoML that incorporates deep estimators.
arXiv Detail & Related papers (2020-06-11T18:41:47Z) - AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate than competing frameworks.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.