An Extensive Experimental Evaluation of Automated Machine Learning
Methods for Recommending Classification Algorithms (Extended Version)
- URL: http://arxiv.org/abs/2009.07430v1
- Date: Wed, 16 Sep 2020 02:36:43 GMT
- Title: An Extensive Experimental Evaluation of Automated Machine Learning
Methods for Recommending Classification Algorithms (Extended Version)
- Authors: Márcio P. Basgalupp, Rodrigo C. Barros, Alex G. C. de Sá, Gisele
L. Pappa, Rafael G. Mantovani, André C. P. L. F. de Carvalho, Alex A.
Freitas
- Abstract summary: Three of these methods are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method.
We performed controlled experiments where these four AutoML methods were given the same runtime limit for different values of this limit.
In general, the difference in predictive accuracy of the three best AutoML methods was not statistically significant.
- Score: 4.400989370979334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an experimental comparison among four Automated Machine
Learning (AutoML) methods for recommending the best classification algorithm
for a given input dataset. Three of these methods are based on Evolutionary
Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method based
on the Combined Algorithm Selection and Hyper-parameter optimisation (CASH)
approach. The EA-based methods build classification algorithms from a single
machine learning paradigm: either decision-tree induction, rule induction, or
Bayesian network classification. Auto-WEKA combines algorithm selection and
hyper-parameter optimisation to recommend classification algorithms from
multiple paradigms. We performed controlled experiments where these four AutoML
methods were given the same runtime limit for different values of this limit.
In general, the difference in predictive accuracy of the three best AutoML
methods was not statistically significant. However, the EA that evolves
decision-tree induction algorithms has the advantage of producing algorithms
whose classification models are interpretable and which scale better to large
datasets than many of the algorithms from other learning paradigms that
Auto-WEKA can recommend. We also observed that Auto-WEKA exhibited
meta-overfitting, a form of overfitting at the meta-learning level rather than
at the base-learning level.
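Under the CASH formulation, the choice of learning algorithm is treated as one more categorical hyper-parameter, so algorithm selection and hyper-parameter tuning become a single search over a joint space within a runtime budget. The sketch below is a minimal, illustrative version of that idea using a plain random search over a few scikit-learn classifiers with a wall-clock limit; the candidate algorithms, grids, and budget are assumptions chosen for the example, and Auto-WEKA's actual optimiser (SMAC over WEKA's algorithm and hyper-parameter space) is not reproduced here.

```python
# Minimal CASH-style sketch: random search over (algorithm, hyper-parameters)
# under a fixed runtime budget. Illustrative only -- Auto-WEKA itself uses
# Bayesian optimisation (SMAC) over WEKA's algorithm/hyper-parameter space.
import random
import time

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Joint search space: each entry pairs an algorithm with a hyper-parameter grid.
SEARCH_SPACE = [
    (DecisionTreeClassifier, {"max_depth": [3, 5, 10, None],
                              "min_samples_leaf": [1, 5, 20]}),
    (RandomForestClassifier, {"n_estimators": [50, 100, 200],
                              "max_depth": [5, 10, None]}),
    (GaussianNB, {"var_smoothing": [1e-9, 1e-7, 1e-5]}),
]

def cash_random_search(X, y, time_budget_s=60, seed=0):
    """Sample (algorithm, configuration) pairs until the time budget runs out
    and return the pair with the best cross-validated accuracy."""
    rng = random.Random(seed)
    best = (None, None, -1.0)
    start = time.time()
    while time.time() - start < time_budget_s:  # budget is checked per iteration
        algo, grid = rng.choice(SEARCH_SPACE)
        config = {name: rng.choice(values) for name, values in grid.items()}
        score = cross_val_score(algo(**config), X, y, cv=3).mean()
        if score > best[2]:
            best = (algo.__name__, config, score)
    return best

if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    print(cash_random_search(X, y, time_budget_s=30))
```

In such a loop, meta-overfitting would show up as the selected configuration scoring noticeably better on the cross-validation folds used during the search than on data held out from the entire search.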
Related papers
- A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
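To give a flavour of how a k-center objective can be mixed with an uncertainty-sampling signal, the sketch below runs the classic greedy k-center heuristic and adds a weighted uncertainty term to each candidate's score. The weighting scheme, the `alpha` parameter, and the function name are illustrative assumptions; this is not the factor 3-approximation algorithm of the cited paper.

```python
# Illustrative greedy selection mixing a k-center coverage term with an
# uncertainty term. The exact combination in the cited paper differs; this
# only sketches the general flavour of such weighted objectives.
import numpy as np

def weighted_kcenter_select(X, uncertainty, k, alpha=0.5):
    """Greedily pick k points, scoring each candidate by a weighted sum of
    its distance to the current selection (coverage) and its uncertainty."""
    selected = [int(np.argmax(uncertainty))]            # start from the most uncertain point
    dist = np.linalg.norm(X - X[selected[0]], axis=1)   # distance to nearest selected point
    for _ in range(k - 1):
        score = alpha * dist + (1 - alpha) * uncertainty
        score[selected] = -np.inf                        # never re-pick a selected point
        nxt = int(np.argmax(score))
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected

# Example: 200 random points with random "uncertainty" scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
u = rng.random(200)
print(weighted_kcenter_select(X, u, k=10))
```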
arXiv Detail & Related papers (2023-12-17T04:41:07Z)
- Benchmarking AutoML algorithms on a collection of binary problems [3.3793659640122717]
This paper compares the performance of four different AutoML algorithms: Tree-based Pipeline Optimization Tool (TPOT), Auto-Sklearn, Auto-Sklearn 2, and H2O AutoML.
We confirm that AutoML can identify pipelines that perform well on all included datasets.
arXiv Detail & Related papers (2022-12-06T01:53:50Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique for tackling imbalanced learning by generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
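As background for this entry, the basic SMOTE-style over-sampling idea is to synthesise minority-class samples by interpolating between a minority point and one of its minority-class nearest neighbours. The snippet below is a bare-bones sketch of that interpolation only; AutoSMOTE's jointly optimised, hierarchical sampling decisions are not modelled here, and the function name and parameters are assumptions for illustration.

```python
# Bare-bones SMOTE-style interpolation for the minority class. This shows only
# the basic synthetic-sample generation idea, not AutoSMOTE's learned,
# hierarchical sampling policy.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation between
    a randomly chosen minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)  # +1: a point is its own neighbour
    _, idx = nn.kneighbors(X_min)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]                # skip the point itself at position 0
        lam = rng.random()
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)

# Example: 30 minority points in 4 dimensions, 60 synthetic samples.
X_min = np.random.default_rng(1).normal(size=(30, 4))
print(smote_like_oversample(X_min, n_new=60).shape)  # (60, 4)
```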
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Iterative Linear Quadratic Optimization for Nonlinear Control: Differentiable Programming Algorithmic Templates [9.711326718689495]
We present the implementation of nonlinear control algorithms based on linear and quadratic approximations of the objective from a functional viewpoint.
We derive the computational complexities of all algorithms in a differentiable programming framework and present sufficient optimality conditions.
The algorithms are coded in a differentiable programming language in a publicly available package.
arXiv Detail & Related papers (2022-07-13T17:10:47Z)
- Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data.
We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z)
- Algorithm Selection on a Meta Level [58.720142291102135]
We introduce the problem of meta algorithm selection, which essentially asks for the best way to combine a given set of algorithm selectors.
We present a general methodological framework for meta algorithm selection as well as several concrete learning methods as instantiations of this framework.
arXiv Detail & Related papers (2021-07-20T11:23:21Z)
- Hybrid Method Based on NARX models and Machine Learning for Pattern Recognition [0.0]
This work presents a novel technique that integrates the methodologies of machine learning and system identification to solve multiclass problems.
The efficiency of the method was tested on case studies commonly investigated in machine learning, obtaining better absolute results than classical classification algorithms.
arXiv Detail & Related papers (2021-06-08T00:17:36Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms that obtain good generalization performance on other classical control tasks, gridworld-type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- A Robust Experimental Evaluation of Automated Multi-Label Classification Methods [0.735996217853436]
This paper addresses AutoML for multi-label classification (MLC) problems.
In MLC, each example can be simultaneously associated with several class labels.
Overall, we observe that the most prominent method is the one based on a canonical grammar-based genetic programming (GGP) search.
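To make the MLC setting concrete: the target is a binary matrix with one column per label, and a common baseline (binary relevance) fits one classifier per label. The snippet below shows that baseline with scikit-learn on synthetic data; it is only a reference point, not the grammar-based genetic programming method evaluated in the cited paper.

```python
# Binary-relevance baseline for multi-label classification: Y is an
# (n_samples, n_labels) 0/1 matrix and each label gets its own classifier.
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=500, n_classes=5, n_labels=3,
                                       random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

clf = MultiOutputClassifier(RandomForestClassifier(n_estimators=100, random_state=0))
clf.fit(X_tr, Y_tr)
print("micro-F1:", f1_score(Y_te, clf.predict(X_te), average="micro"))
```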
arXiv Detail & Related papers (2020-05-16T20:08:04Z)
- Unsupervised and Supervised Learning with the Random Forest Algorithm for Traffic Scenario Clustering and Classification [4.169845583045265]
The goal of this paper is to provide a method that can automatically find categories of traffic scenarios.
The architecture consists of three main components: a microscopic traffic simulation, a clustering technique, and a classification technique for the operational phase.
arXiv Detail & Related papers (2020-04-05T08:26:29Z)