Is deep learning necessary for simple classification tasks?
- URL: http://arxiv.org/abs/2006.06730v1
- Date: Thu, 11 Jun 2020 18:41:47 GMT
- Title: Is deep learning necessary for simple classification tasks?
- Authors: Joseph D. Romano, Trang T. Le, Weixuan Fu, and Jason H. Moore
- Abstract summary: Automated machine learning (AutoML) and deep learning (DL) are two cutting-edge paradigms used to solve inductive learning tasks.
We compare AutoML and DL in the context of binary classification on 6 well-characterized public datasets.
We also evaluate a new tool for genetic programming-based AutoML that incorporates deep estimators.
- Score: 3.3793659640122717
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Automated machine learning (AutoML) and deep learning (DL) are two
cutting-edge paradigms used to solve a myriad of inductive learning tasks. In
spite of their successes, little guidance exists for when to choose one
approach over the other in the context of specific real-world problems.
Furthermore, relatively few tools exist that allow the integration of both
AutoML and DL in the same analysis to yield results combining both of their
strengths. Here, we seek to address both of these issues by (1) providing a
head-to-head comparison of AutoML and DL in the context of binary
classification on 6 well-characterized public datasets, and (2) evaluating a
new tool for genetic programming-based AutoML that incorporates deep
estimators. Our observations suggest that AutoML outperforms simple DL
classifiers when trained on similar datasets for binary classification, but
that integrating DL into AutoML improves classification performance even further.
However, the substantial time needed to train AutoML+DL pipelines will likely
outweigh performance advantages in many applications.
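The genetic-programming idea behind such AutoML tools can be illustrated with a deliberately minimal sketch (a toy, not the paper's actual tool; the pipeline representation and data below are hypothetical): candidate "pipelines" are scored on the data, the fittest half is kept, and mutated copies fill the next generation.

```python
import random

# Toy illustration of genetic-programming-style AutoML (not the paper's
# tool): a "pipeline" is just a (scale, threshold) pair defining a 1-D
# binary classifier, evolved by elitist selection and random mutation.

def accuracy(pipeline, data):
    scale, threshold = pipeline
    return sum((scale * x > threshold) == y for x, y in data) / len(data)

def evolve(data, generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    pop = [(rng.uniform(0.5, 2.0), rng.uniform(-1.0, 1.0))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: accuracy(p, data), reverse=True)
        parents = pop[: pop_size // 2]            # selection (elitist)
        children = [(s * rng.uniform(0.9, 1.1),   # mutate scale
                     t + rng.gauss(0.0, 0.05))    # mutate threshold
                    for s, t in parents]
        pop = parents + children
    return max(pop, key=lambda p: accuracy(p, data))

# Hypothetical 1-D data: the label is True exactly when x > 0.3.
data = [(x / 10, x / 10 > 0.3) for x in range(-10, 11)]
best = evolve(data)
```

Real genetic-programming AutoML systems such as TPOT evolve full scikit-learn pipelines (preprocessors, feature selectors, estimators) and score them by cross-validation, but the select-and-mutate loop has the same shape.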
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation [56.75665429851673]
This paper introduces a novel instruction curation algorithm, derived from two unique perspectives, human and LLM preference alignment.
Experiments demonstrate that we can maintain or even improve model performance by compressing synthetic multimodal instructions by up to 90%.
arXiv Detail & Related papers (2024-09-27T08:20:59Z)
- Automated Machine Learning for Multi-Label Classification [3.2634122554914002]
We devise a novel AutoML approach for single-label classification tasks, consisting of at most two algorithms.
We investigate how well AutoML approaches that form the state of the art for single-label classification tasks scale with the increased problem complexity of AutoML for multi-label classification.
arXiv Detail & Related papers (2024-02-28T09:40:36Z)
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
- Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning, and contrastive learning against reference machine learning (ML) tree-based and monolithic DL models.
We show that (i) large datasets yield more general representations and (ii) contrastive learning is the best-performing methodology.
While tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, approach tree-based model performance even on small tasks.
arXiv Detail & Related papers (2023-05-21T11:20:49Z)
- Benchmarking AutoML algorithms on a collection of binary problems [3.3793659640122717]
This paper compares the performance of four different AutoML algorithms: Tree-based Pipeline Optimization Tool (TPOT), Auto-Sklearn, Auto-Sklearn 2, and H2O AutoML.
We confirm that AutoML can identify pipelines that perform well on all included datasets.
arXiv Detail & Related papers (2022-12-06T01:53:50Z)
- Automatic Componentwise Boosting: An Interpretable AutoML System [1.1709030738577393]
We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm.
Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions.
Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets.
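The componentwise boosting idea can be sketched in a few lines (an illustrative toy, not the proposed system; the data below are made up): each iteration fits a least-squares coefficient for every single feature on the current residual and greedily updates only the best one, so the result is an additive model whose per-feature effects are directly readable.

```python
# Toy sketch of componentwise boosting for regression (illustrative,
# not the proposed system): each step fits one coefficient per feature
# on the residual and applies a shrunken update to the single best one.

def fit_componentwise(X, y, steps=200, lr=0.1):
    d = len(X[0])
    residual = list(y)
    coefs = [0.0] * d
    for _ in range(steps):
        best_j, best_beta, best_err = 0, 0.0, float("inf")
        for j in range(d):
            xj = [row[j] for row in X]
            denom = sum(v * v for v in xj) or 1.0
            beta = sum(v * r for v, r in zip(xj, residual)) / denom
            err = sum((r - beta * v) ** 2 for r, v in zip(residual, xj))
            if err < best_err:
                best_j, best_beta, best_err = j, beta, err
        coefs[best_j] += lr * best_beta  # update only one component
        residual = [r - lr * best_beta * row[best_j]
                    for r, row in zip(residual, X)]
    return coefs

# Made-up data: y depends only on the first of three features.
X = [[i, (i * 7) % 5, 3 - i] for i in range(-5, 6)]
y = [2 * row[0] for row in X]
coefs = fit_componentwise(X, y)
```

Because each update touches a single feature, the fitted `coefs` can be read off as partial effects, which is the interpretability property the system above builds on.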
arXiv Detail & Related papers (2021-09-12T18:34:33Z)
- Naive Automated Machine Learning -- A Late Baseline for AutoML [0.0]
Automated Machine Learning (AutoML) is the problem of automatically finding the pipeline with the best generalization performance on some given dataset.
We present Naive AutoML, a very simple solution to AutoML that exploits important meta-knowledge about machine learning problems.
arXiv Detail & Related papers (2021-03-18T19:52:12Z)
- Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning [45.643809726832764]
We introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge.
We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits.
We also propose a solution towards truly hands-free AutoML.
arXiv Detail & Related papers (2020-07-08T12:41:03Z)
- Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL [53.40030379661183]
Auto-PyTorch is a framework to enable fully automated deep learning (AutoDL).
It combines multi-fidelity optimization with portfolio construction for warmstarting and ensembling of deep neural networks (DNNs).
We show that Auto-PyTorch performs better than several state-of-the-art competitors on average.
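Multi-fidelity optimization of this kind can be illustrated with a minimal successive-halving sketch (a toy with made-up scoring, not Auto-PyTorch's actual code): evaluate many configurations at a cheap budget, discard the worse fraction, and re-evaluate the survivors at a larger budget.

```python
import random

# Toy successive halving (illustrative, not Auto-PyTorch's code):
# cheap low-budget evaluations prune the pool before expensive ones.

def successive_halving(configs, evaluate, min_budget=1, eta=2):
    budget = min_budget
    while len(configs) > 1:
        ranked = sorted(configs, key=lambda c: evaluate(c, budget),
                        reverse=True)
        configs = ranked[: max(1, len(ranked) // eta)]  # keep top 1/eta
        budget *= eta                                   # raise the fidelity
    return configs[0]

# Hypothetical scoring: a config's true quality is its index, observed
# with noise that shrinks as the evaluation budget grows.
def noisy_score(config, budget):
    rng = random.Random(config * 1000 + budget)
    return config - rng.uniform(0.0, 1.0 / budget)

best = successive_halving(list(range(8)), noisy_score)
```

In practice such budget-allocation schemes are combined with Bayesian optimization and, as in Auto-PyTorch, warmstarted from a portfolio of known-good configurations; the sketch shows only the halving skeleton.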
arXiv Detail & Related papers (2020-06-24T15:15:17Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.