Leveraging Automated Machine Learning for Text Classification:
Evaluation of AutoML Tools and Comparison with Human Performance
- URL: http://arxiv.org/abs/2012.03575v1
- Date: Mon, 7 Dec 2020 10:31:13 GMT
- Title: Leveraging Automated Machine Learning for Text Classification:
Evaluation of AutoML Tools and Comparison with Human Performance
- Authors: Matthias Blohm, Marc Hanussek and Maximilien Kintz
- Abstract summary: This work compares four AutoML tools on 13 different popular datasets.
Results show that the AutoML tools perform better than the machine learning community in 4 out of 13 tasks.
- Score: 0.07734726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Automated Machine Learning (AutoML) has registered increasing
success with respect to tabular data. However, the question arises whether
AutoML can also be applied effectively to text classification tasks. This work
compares four AutoML tools on 13 different popular datasets, including Kaggle
competitions, and compares them against human performance. The results show that the AutoML
tools perform better than the machine learning community in 4 out of 13 tasks
and that two of the tools stand out.
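To make the evaluation setup concrete, the following is a minimal sketch of handing a text-classification dataset to an AutoML tool under a fixed time budget. The abstract does not name the four evaluated tools, so auto-sklearn is used here purely as an illustrative choice; the 20 Newsgroups subset stands in for the paper's 13 datasets, and the time limits are arbitrary.

```python
# Illustrative only: the abstract above does not name the four evaluated tools,
# so auto-sklearn serves here as an example AutoML classifier.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
import autosklearn.classification

# A small public text-classification dataset as a stand-in for the paper's benchmarks.
train = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
test = fetch_20newsgroups(subset="test", categories=["sci.space", "rec.autos"])

# Tabular-oriented AutoML tools expect numeric features, so the raw text is
# first turned into a fixed representation (TF-IDF bag-of-words).
vectorizer = TfidfVectorizer(max_features=20000)
X_train = vectorizer.fit_transform(train.data)
X_test = vectorizer.transform(test.data)

# Run the AutoML search under a fixed time budget, then score on the test split.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=600, per_run_time_limit=60
)
automl.fit(X_train, train.target)
print("accuracy:", accuracy_score(test.target, automl.predict(X_test)))
```

The explicit TF-IDF step reflects the common pattern of feeding AutoML tools a fixed text representation rather than raw strings, which is also the theme of the "Evaluation of Representation Models" entry in the related papers below.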
Related papers
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data or synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering [52.09178018466104]
We introduce Context-Aware Automated Feature Engineering (CAAFE) to generate semantically meaningful features for datasets.
Despite being methodologically simple, CAAFE improves performance on 11 out of 14 datasets.
We highlight the significance of context-aware solutions that can extend the scope of AutoML systems to semantic AutoML.
arXiv Detail & Related papers (2023-05-05T09:58:40Z)
- Towards Green Automated Machine Learning: Status Quo and Future Directions [71.86820260846369]
AutoML is being criticised for its high resource consumption.
This paper proposes Green AutoML, a paradigm to make the whole AutoML process more environmentally friendly.
arXiv Detail & Related papers (2021-11-10T18:57:27Z)
- Benchmarking Multimodal AutoML for Tabular Data with Text Fields [83.43249184357053]
We assemble 18 multimodal data tables that each contain some text fields.
Our benchmark enables researchers to evaluate their own methods for supervised learning with numeric, categorical, and text features.
arXiv Detail & Related papers (2021-11-04T09:29:16Z)
- Evaluation of Representation Models for Text Classification with AutoML Tools [0.9318327342147515]
This work compares three manually created text representations with text embeddings automatically created by AutoML tools.
Results show that the straightforward, manually created text representations perform better than the automatically created text embeddings.
arXiv Detail & Related papers (2021-06-24T07:19:44Z)
- Comparison of Automated Machine Learning Tools for SMS Spam Message Filtering [0.0]
Short Message Service (SMS) is a popular service used for communication by mobile users.
In this work, the classification performance of three automated machine learning (AutoML) tools was compared for SMS spam message filtering.
Experimental results showed that ensemble models achieved the best classification performance.
arXiv Detail & Related papers (2021-06-16T10:16:07Z)
- A Neophyte With AutoML: Evaluating the Promises of Automatic Machine Learning Tools [1.713291434132985]
This paper discusses modern Automatic Machine Learning (AutoML) tools from the perspective of a person with little prior experience in Machine Learning (ML).
There are many AutoML tools, both ready-to-use and under development, created to simplify and democratize the use of ML technologies in everyday life.
arXiv Detail & Related papers (2021-01-14T19:28:57Z)
- Can AutoML outperform humans? An evaluation on popular OpenML datasets using AutoML Benchmark [0.05156484100374058]
This paper compares four AutoML frameworks on 12 different popular datasets from OpenML.
Results show that the automated frameworks perform better than or equal to the machine learning community in 7 out of 12 OpenML tasks.
arXiv Detail & Related papers (2020-09-03T10:25:34Z)
- Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning [45.643809726832764]
We introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge.
We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits.
We also propose a solution towards truly hands-free AutoML.
arXiv Detail & Related papers (2020-07-08T12:41:03Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
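The AutoGluon-Tabular entry above advertises model training from a single line of Python; the sketch below illustrates what that call looks like with the documented TabularPredictor API. The CSV paths, the "label" column name, and the time budget are hypothetical placeholders, not values taken from any of the papers listed here.

```python
# A minimal sketch of the "single line of Python" claim from the
# AutoGluon-Tabular entry above; file names and the "label" column are hypothetical.
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")   # hypothetical file with a "label" column
test_data = TabularDataset("test.csv")

# The single fit call: AutoGluon infers the problem type, preprocesses the
# features, and trains and ensembles multiple models within the time budget.
predictor = TabularPredictor(label="label").fit(train_data, time_limit=3600)

print(predictor.evaluate(test_data))       # held-out performance metrics
```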