Evaluation of Representation Models for Text Classification with AutoML Tools
- URL: http://arxiv.org/abs/2106.12798v1
- Date: Thu, 24 Jun 2021 07:19:44 GMT
- Title: Evaluation of Representation Models for Text Classification with AutoML Tools
- Authors: Sebastian Brändle, Marc Hanussek, Matthias Blohm, and Maximilien Kintz
- Abstract summary: This work compares three manually created text representations with text embeddings created automatically by AutoML tools.
Results show that the straightforward, manually created text representations outperform the automatically created text embeddings.
- Score: 0.9318327342147515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated Machine Learning (AutoML) has gained increasing success on tabular
data in recent years. However, processing unstructured data like text is a
challenge and not widely supported by open-source AutoML tools. This work
compares three manually created text representations and text embeddings
automatically created by AutoML tools. Our benchmark includes four popular
open-source AutoML tools and eight datasets for text classification purposes.
The results show that straightforward text representations perform better than
AutoML tools with automatically created text embeddings.
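To make "straightforward text representations" concrete, here is a minimal sketch of one plausible baseline of this kind: a hand-built TF-IDF bag-of-words paired with a plain linear classifier. The use of scikit-learn, the bigram setting, and the toy data are illustrative assumptions and do not reproduce the paper's actual representations, datasets, or AutoML tools.

```python
# Hedged sketch: a manually created text representation (TF-IDF bag-of-words)
# with a simple linear classifier. All choices below are illustrative
# assumptions, not the paper's setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great product", "terrible support", "works as expected", "never again"]
labels = [1, 0, 1, 0]  # toy binary labels

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # manually chosen text representation
    LogisticRegression(max_iter=1000),    # plain downstream classifier
)
baseline.fit(texts, labels)
print(baseline.predict(["terrible product"]))
```

A feature matrix like this can be handed to an AutoML tool as ordinary numeric input; the abstract reports that such straightforward representations outperform the embeddings the tools create automatically from raw text.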
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLMs) to lessen this burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data or on synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering [52.09178018466104]
We introduce Context-Aware Automated Feature Engineering (CAAFE) to generate semantically meaningful features for datasets.
Despite being methodologically simple, CAAFE improves performance on 11 out of 14 datasets.
We highlight the significance of context-aware solutions that can extend the scope of AutoML systems to semantic AutoML.
arXiv Detail & Related papers (2023-05-05T09:58:40Z)
- An Empirical Study on the Usage of Automated Machine Learning Tools [10.901346577426542]
The popularity of automated machine learning (AutoML) tools has increased over the past few years.
Recent work performed qualitative studies on practitioners' experiences of using AutoML tools.
We conducted an empirical study to understand how ML practitioners use AutoML tools in their projects.
arXiv Detail & Related papers (2022-08-28T02:01:58Z)
- Benchmarking Multimodal AutoML for Tabular Data with Text Fields [83.43249184357053]
We assemble 18 multimodal data tables that each contain some text fields.
Our benchmark enables researchers to evaluate their own methods for supervised learning with numeric, categorical, and text features.
arXiv Detail & Related papers (2021-11-04T09:29:16Z)
- Privileged Zero-Shot AutoML [16.386335031156]
This work improves the quality of automated machine learning (AutoML) systems by using dataset and function descriptions.
We show that zero-shot AutoML reduces running and prediction times from minutes to milliseconds, consistently across datasets.
arXiv Detail & Related papers (2021-06-25T16:31:05Z)
- Comparison of Automated Machine Learning Tools for SMS Spam Message Filtering [0.0]
Short Message Service (SMS) is widely used for communication by mobile users.
In this work, the classification performance of three automated machine learning (AutoML) tools was compared for SMS spam message filtering.
Experimental results showed that ensemble models achieved the best classification performance.
arXiv Detail & Related papers (2021-06-16T10:16:07Z)
- A Neophyte With AutoML: Evaluating the Promises of Automatic Machine Learning Tools [1.713291434132985]
This paper discusses modern Automated Machine Learning (AutoML) tools from the perspective of a person with little prior experience in Machine Learning (ML).
Many AutoML tools, both ready-to-use and under development, have been created to simplify and democratize the use of ML technologies in everyday life.
arXiv Detail & Related papers (2021-01-14T19:28:57Z)
- Leveraging Automated Machine Learning for Text Classification: Evaluation of AutoML Tools and Comparison with Human Performance [0.07734726150561087]
This work compares four AutoML tools on 13 different popular datasets.
Results show that the AutoML tools outperform the machine learning community's published results on 4 of the 13 tasks.
arXiv Detail & Related papers (2020-12-07T10:31:13Z)
- Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning [45.643809726832764]
We introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge.
We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits.
We also propose a solution towards truly hands-free AutoML.
arXiv Detail & Related papers (2020-07-08T12:41:03Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate than the other frameworks tested; a brief usage sketch follows this list.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
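As a rough illustration of the "single line of Python" claim above, here is a hedged usage sketch based on the current AutoGluon-Tabular API, which may differ from the 2020 release the paper describes; the file paths and the "class" label column are placeholders, not taken from the paper.

```python
# Hedged sketch of AutoGluon-Tabular usage; reflects recent AutoGluon releases
# and may differ from the 2020 API described in the paper. File paths and the
# "class" label column are placeholders.
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")                     # placeholder path
predictor = TabularPredictor(label="class").fit(train_data)  # the one-line training call
predictions = predictor.predict(TabularDataset("test.csv"))  # placeholder path
print(predictions.head())
```

Everything beyond the `fit` call (loading data, predicting) is ordinary surrounding code; the paper's claim concerns the training call itself.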