Comparison of Automated Machine Learning Tools for SMS Spam Message
Filtering
- URL: http://arxiv.org/abs/2106.08671v1
- Date: Wed, 16 Jun 2021 10:16:07 GMT
- Authors: Waddah Saeed
- Abstract summary: Short Message Service (SMS) is a popular service used for communication by mobile users.
In this work, a classification performance comparison was conducted between three automatic machine learning (AutoML) tools for SMS spam message filtering.
Experimental results showed that ensemble models achieved the best classification performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Short Message Service (SMS) is a very popular service used for communication
by mobile users. However, this popular service can be abused to carry out
illegal activities and create security risks. Nowadays, many automatic
machine learning (AutoML) tools exist which can help domain experts and lay
users to build high-quality ML models with little or no machine learning
knowledge. In this work, a classification performance comparison was conducted
between three automatic ML tools for SMS spam message filtering. These tools
are mljar-supervised AutoML, H2O AutoML, and Tree-based Pipeline Optimization
Tool (TPOT) AutoML. Experimental results showed that ensemble models achieved
the best classification performance. The Stacked Ensemble model, which was
built using H2O AutoML, achieved the best performance in terms of Log Loss
(0.8370), true positive (1088/1116), and true negative (281/287) metrics. There
is a 19.05% improvement in Log Loss with respect to TPOT AutoML and 10.53%
improvement with respect to mljar-supervised AutoML. The satisfactory filtering
performance achieved suggests that AutoML tools can be used to automatically
determine the best-performing ML model for SMS spam message filtering.
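The counts quoted in the abstract can be turned into rates with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the denominators (1116 and 287) are the per-class totals in the test set, and it does not assume which class (spam or legitimate) the abstract treats as "positive". The variable names are hypothetical and do not come from the paper.

```python
# Counts quoted in the abstract for the H2O AutoML Stacked Ensemble.
# Assumption: each denominator is the total number of test messages in that class.
true_pos, pos_total = 1088, 1116   # correctly classified "positive" messages
true_neg, neg_total = 281, 287     # correctly classified "negative" messages

# Per-class rates: fraction of each class that was classified correctly.
true_positive_rate = true_pos / pos_total
true_negative_rate = true_neg / neg_total

# Overall accuracy over both classes combined.
accuracy = (true_pos + true_neg) / (pos_total + neg_total)

# Balanced accuracy weights both classes equally despite the class imbalance.
balanced_accuracy = (true_positive_rate + true_negative_rate) / 2

print(f"TPR: {true_positive_rate:.4f}")
print(f"TNR: {true_negative_rate:.4f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Balanced accuracy: {balanced_accuracy:.4f}")
```

Under these assumptions, both per-class rates and the overall accuracy come out at roughly 97 to 98 percent, which is consistent with the abstract's description of the filtering performance as satisfactory.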
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Position: A Call to Action for a Human-Centered AutoML Paradigm [83.78883610871867]
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML)
We argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems.
arXiv Detail & Related papers (2024-06-05T15:05:24Z)
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
- An Empirical Study on the Usage of Automated Machine Learning Tools [10.901346577426542]
The popularity of automated machine learning (AutoML) tools has increased over the past few years.
Recent work performed qualitative studies on practitioners' experiences of using AutoML tools.
We conducted an empirical study to understand how ML practitioners use AutoML tools in their projects.
arXiv Detail & Related papers (2022-08-28T02:01:58Z)
- SubStrat: A Subset-Based Strategy for Faster AutoML [5.833272638548153]
SubStrat is an AutoML optimization strategy that tackles the data size, rather than configuration space.
It wraps existing AutoML tools, and instead of executing them directly on the entire dataset, SubStrat uses a genetic-based algorithm to find a small subset.
It then employs the AutoML tool on the small subset, and finally, it refines the resulting pipeline by executing a restricted, much shorter, AutoML process on the large dataset.
arXiv Detail & Related papers (2022-06-07T07:44:06Z)
- VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition [57.06900573003609]
VolcanoML is a framework that decomposes a large AutoML search space into smaller ones.
It supports a Volcano-style execution model, akin to the one supported by modern database systems.
Our evaluation demonstrates that, not only does VolcanoML raise the level of expressiveness for search space decomposition in AutoML, it also leads to actual findings of decomposition strategies.
arXiv Detail & Related papers (2021-07-19T13:23:57Z)
- A Neophyte With AutoML: Evaluating the Promises of Automatic Machine Learning Tools [1.713291434132985]
This paper discusses modern Auto Machine Learning (AutoML) tools from the perspective of a person with little prior experience in Machine Learning (ML).
There are many AutoML tools both ready-to-use and under development, which are created to simplify and democratize usage of ML technologies in everyday life.
arXiv Detail & Related papers (2021-01-14T19:28:57Z)
- Leveraging Automated Machine Learning for Text Classification: Evaluation of AutoML Tools and Comparison with Human Performance [0.07734726150561087]
This work compares four AutoML tools on 13 different popular datasets.
Results show that the AutoML tools perform better than the machine learning community in 4 out of 13 tasks.
arXiv Detail & Related papers (2020-12-07T10:31:13Z)
- Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning [45.643809726832764]
We introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge.
We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits.
We also propose a solution towards truly hands-free AutoML.
arXiv Detail & Related papers (2020-07-08T12:41:03Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.