Automated Machine Learning for Multi-Label Classification
- URL: http://arxiv.org/abs/2402.18198v1
- Date: Wed, 28 Feb 2024 09:40:36 GMT
- Title: Automated Machine Learning for Multi-Label Classification
- Authors: Marcel Wever
- Abstract summary: We devise a novel AutoML approach for single-label classification tasks that optimizes machine learning pipelines consisting of at most two algorithms.
We investigate how well AutoML approaches that form the state of the art for single-label classification tasks scale with the increased problem complexity of AutoML for multi-label classification.
- Score: 3.2634122554914002
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automated machine learning (AutoML) aims to select and configure machine
learning algorithms and combine them into machine learning pipelines tailored
to a dataset at hand. For supervised learning tasks, most notably binary and
multinomial classification, aka single-label classification (SLC), such AutoML
approaches have shown promising results. However, the task of multi-label
classification (MLC), where data points are associated with a set of class
labels instead of a single class label, has received much less attention so
far. In the context of multi-label classification, the data-specific selection
and configuration of multi-label classifiers are challenging even for experts
in the field, as it is a high-dimensional optimization problem with multi-level
hierarchical dependencies. While the space of machine learning pipelines for SLC
is already huge, the size of the MLC search space exceeds that of SLC by
several orders of magnitude.
In the first part of this thesis, we devise a novel AutoML approach for
single-label classification tasks optimizing pipelines of machine learning
algorithms, consisting of two algorithms at most. This approach is then
extended first to optimize pipelines of unlimited length and eventually
configure the complex hierarchical structures of multi-label classification
methods. Furthermore, we investigate how well AutoML approaches that form the
state of the art for single-label classification tasks scale with the increased
problem complexity of AutoML for multi-label classification.
In the second part, we explore how methods for SLC and MLC could be
configured more flexibly to achieve better generalization performance and how
to increase the efficiency of execution-based AutoML systems.
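The abstract's claim that the MLC search space exceeds the SLC one by several orders of magnitude can be made concrete with a toy count. The sketch below uses hypothetical algorithm choices and grid sizes (not taken from the thesis): MLC meta-learners such as classifier chains wrap entire SLC pipelines and additionally choose a label order, multiplying the space.

```python
# Toy sketch of an AutoML search space (hypothetical algorithm choices and
# hyperparameter grids; real spaces also contain continuous parameters).
preprocessors = {
    "none": [{}],
    "pca": [{"n_components": n} for n in (5, 10, 20)],
}
base_learners = {
    "decision_tree": [{"max_depth": d} for d in (3, 5, 10, None)],
    "knn": [{"n_neighbors": k} for k in (1, 5, 15)],
}

def count_slc_pipelines():
    """Pipelines of at most two algorithms: an optional preprocessor
    followed by a base learner."""
    n_pre = sum(len(grid) for grid in preprocessors.values())
    n_base = sum(len(grid) for grid in base_learners.values())
    return n_pre * n_base

def count_mlc_pipelines(n_chain_orders=100):
    """MLC meta-learners wrap entire SLC pipelines: binary relevance reuses
    a single SLC pipeline, while classifier chains additionally pick one of
    n_chain_orders label orders, multiplying the space."""
    slc = count_slc_pipelines()
    return slc + slc * n_chain_orders

print(count_slc_pipelines())   # 28 configurations
print(count_mlc_pipelines())   # 2828 configurations
```

Even in this miniature space, adding a single MLC-specific choice (the label order) inflates the count a hundredfold; real MLC spaces stack several such hierarchical decisions.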
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Automated Contrastive Learning Strategy Search for Time Series [48.68664732145665]
We present an Automated Machine Learning (AutoML) practice at Microsoft that automatically learns contrastive learning strategies (AutoCL) for time series datasets and tasks.
We first construct a principled search space of size over $3\times10^{12}$, covering data augmentation, embedding transformation, contrastive pair construction, and contrastive losses.
Further, we introduce an efficient reinforcement learning algorithm, which optimizes the contrastive learning strategy (CLS) based on performance on the validation tasks, to obtain effective CLSs within the space.
arXiv Detail & Related papers (2024-03-19T11:24:14Z)
- Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms [1.4469725791865982]
This work describes the selection approach and analysis of existing AutoML frameworks regarding their capability of incorporating Quantum Machine Learning (QML) algorithms.
For that, available open-source tools are condensed into a market overview, and suitable frameworks are systematically selected via a multi-phase, multi-criteria approach.
We build an extended Automated Quantum Machine Learning (AutoQML) framework with QC-specific pipeline steps and decision characteristics for hardware and software constraints.
arXiv Detail & Related papers (2023-10-06T13:21:16Z)
- Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact Supervision [53.530957567507365]
In some real-world tasks, each training sample is associated with a candidate label set that contains one ground-truth label and some false positive labels.
In this paper, we formalize such problems as multi-instance partial-label learning (MIPL).
Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems.
arXiv Detail & Related papers (2022-12-18T03:28:51Z)
- Benchmarking AutoML algorithms on a collection of binary problems [3.3793659640122717]
This paper compares the performance of four different AutoML algorithms: Tree-based Pipeline Optimization Tool (TPOT), Auto-Sklearn, Auto-Sklearn 2, and H2O AutoML.
We confirm that AutoML can identify pipelines that perform well on all included datasets.
arXiv Detail & Related papers (2022-12-06T01:53:50Z)
- Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
- Many-Class Few-Shot Learning on Multi-Granularity Class Hierarchy [57.68486382473194]
We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.
In this paper, we leverage the class hierarchy as prior knowledge to train a coarse-to-fine classifier.
The model, "memory-augmented hierarchical-classification network (MahiNet)", performs coarse-to-fine classification where each coarse class can cover multiple fine classes.
arXiv Detail & Related papers (2020-06-28T01:11:34Z)
- Is deep learning necessary for simple classification tasks? [3.3793659640122717]
Automated machine learning (AutoML) and deep learning (DL) are two cutting-edge paradigms used to solve inductive learning tasks.
We compare AutoML and DL in the context of binary classification on 6 well-characterized public datasets.
We also evaluate a new tool for genetic programming-based AutoML that incorporates deep estimators.
arXiv Detail & Related papers (2020-06-11T18:41:47Z)
- A Robust Experimental Evaluation of Automated Multi-Label Classification Methods [0.735996217853436]
This paper approaches AutoML for multi-label classification (MLC) problems.
In MLC, each example can be simultaneously associated with several class labels.
Overall, we observe that the most prominent method is the one based on a canonical grammar-based genetic programming (GGP) search method.
arXiv Detail & Related papers (2020-05-16T20:08:04Z)
- Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines.
The experimental results include comparing AutoML-DSGE to another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE).
arXiv Detail & Related papers (2020-04-01T09:31:34Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.