Automated Machine Learning for Multi-Label Classification
- URL: http://arxiv.org/abs/2402.18198v1
- Date: Wed, 28 Feb 2024 09:40:36 GMT
- Title: Automated Machine Learning for Multi-Label Classification
- Authors: Marcel Wever
- Abstract summary: We devise a novel AutoML approach for single-label classification tasks that optimizes machine learning pipelines consisting of at most two algorithms.
We investigate how well AutoML approaches that form the state of the art for single-label classification tasks scale with the increased problem complexity of AutoML for multi-label classification.
- Score: 3.2634122554914002
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automated machine learning (AutoML) aims to select and configure machine
learning algorithms and combine them into machine learning pipelines tailored
to a dataset at hand. For supervised learning tasks, most notably binary and
multinomial classification, aka single-label classification (SLC), such AutoML
approaches have shown promising results. However, the task of multi-label
classification (MLC), where data points are associated with a set of class
labels instead of a single class label, has received much less attention so
far. In the context of multi-label classification, the data-specific selection
and configuration of multi-label classifiers are challenging even for experts
in the field, as it is a high-dimensional optimization problem with multi-level
hierarchical dependencies. While the space of machine learning pipelines for SLC
is already huge, the size of the MLC search space exceeds that of SLC by
several orders of magnitude.
In the first part of this thesis, we devise a novel AutoML approach for
single-label classification tasks optimizing pipelines of machine learning
algorithms, consisting of two algorithms at most. This approach is then
extended first to optimize pipelines of unlimited length and eventually
configure the complex hierarchical structures of multi-label classification
methods. Furthermore, we investigate how well AutoML approaches that form the
state of the art for single-label classification tasks scale with the increased
problem complexity of AutoML for multi-label classification.
In the second part, we explore how methods for SLC and MLC could be
configured more flexibly to achieve better generalization performance and how
to increase the efficiency of execution-based AutoML systems.
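The abstract's claim that the MLC search space exceeds the SLC one by several orders of magnitude can be made concrete with a toy count. The sketch below uses hypothetical algorithm choices and grid sizes (not taken from the thesis): MLC meta-learners such as classifier chains wrap entire SLC pipelines and additionally choose a label order, multiplying the space.

```python
# Toy sketch of an AutoML search space (hypothetical algorithm choices and
# hyperparameter grids; real spaces also contain continuous parameters).
preprocessors = {
    "none": [{}],
    "pca": [{"n_components": n} for n in (5, 10, 20)],
}
base_learners = {
    "decision_tree": [{"max_depth": d} for d in (3, 5, 10, None)],
    "knn": [{"n_neighbors": k} for k in (1, 5, 15)],
}

def count_slc_pipelines():
    """Pipelines of at most two algorithms: an optional preprocessor
    followed by a base learner."""
    n_pre = sum(len(grid) for grid in preprocessors.values())
    n_base = sum(len(grid) for grid in base_learners.values())
    return n_pre * n_base

def count_mlc_pipelines(n_chain_orders=100):
    """MLC meta-learners wrap entire SLC pipelines: binary relevance reuses
    a single SLC pipeline, while classifier chains additionally pick one of
    n_chain_orders label orders, multiplying the space."""
    slc = count_slc_pipelines()
    return slc + slc * n_chain_orders

print(count_slc_pipelines())   # 28 configurations
print(count_mlc_pipelines())   # 2828 configurations
```

Even in this miniature space, adding a single MLC-specific choice (the label order) inflates the count a hundredfold; real MLC spaces stack several such hierarchical decisions.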
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Automated Contrastive Learning Strategy Search for Time Series [48.68664732145665]
We present an Automated Machine Learning (AutoML) practice at Microsoft that automatically learns contrastive learning strategies (AutoCL) for time series datasets and tasks.
We first construct a principled search space of size over $3\times10^{12}$, covering data augmentation, embedding transformation, contrastive pair construction, and contrastive losses.
Further, we introduce an efficient reinforcement learning algorithm, which optimizes the contrastive learning strategy (CLS) based on performance on the validation tasks, to obtain effective CLSs within the space.
arXiv Detail & Related papers (2024-03-19T11:24:14Z)
- Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms [1.4469725791865982]
This work describes the selection approach and analysis of existing AutoML frameworks regarding their capability of incorporating Quantum Machine Learning (QML) algorithms.
For that, available open-source tools are condensed into a market overview, and suitable frameworks are systematically selected via a multi-phase, multi-criteria approach.
We build an extended Automated Quantum Machine Learning (AutoQML) framework with QC-specific pipeline steps and decision characteristics for hardware and software constraints.
arXiv Detail & Related papers (2023-10-06T13:21:16Z)
- Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact Supervision [53.530957567507365]
In some real-world tasks, each training sample is associated with a candidate label set that contains one ground-truth label and some false positive labels.
In this paper, we formalize such problems as multi-instance partial-label learning (MIPL).
Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems.
arXiv Detail & Related papers (2022-12-18T03:28:51Z)
- Benchmarking AutoML algorithms on a collection of binary problems [3.3793659640122717]
This paper compares the performance of four different AutoML algorithms: Tree-based Pipeline Optimization Tool (TPOT), Auto-Sklearn, Auto-Sklearn 2, and H2O AutoML.
We confirm that AutoML can identify pipelines that perform well on all included datasets.
arXiv Detail & Related papers (2022-12-06T01:53:50Z)
- Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
- Many-Class Few-Shot Learning on Multi-Granularity Class Hierarchy [57.68486382473194]
We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.
In this paper, we leverage the class hierarchy as prior knowledge to train a coarse-to-fine classifier.
The model, "memory-augmented hierarchical-classification network (MahiNet)", performs coarse-to-fine classification where each coarse class can cover multiple fine classes.
arXiv Detail & Related papers (2020-06-28T01:11:34Z)
- Is deep learning necessary for simple classification tasks? [3.3793659640122717]
Automated machine learning (AutoML) and deep learning (DL) are two cutting-edge paradigms used to solve inductive learning tasks.
We compare AutoML and DL in the context of binary classification on 6 well-characterized public datasets.
We also evaluate a new tool for genetic programming-based AutoML that incorporates deep estimators.
arXiv Detail & Related papers (2020-06-11T18:41:47Z)
- A Robust Experimental Evaluation of Automated Multi-Label Classification Methods [0.735996217853436]
This paper approaches AutoML for multi-label classification (MLC) problems.
In MLC, each example can be simultaneously associated with several class labels.
Overall, we observe that the most prominent method is the one based on a canonical grammar-based genetic programming (GGP) search method.
arXiv Detail & Related papers (2020-05-16T20:08:04Z)
- Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines.
The experimental results include comparing AutoML-DSGE to another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE).
arXiv Detail & Related papers (2020-04-01T09:31:34Z)
- AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data [120.2298620652828]
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models.
Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate.
arXiv Detail & Related papers (2020-03-13T23:10:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.