Adaptation Strategies for Automated Machine Learning on Evolving Data
- URL: http://arxiv.org/abs/2006.06480v3
- Date: Tue, 10 May 2022 08:52:36 GMT
- Title: Adaptation Strategies for Automated Machine Learning on Evolving Data
- Authors: Bilge Celik and Joaquin Vanschoren
- Abstract summary: This study aims to understand the effect of data stream challenges such as concept drift on the performance of AutoML methods.
We propose 6 concept drift adaptation strategies and evaluate their effectiveness on different AutoML approaches.
- Score: 7.843067454030999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated Machine Learning (AutoML) systems have been shown to efficiently
build good models for new datasets. However, it is often not clear how well
they can adapt when the data evolves over time. The main goal of this study is
to understand the effect of data stream challenges such as concept drift on the
performance of AutoML methods, and which adaptation strategies can be employed
to make them more robust. To that end, we propose 6 concept drift adaptation
strategies and evaluate their effectiveness on different AutoML approaches. We
do this for a variety of AutoML approaches for building machine learning
pipelines, including those that leverage Bayesian optimization, genetic
programming, and random search with automated stacking. These are evaluated
empirically on real-world and synthetic data streams with different types of
concept drift. Based on this analysis, we propose ways to develop more
sophisticated and robust AutoML techniques.
Related papers
- SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning [14.702694298483445]
Tree-Search Enhanced LLM Agents (SELA) is an agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process.
SELA represents pipeline configurations as trees, enabling agents to conduct experiments intelligently and iteratively refine their strategies.
In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods.
arXiv Detail & Related papers (2024-10-22T17:56:08Z)
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLMs) to lessen this burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data or synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)
- Online AutoML: An adaptive AutoML framework for online learning [6.6389732792316005]
This study aims to automate pipeline design for online learning while continuously adapting to data drift.
This system combines the inherent adaptation capabilities of online learners with the fast automated pipeline (re)optimization capabilities of AutoML.
arXiv Detail & Related papers (2022-01-24T15:37:20Z)
- Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques must be applied to maintain predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
- AutoFlow: Learning a Better Training Set for Optical Flow [62.40293188964933]
AutoFlow is a method to render training data for optical flow.
AutoFlow achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT.
arXiv Detail & Related papers (2021-04-29T17:55:23Z)
- Interpret-able feedback for AutoML systems [5.5524559605452595]
Automated machine learning (AutoML) systems aim to enable training machine learning (ML) models for non-ML experts.
A shortcoming of these systems is that when they fail to produce a model with high accuracy, the user has no path to improve the model.
We introduce an interpretable data feedback solution for AutoML.
arXiv Detail & Related papers (2021-02-22T18:54:26Z)
- Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines.
The experimental results include a comparison of AutoML-DSGE with another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE).
arXiv Detail & Related papers (2020-04-01T09:31:34Z)