Automated Machine Learning Techniques for Data Streams
- URL: http://arxiv.org/abs/2106.07317v1
- Date: Mon, 14 Jun 2021 11:42:46 GMT
- Title: Automated Machine Learning Techniques for Data Streams
- Authors: Alexandru-Ionut Imbrea
- Abstract summary: This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
- Score: 91.3755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated machine learning techniques have benefited from tremendous
research progress in recent years. These developments and the continuously growing
demand for machine learning experts led to the development of numerous AutoML tools.
However, these tools assume that the entire training dataset is available
upfront and that the underlying distribution does not change over time. These
assumptions do not hold in a data stream mining setting where an unbounded
stream of data cannot be stored and is likely to manifest concept drift.
Industry applications of machine learning on streaming data are becoming more popular
due to the increasing adoption of real-time streaming patterns in IoT,
microservices architectures, web analytics, and other fields. The research
summarized in this paper surveys the state-of-the-art open-source AutoML tools,
applies them to data collected from streams, and measures how their performance
changes over time. For comparative purposes, batch, batch incremental and
instance incremental estimators are applied and compared. Moreover, a
meta-learning technique for online algorithm selection based on meta-feature
extraction is proposed and compared while model replacement and continual
AutoML techniques are discussed. The results show that off-the-shelf AutoML
tools can provide satisfactory results but in the presence of concept drift,
detection or adaptation techniques have to be applied to maintain the
predictive accuracy over time.
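The abstract's central claim, that a model trained once degrades under concept drift unless detection or adaptation is applied, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the synthetic stream, the nearest-centroid classifier, the window-based error monitor, and the `warmup`, `window`, and `threshold` values. It compares a frozen batch model with one that is replaced whenever the recent error rate exceeds a threshold, using test-then-train (prequential) evaluation.

```python
import random
from collections import deque

def make_stream(n=2000, drift_at=1000, seed=0):
    """Synthetic stream whose labelling rule flips at `drift_at` (abrupt drift)."""
    rng = random.Random(seed)
    for t in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        y = int(x[0] > 0) if t < drift_at else int(x[0] <= 0)
        yield x, y

def fit_centroids(batch):
    """Batch-train a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in batch:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(centroids, x):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)),
    )

def prequential(stream, warmup=200, window=50, threshold=0.3, adapt=True):
    """Test-then-train evaluation; returns 0/1 hits for post-warm-up instances."""
    stream = iter(stream)
    # Initial batch model, fitted on the first `warmup` instances.
    model = fit_centroids([next(stream) for _ in range(warmup)])
    buffer = deque(maxlen=window)   # most recent labelled instances
    errors = deque(maxlen=window)   # most recent 0/1 prediction errors
    hits = []
    for x, y in stream:
        hit = int(predict(model, x) == y)   # test first ...
        hits.append(hit)
        buffer.append((x, y))               # ... then store the label
        errors.append(1 - hit)
        drift = len(errors) == window and sum(errors) / window > threshold
        if adapt and drift:
            # Model replacement: refit on the recent window only.
            model = fit_centroids(buffer)
            errors.clear()
    return hits

def post_drift_accuracy(adapt, drift_at=1000, warmup=200):
    """Accuracy on the instances that arrive after the drift point."""
    hits = prequential(make_stream(drift_at=drift_at), warmup=warmup, adapt=adapt)
    post = hits[drift_at - warmup:]
    return sum(post) / len(post)

print("frozen batch model  :", post_drift_accuracy(adapt=False))
print("detect-and-replace  :", post_drift_accuracy(adapt=True))
```

In this toy setting the frozen model keeps applying the old labelling rule after the drift, while the monitored model recovers once its window contains only post-drift instances. A fixed error threshold is the simplest possible detector; it stands in for the detection and adaptation techniques the abstract refers to and is not a substitute for them.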
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
We show in a case study for the industrial use case of price forecasting that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- Online AutoML: An adaptive AutoML framework for online learning [6.6389732792316005]
This study aims to automate pipeline design for online learning while continuously adapting to data drift.
This system combines the inherent adaptation capabilities of online learners with the fast automated pipeline (re)optimization capabilities of AutoML.
arXiv Detail & Related papers (2022-01-24T15:37:20Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Man versus Machine: AutoML and Human Experts' Role in Phishing Detection [4.124446337711138]
This paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets.
Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks.
arXiv Detail & Related papers (2021-08-27T09:26:20Z)
- AutoML to Date and Beyond: Challenges and Opportunities [30.60364966752454]
AutoML tools aim to make machine learning accessible for non-machine learning experts.
We introduce a new classification system for AutoML systems.
We lay out a roadmap for the future, pinpointing the research required to further automate the end-to-end machine learning pipeline.
arXiv Detail & Related papers (2020-10-21T06:08:21Z)
- Adaptation Strategies for Automated Machine Learning on Evolving Data [7.843067454030999]
This study aims to understand the effect of data stream challenges such as concept drift on the performance of AutoML methods.
We propose 6 concept drift adaptation strategies and evaluate their effectiveness on different AutoML approaches.
arXiv Detail & Related papers (2020-06-09T14:29:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.