DREAM: Debugging and Repairing AutoML Pipelines
- URL: http://arxiv.org/abs/2401.00379v1
- Date: Sun, 31 Dec 2023 02:45:17 GMT
- Title: DREAM: Debugging and Repairing AutoML Pipelines
- Authors: Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Chao Shen
- Abstract summary: We design and implement DREAM, an automatic debugging and repairing system for AutoML systems.
It monitors the process of AutoML to collect detailed feedback and automatically repairs bugs by expanding the search space.
Our evaluation results show that DREAM can effectively and efficiently repair AutoML bugs.
- Score: 39.83914420717843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning models have become an integrated component of modern software
systems. In response to the challenge of model design, researchers proposed
Automated Machine Learning (AutoML) systems, which automatically search for
model architecture and hyperparameters for a given task. Like other software
systems, existing AutoML systems suffer from bugs. We identify two common and
severe kinds of bugs in AutoML: performance bugs (i.e., searching for the desired
model takes an unreasonably long time) and ineffective search bugs (i.e., AutoML
systems are not able to find an accurate enough model). After analyzing the
workflow of AutoML, we observe that existing AutoML systems overlook potential
opportunities in search space, search method, and search feedback, which
results in performance and ineffective search bugs. Based on our analysis, we
design and implement DREAM, an automatic debugging and repairing system for
AutoML systems. It monitors the process of AutoML to collect detailed feedback
and automatically repairs bugs by expanding search space and leveraging a
feedback-driven search strategy. Our evaluation results show that DREAM can
effectively and efficiently repair AutoML bugs.
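The repair idea described in the abstract (monitor the search, then expand the search space and let feedback guide the next trial) can be illustrated with a minimal sketch. This is not DREAM's actual implementation: the names `toy_score`, `repairing_search`, `reserve`, and `patience`, the hyperparameters `lr` and `width`, and the stall rule are all hypothetical choices made for illustration, and the scoring function merely stands in for training and validating a model.

```python
import itertools


def toy_score(config):
    """Stand-in for training + validation; highest at lr=0.01, width=256."""
    return 1.0 - abs(config["lr"] - 0.01) * 10 - abs(config["width"] - 256) / 512


def candidates(space, tried):
    """Yield every configuration in the current space that was not tried yet."""
    keys = sorted(space)
    for values in itertools.product(*(space[k] for k in keys)):
        if values not in tried:
            yield dict(zip(keys, values))


def distance(a, b):
    """Number of hyperparameters on which two configurations differ."""
    return sum(a[k] != b[k] for k in a)


def repairing_search(space, reserve, target, budget, patience=2):
    """Toy search loop: log feedback, and widen the space when the search stalls."""
    space = {k: list(v) for k, v in space.items()}
    reserve = {k: list(v) for k, v in reserve.items()}
    history = []            # feedback log: one (config, score) pair per trial
    tried = set()
    best = None
    since_improvement = 0
    for _ in range(budget):
        pool = list(candidates(space, tried))
        if not pool or since_improvement >= patience:
            # Hypothetical "repair" step: move one reserved value per
            # hyperparameter into the search space, then rebuild the pool.
            for key in reserve:
                if reserve[key]:
                    space[key].append(reserve[key].pop(0))
            since_improvement = 0
            pool = list(candidates(space, tried))
            if not pool:
                break       # nothing left to try, even after expanding
        # Feedback-driven choice: stay close to the best configuration so far.
        if best is None:
            config = pool[0]
        else:
            config = min(pool, key=lambda c: distance(c, best[0]))
        score = toy_score(config)
        tried.add(tuple(config[k] for k in sorted(config)))
        history.append((config, score))
        if best is None or score > best[1]:
            best = (config, score)
            since_improvement = 0
        else:
            since_improvement += 1
        if best[1] >= target:
            break
    return best, history, space


if __name__ == "__main__":
    initial = {"lr": [0.1, 0.05], "width": [32, 64]}
    extra = {"lr": [0.01], "width": [256]}
    best, history, final_space = repairing_search(initial, extra, target=0.99, budget=20)
    print("best:", best, "trials:", len(history))
```

In this toy setting the initial space contains no configuration that reaches the target, so the loop can only succeed after the expansion step adds the reserved values. That mirrors, in a very simplified way, the abstract's claim that an ineffective search can stem from the search space itself.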
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Position: A Call to Action for a Human-Centered AutoML Paradigm [83.78883610871867]
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML)
We argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems.
arXiv Detail & Related papers (2024-06-05T15:05:24Z)
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
- Assessing the Use of AutoML for Data-Driven Software Engineering [10.40771687966477]
AutoML promises to automate the building of end-to-end AI/ML pipelines.
Despite the growing interest and high expectations, there is a dearth of information about the extent to which AutoML is currently adopted.
arXiv Detail & Related papers (2023-07-20T11:14:24Z)
- Efficient End-to-End AutoML via Scalable Search Space Decomposition [35.903994093222806]
VolcanoML is a framework that decomposes a large AutoML search space into smaller ones.
It supports a Volcano-style execution model, akin to the one supported by modern database systems.
Our evaluation demonstrates that, not only does VolcanoML raise the level of expressiveness for search space decomposition in AutoML, it also leads to actual findings of decomposition strategies.
arXiv Detail & Related papers (2022-06-19T14:53:29Z)
- LightAutoML: AutoML Solution for a Large Financial Services Ecosystem [108.09104876115428]
We present an AutoML system called LightAutoML developed for a large European financial services company.
Our framework was piloted and deployed in numerous applications and performed at the level of the experienced data scientists.
arXiv Detail & Related papers (2021-09-03T13:52:32Z)
- Man versus Machine: AutoML and Human Experts' Role in Phishing Detection [4.124446337711138]
This paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets.
Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks.
arXiv Detail & Related papers (2021-08-27T09:26:20Z)
- VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition [57.06900573003609]
VolcanoML is a framework that decomposes a large AutoML search space into smaller ones.
It supports a Volcano-style execution model, akin to the one supported by modern database systems.
Our evaluation demonstrates that, not only does VolcanoML raise the level of expressiveness for search space decomposition in AutoML, it also leads to actual findings of decomposition strategies.
arXiv Detail & Related papers (2021-07-19T13:23:57Z)
- Interpret-able feedback for AutoML systems [5.5524559605452595]
Automated machine learning (AutoML) systems aim to enable training machine learning (ML) models for non-ML experts.
A shortcoming of these systems is that when they fail to produce a model with high accuracy, the user has no path to improve the model.
We introduce an interpretable data feedback solution for AutoML.
arXiv Detail & Related papers (2021-02-22T18:54:26Z)
- GAMA: a General Automated Machine learning Assistant [4.035753155957698]
The General Automated Machine learning Assistant (GAMA) is a modular AutoML system developed to empower users to track and control how AutoML algorithms search for optimal machine learning pipelines.
GAMA allows users to plug in different AutoML and post-processing techniques, logs and visualizes the search process, and supports easy benchmarking.
arXiv Detail & Related papers (2020-07-09T16:16:25Z)