VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space
Decomposition
- URL: http://arxiv.org/abs/2107.08861v2
- Date: Tue, 20 Jul 2021 08:37:49 GMT
- Title: VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space
Decomposition
- Authors: Yang Li, Yu Shen, Wentao Zhang, Jiawei Jiang, Bolin Ding, Yaliang Li,
Jingren Zhou, Zhi Yang, Wentao Wu, Ce Zhang and Bin Cui
- Abstract summary: VolcanoML is a framework that decomposes a large AutoML search space into smaller ones.
It supports a Volcano-style execution model, akin to the one supported by modern database systems.
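Our evaluation demonstrates that, not only does VolcanoML raise the level of expressiveness for search space decomposition in AutoML, it also leads to actual findings of decomposition strategies that are significantly more efficient than the ones employed by state-of-the-art AutoML systems such as auto-sklearn.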
- Score: 57.06900573003609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end AutoML, which automatically searches for ML pipelines in a
space induced by feature engineering, algorithm/model selection, and
hyper-parameter tuning, has attracted intense interest from both academia and
industry. Existing AutoML systems, however, suffer from scalability issues when
applied to application domains with large, high-dimensional search spaces. We present
VolcanoML, a scalable and extensible framework that facilitates systematic
exploration of large AutoML search spaces. VolcanoML introduces and implements
basic building blocks that decompose a large search space into smaller ones,
and allows users to utilize these building blocks to compose an execution plan
for the AutoML problem at hand. VolcanoML further supports a Volcano-style
execution model - akin to the one supported by modern database systems - to
execute the plan constructed. Our evaluation demonstrates that, not only does
VolcanoML raise the level of expressiveness for search space decomposition in
AutoML, it also leads to actual findings of decomposition strategies that are
significantly more efficient than the ones employed by state-of-the-art AutoML
systems such as auto-sklearn.
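The decomposition idea in the abstract can be illustrated with a minimal sketch. This is not VolcanoML's actual API: the class names `JointBlock` and `ConditioningBlock`, the `do_next` method, the toy objectives, and the elimination rule are all assumptions made for illustration. Each block exposes a pull-style `do_next()` call, in the spirit of a Volcano-style iterator in which a parent operator pulls results from its children.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, float]


class JointBlock:
    """Hypothetical leaf block: searches one small subspace directly.

    Plain random search stands in for whatever optimizer a real system uses.
    """

    def __init__(self, name: str, space: Dict[str, Tuple[float, float]],
                 objective: Callable[[Config], float], rng: random.Random):
        self.name = name
        self.space = space
        self.objective = objective
        self.rng = rng
        self.best_score = float("-inf")
        self.best_config: Optional[Config] = None

    def do_next(self) -> float:
        """Evaluate one more configuration; return the best score so far."""
        config = {k: self.rng.uniform(lo, hi) for k, (lo, hi) in self.space.items()}
        score = self.objective(config)
        if score > self.best_score:
            self.best_score, self.best_config = score, config
        return self.best_score


class ConditioningBlock:
    """Hypothetical inner block: one child per value of a categorical choice.

    Children are pulled round-robin. After every child has received `warmup`
    pulls, the worst child is dropped (an assumed, simplified elimination rule).
    """

    def __init__(self, children: List[JointBlock], warmup: int = 3):
        self.children = list(children)
        self.warmup = warmup
        self.pulls = 0

    def best(self) -> Tuple[float, str, Optional[Config]]:
        top = max(self.children, key=lambda c: c.best_score)
        return top.best_score, top.name, top.best_config

    def do_next(self) -> float:
        child = self.children[self.pulls % len(self.children)]
        child.do_next()
        self.pulls += 1
        phase_done = self.pulls >= self.warmup * len(self.children)
        if phase_done and len(self.children) > 1:
            worst = min(self.children, key=lambda c: c.best_score)
            self.children.remove(worst)
            self.pulls = 0  # start a new phase with the survivors
        return self.best()[0]


# Toy objectives standing in for validation accuracy (made-up numbers).
def svm_score(cfg: Config) -> float:
    return 0.9 - 0.1 * (cfg["c"] - 1.0) ** 2


def knn_score(cfg: Config) -> float:
    return 0.6 - 0.1 * (cfg["k"] - 0.5) ** 2


rng = random.Random(0)
plan = ConditioningBlock([
    JointBlock("svm", {"c": (0.0, 2.0)}, svm_score, rng),
    JointBlock("knn", {"k": (0.0, 1.0)}, knn_score, rng),
])

# The caller drives the whole plan by repeatedly pulling on its root block.
for _ in range(20):
    plan.do_next()

score, algorithm, config = plan.best()
print(algorithm, round(score, 3), config)
```

In this sketch the full space (algorithm choice plus each algorithm's hyper-parameters) is never searched jointly: the root block conditions on the algorithm, and each leaf only sees its own small subspace. Composing different blocks in a different tree would give a different execution plan over the same search space.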
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
(2024-10-03T20:01:09Z)
- Position: A Call to Action for a Human-Centered AutoML Paradigm [83.78883610871867]
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML)
We argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems.
(2024-06-05T15:05:24Z)
- AutoML in Heavily Constrained Applications [24.131387687157382]
We propose CAML, which uses meta-learning to automatically adapt its own AutoML parameters.
The dynamic AutoML strategy of CAML takes user-defined constraints into account and obtains constraint-satisfying pipelines with high predictive performance.
(2023-06-29T13:05:12Z)
- OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge
Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
(2023-03-01T13:35:22Z)
- Efficient End-to-End AutoML via Scalable Search Space Decomposition [35.903994093222806]
VolcanoML is a framework that decomposes a large AutoML search space into smaller ones.
It supports a Volcano-style execution model, akin to the one supported by modern database systems.
Our evaluation demonstrates that, not only does VolcanoML raise the level of expressiveness for search space decomposition in AutoML, it also leads to actual findings of decomposition strategies.
(2022-06-19T14:53:29Z)
- SubStrat: A Subset-Based Strategy for Faster AutoML [5.833272638548153]
SubStrat is an AutoML optimization strategy that tackles the data size, rather than configuration space.
It wraps existing AutoML tools, and instead of executing them directly on the entire dataset, SubStrat uses a genetic-based algorithm to find a small subset.
It then runs the AutoML tool on the small subset and finally refines the resulting pipeline by executing a restricted, much shorter AutoML process on the large dataset.
(2022-06-07T07:44:06Z)
- PyGlove: Symbolic Programming for Automated Machine Learning [88.15565138144042]
We introduce a new way of programming AutoML based on symbolic programming.
Under this paradigm, ML programs are mutable, thus can be manipulated easily by another program.
We show that PyGlove users can easily convert a static program into a search space, quickly iterate on the search spaces and search algorithms, and craft complex search flows.
(2021-01-21T19:05:44Z)
- Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical
Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines.
The experimental results include a comparison of AutoML-DSGE with another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE).
(2020-04-01T09:31:34Z)