Dynamic Design of Machine Learning Pipelines via Metalearning
- URL: http://arxiv.org/abs/2508.13436v1
- Date: Tue, 19 Aug 2025 01:33:33 GMT
- Title: Dynamic Design of Machine Learning Pipelines via Metalearning
- Authors: Edesio Alcobaça, André C. P. L. F. de Carvalho
- Abstract summary: This paper introduces a metalearning method for dynamically designing search spaces for AutoML systems. The proposed method uses historical metaknowledge to select promising regions of the search space, accelerating the optimization process. According to experiments conducted for this study, the proposed method can reduce Random Search runtime by 89% while also shrinking the search space.
- Score: 1.1356542363919058
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated machine learning (AutoML) has democratized the design of machine learning-based systems by automating model selection, hyperparameter tuning, and feature engineering. However, the high computational cost associated with traditional search and optimization strategies, such as Random Search, Particle Swarm Optimization, and Bayesian Optimization, remains a significant challenge. Moreover, AutoML systems typically explore a large search space, which can lead to overfitting. This paper introduces a metalearning method for dynamically designing search spaces for AutoML systems. The proposed method uses historical metaknowledge to select promising regions of the search space, accelerating the optimization process. According to experiments conducted for this study, the proposed method can reduce Random Search runtime by 89% and shrink the search space, on average, from 13 to 1.8 preprocessors and from 16 to 4.3 classifiers, without significantly compromising predictive performance. Moreover, the proposed method showed competitive performance when adapted to Auto-Sklearn, reducing its search space. Furthermore, this study offers insights into meta-feature selection, meta-model explainability, and the trade-offs inherent in search space reduction strategies.
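As a concrete illustration of the idea in the abstract, the sketch below shows one hedged, minimal way such a method could work: meta-models trained on historical metaknowledge flag promising classifiers for a new dataset, and the search is then restricted to those. All names and meta-features here are invented for illustration; this is not the authors' implementation.

```python
# Hypothetical sketch of metalearning-driven search-space reduction;
# illustrative only, not the paper's actual code.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
CLASSIFIERS = ["svm", "random_forest", "knn", "naive_bayes", "mlp"]

# Historical metaknowledge: meta-features describing past datasets
# (e.g. size, dimensionality, class entropy) plus, per classifier,
# whether it was among the top performers on that dataset.
n_hist = 200
meta_features = rng.normal(size=(n_hist, 4))
was_top = rng.integers(0, 2, size=(n_hist, len(CLASSIFIERS)))

# One meta-model per classifier: "is this classifier worth searching?"
meta_models = {}
for j, name in enumerate(CLASSIFIERS):
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(meta_features, was_top[:, j])
    meta_models[name] = model

# For a new dataset, keep only components predicted to be promising,
# then hand the reduced space to Random Search (or Auto-Sklearn).
new_dataset_meta = rng.normal(size=(1, 4))
reduced_space = [name for name, model in meta_models.items()
                 if model.predict_proba(new_dataset_meta)[0, 1] > 0.5]
print("Reduced search space:", reduced_space or CLASSIFIERS)
```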
Related papers
- Automated Design Optimization via Strategic Search with Large Language Models [0.0]
AUTO is a framework that treats design optimization as a gradient-free search problem guided by strategic LLM reasoning.
It completes optimizations in approximately 8 hours at an estimated cost of up to $159 per run, compared to an estimated cost of up to $480 with median-wage software developers.
arXiv Detail & Related papers (2025-11-27T17:42:05Z)
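A minimal sketch of what a gradient-free search loop guided by an LLM can look like; the `propose_design` stub stands in for the actual LLM call, and none of the names below are AUTO's real API.

```python
# Illustrative gradient-free design search steered by a (stubbed) LLM.
import random

def evaluate(design: dict) -> float:
    """Black-box objective, e.g. a simulation; lower is better."""
    return (design["width"] - 3.0) ** 2 + (design["height"] - 2.0) ** 2

def propose_design(history: list) -> dict:
    """Stand-in for an LLM prompted with the search history and asked to
    reason strategically about the next candidate. Here: random jitter
    around the best design seen so far."""
    if not history:
        return {"width": random.uniform(0, 10), "height": random.uniform(0, 10)}
    best = min(history, key=lambda h: h[1])[0]
    return {k: v + random.gauss(0, 0.5) for k, v in best.items()}

history = []
for step in range(50):
    candidate = propose_design(history)
    history.append((candidate, evaluate(candidate)))

best_design, best_score = min(history, key=lambda h: h[1])
print(f"best score {best_score:.4f} at {best_design}")
```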
- An experimental survey and Perspective View on Meta-Learning for Automated Algorithms Selection and Parametrization [0.0]
We provide an overview of the state of the art in this continuously evolving field.
AutoML makes machine learning techniques accessible to domain scientists who are interested in applying advanced analytics.
arXiv Detail & Related papers (2025-04-08T16:51:22Z)
- A Survey of Automatic Prompt Optimization with Instruction-focused Heuristic-based Search Algorithm [13.332569343755075]
Large Language Models have led to remarkable achievements across a variety of Natural Language Processing tasks.
While manual methods can be effective, they typically rely on intuition and do not automatically refine prompts over time.
Automatic prompt optimization employing heuristic-based search algorithms can systematically explore and improve prompts with minimal human oversight.
arXiv Detail & Related papers (2025-02-26T01:42:08Z)
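A toy illustration of heuristic-based prompt search: hill climbing over textual edits with a stand-in scoring function (in practice, an LLM evaluated on held-out examples). All names here are hypothetical.

```python
# Toy hill-climbing prompt optimizer; score_prompt is a stub for a real
# evaluation such as dev-set accuracy of an LLM given the prompt.
import random

EDITS = [
    lambda p: p + " Think step by step.",
    lambda p: p + " Answer concisely.",
    lambda p: p.replace("Solve", "Carefully solve"),
]

def score_prompt(prompt: str) -> float:
    """Stub: in practice, run the prompt through an LLM on held-out
    examples and return a task metric."""
    words = prompt.split()
    return len(set(words)) / (len(words) + 1)

prompt = "Solve the following problem."
best_score = score_prompt(prompt)
for _ in range(20):                      # heuristic search loop
    candidate = random.choice(EDITS)(prompt)
    s = score_prompt(candidate)
    if s > best_score:                   # keep only improving edits
        prompt, best_score = candidate, s
print(best_score, prompt)
```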
- A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.
Deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.
This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
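The conditional computation mentioned above can be sketched briefly: a gating network routes each token to its top-k experts, so only a fraction of the parameters is active per token. The NumPy sketch below is illustrative, not any specific library's MoE implementation.

```python
# Minimal top-k MoE routing sketch: only k of n_experts run per token.
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 16, 8, 2
tokens = rng.normal(size=(4, d))                 # 4 tokens
W_gate = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

logits = tokens @ W_gate
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)   # softmax
topk = np.argsort(probs, axis=-1)[:, -k:]        # top-k experts per token

out = np.zeros_like(tokens)
for t in range(tokens.shape[0]):
    for e in topk[t]:                            # conditional computation
        out[t] += probs[t, e] * (tokens[t] @ experts[e])
print(out.shape)
```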
- Deep Memory Search: A Metaheuristic Approach for Optimizing Heuristic Search [0.0]
We introduce a novel approach called Deep Heuristic Search (DHS), which models metaheuristic search as a memory-driven process.
DHS employs multiple search layers and memory-based exploration-exploitation mechanisms to navigate large, dynamic search spaces.
arXiv Detail & Related papers (2024-10-22T14:16:49Z)
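A rough sketch of a memory-driven search of this kind, reduced to a tabu-like short-term memory that steers the exploration-exploitation balance; this is a hypothetical simplification, not the DHS algorithm itself.

```python
# Memory-driven local search: prefer unvisited neighbors (exploration),
# break ties by objective value (exploitation). Illustrative only.
import random

def objective(x: int) -> float:
    return (x - 42) ** 2

visited = set()                         # memory of explored states
current, best = 0, 0
for _ in range(200):
    neighbors = [current - 3, current - 1, current + 1, current + 3]
    fresh = [n for n in neighbors if n not in visited]   # exploration
    candidate = min(fresh or neighbors, key=objective)   # exploitation
    visited.add(candidate)
    current = candidate
    if objective(current) < objective(best):
        best = current
print(best, objective(best))
```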
- Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z)
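A sketch of a preference loss that blends logistic and exponential terms on the reward margin, in the spirit of DiscoPOP; the blending weight used here is an assumption, not the exact published formula.

```python
# Blended preference loss sketch (assumed form, not the exact DiscoPOP loss).
import torch
import torch.nn.functional as F

def blended_preference_loss(margin: torch.Tensor) -> torch.Tensor:
    """margin = beta * (log-ratio of chosen minus rejected responses)."""
    logistic = -F.logsigmoid(margin)        # DPO-style log-sigmoid loss
    exponential = torch.exp(-margin)        # exponential loss
    mix = torch.sigmoid(margin)             # adaptive blend (assumption)
    return (mix * logistic + (1 - mix) * exponential).mean()

margins = torch.tensor([0.5, -0.2, 1.3])
print(blended_preference_loss(margins))
```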
- A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML [2.06188179769701]
We present an algorithm, SHSR, that reduces the hyper-parameter search space of an AutoML tool with a negligible drop in its predictive performance.
SHSR is evaluated on 284 classification and 375 regression problems, showing an approximate 30% reduction in execution time with a performance drop of less than 0.1%.
arXiv Detail & Related papers (2023-12-11T11:26:43Z)
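A toy version of sequential space reduction: run trials, then periodically drop choices whose observed scores trail the best by more than a tolerance. This is a hypothetical simplification, not the SHSR algorithm.

```python
# Sequential search-space pruning sketch with a stubbed trial runner.
import random

SPACE = {"model": ["svm", "rf", "knn", "nb"]}
scores = {m: [] for m in SPACE["model"]}

def run_trial(model: str) -> float:          # stand-in for CV accuracy
    base = {"svm": 0.80, "rf": 0.85, "knn": 0.70, "nb": 0.60}[model]
    return base + random.gauss(0, 0.02)

for step in range(1, 41):
    model = random.choice(SPACE["model"])
    scores[model].append(run_trial(model))
    if step % 10 == 0 and len(SPACE["model"]) > 1:
        means = {m: sum(v) / len(v) for m, v in scores.items() if v}
        best = max(means.values())
        SPACE["model"] = [m for m in SPACE["model"]      # prune laggards
                          if not scores[m] or means[m] > best - 0.05]
print("Surviving space:", SPACE["model"])
```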
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the search space into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
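The tree-structured space and rejection-sampling ideas can be loosely illustrated: sample small math-expression trees and reject candidates whose canonical form has already been seen (so, e.g., g+m and m+g collide). A rough sketch, not the paper's Monte Carlo tree search.

```python
# Expression-tree sampling with equivalent-form rejection; illustrative.
import random

OPS = ["+", "*"]
LEAVES = ["g", "m"]                     # e.g. gradient, momentum

def sample_expr(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    return (random.choice(OPS), sample_expr(depth - 1), sample_expr(depth - 1))

def canonical(e):
    """Commutative ops: sort children so equivalent forms collide."""
    if isinstance(e, str):
        return e
    op, a, b = e
    left, right = sorted([canonical(a), canonical(b)])
    return f"({left}{op}{right})"

seen, kept = set(), []
for _ in range(100):
    expr = sample_expr()
    key = canonical(expr)
    if key in seen:
        continue                        # rejection: equivalent form
    seen.add(key)
    kept.append(expr)
print(len(kept), "unique expressions")
```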
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
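Over-sampling via synthetic minority samples is a standard technique that can be sketched directly: interpolate between a minority point and one of its nearest minority neighbors. The hierarchical RL decisions AutoSMOTE optimizes (how much to sample, and where) are not modeled here.

```python
# Minimal SMOTE-style over-sampling sketch.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
minority = rng.normal(loc=2.0, size=(20, 2))      # under-represented class

nn = NearestNeighbors(n_neighbors=4).fit(minority)
_, idx = nn.kneighbors(minority)                  # idx[:, 0] is the point itself

synthetic = []
for i in range(len(minority)):
    j = rng.choice(idx[i, 1:])                    # a random true neighbor
    lam = rng.random()                            # interpolation factor
    synthetic.append(minority[i] + lam * (minority[j] - minority[i]))
print(np.array(synthetic).shape)                  # 20 new minority samples
```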
- AutoSpace: Neural Architecture Search with Less Human Interference [84.42680793945007]
Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction.
We propose a novel differentiable evolutionary framework named AutoSpace, which evolves the search space to an optimal one.
With the learned search space, the performance of recent NAS algorithms can be improved significantly compared with using previously manually designed spaces.
arXiv Detail & Related papers (2021-03-22T13:28:56Z)
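A loose sketch of evolving a search space rather than an architecture: a population of operation subsets is mutated and selected by the quality reachable inside each subset. The fitness function below is a stand-in, and this toy loop is not AutoSpace's differentiable framework.

```python
# Toy evolutionary loop over candidate search spaces (operation subsets).
import random

ALL_OPS = ["conv3", "conv5", "skip", "pool", "sep3", "sep7"]
GOOD = {"conv3", "sep3", "skip"}                  # pretend these work best

def space_fitness(space):
    """Stand-in for 'best accuracy reachable inside this space'."""
    return len(set(space) & GOOD) / len(space)

population = [random.sample(ALL_OPS, 3) for _ in range(8)]
for _ in range(20):                               # evolve the search space
    population.sort(key=space_fitness, reverse=True)
    survivors = population[:4]
    children = []
    for parent in survivors:                      # mutation: swap one op
        child = parent.copy()
        child[random.randrange(len(child))] = random.choice(ALL_OPS)
        children.append(child)
    population = survivors + children
print(sorted(set(population[0])))
```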
- Evolving Search Space for Neural Architecture Search [70.71153433676024]
We present a Neural Search-space Evolution (NSE) scheme that amplifies the results from the previous effort by maintaining an optimized search space subset.
We achieve 77.3% top-1 retrain accuracy on ImageNet with 333M FLOPs, which is a state-of-the-art result.
When a latency constraint is adopted, our method also outperforms the previous best-performing mobile models, reaching 77.9% top-1 retrain accuracy.
arXiv Detail & Related papers (2020-11-22T01:11:19Z)
- RandomNet: Towards Fully Automatic Neural Architecture Design for Multimodal Learning [7.5352209570833555]
We study the effectiveness of a random search strategy for fully automated multimodal neural architecture search.
Compared to traditional methods that rely on manually crafted feature extractors, our method selects each modality from a large search space with minimal human supervision.
arXiv Detail & Related papers (2020-03-02T20:41:57Z)
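A minimal sketch of random search over a multimodal architecture space: sample an encoder spec per modality plus a fusion choice, and keep the best under a stubbed evaluator. The space below is hypothetical, not RandomNet's actual one.

```python
# Random search over a (hypothetical) multimodal architecture space.
import random

SPACE = {
    "image_encoder": ["resnet18", "resnet50", "vit_tiny"],
    "text_encoder": ["bilstm", "transformer_small"],
    "fusion": ["concat", "gated_sum", "attention"],
    "hidden_dim": [128, 256, 512],
}

def evaluate(spec: dict) -> float:
    """Stub for training the assembled network and reporting val accuracy."""
    return random.random()

best_spec, best_score = None, float("-inf")
for _ in range(30):                        # fully automatic: no hand-crafting
    spec = {k: random.choice(v) for k, v in SPACE.items()}
    score = evaluate(spec)
    if score > best_score:
        best_spec, best_score = spec, score
print(best_score, best_spec)
```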
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.