PSEO: Optimizing Post-hoc Stacking Ensemble Through Hyperparameter Tuning
- URL: http://arxiv.org/abs/2508.05144v1
- Date: Thu, 07 Aug 2025 08:25:44 GMT
- Title: PSEO: Optimizing Post-hoc Stacking Ensemble Through Hyperparameter Tuning
- Authors: Beicheng Xu, Wei Liu, Keyao Ding, Yupeng Lu, Bin Cui
- Abstract summary: We propose PSEO, a framework for post-hoc stacking ensemble optimization. PSEO achieves the best average test rank (2.96) among 16 methods, including post-hoc designs in recent AutoML systems.
- Score: 21.71071582871805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Combined Algorithm Selection and Hyperparameter Optimization (CASH) problem is fundamental in Automated Machine Learning (AutoML). Inspired by the success of ensemble learning, recent AutoML systems construct post-hoc ensembles for final predictions rather than relying on the best single model. However, while most CASH methods conduct extensive searches for the optimal single model, they typically employ fixed strategies during the ensemble phase that fail to adapt to specific task characteristics. To tackle this issue, we propose PSEO, a framework for post-hoc stacking ensemble optimization. First, we conduct base model selection through binary quadratic programming, with a trade-off between diversity and performance. Furthermore, we introduce two mechanisms to fully realize the potential of multi-layer stacking. Finally, PSEO builds a hyperparameter space and searches for the optimal post-hoc ensemble strategy within it. Empirical results on 80 public datasets show that PSEO achieves the best average test rank (2.96) among 16 methods, including post-hoc designs in recent AutoML systems and state-of-the-art ensemble learning methods.
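The abstract does not spell out the selection objective, but the base model selection step can be illustrated as a binary quadratic program that trades validation performance against pairwise redundancy. The Python sketch below is only an illustration under assumed inputs (a validation-accuracy vector and a pairwise agreement matrix); the function name, the cardinality constraint `k`, and the trade-off weight `lam` are not from the paper.

```python
# Hypothetical sketch: pick base models by trading validation accuracy against
# pairwise redundancy, phrased as a cardinality-constrained binary quadratic
# objective. Inputs, names, and weights are illustrative, not PSEO's actual code.
from itertools import combinations

import numpy as np

def select_base_models(val_acc, agreement, k=5, lam=0.5):
    """val_acc: (m,) validation accuracy per candidate model.
    agreement: (m, m) symmetric pairwise prediction agreement (1 = identical)."""
    best_subset, best_obj = None, -np.inf
    for subset in combinations(range(len(val_acc)), k):  # exact solve, small m only
        idx = np.array(subset)
        perf = val_acc[idx].sum()
        sub = agreement[np.ix_(idx, idx)]
        redundancy = (sub.sum() - np.trace(sub)) / 2      # each pair counted once
        obj = perf - lam * redundancy
        if obj > best_obj:
            best_subset, best_obj = subset, obj
    return list(best_subset)

# Example: acc = np.random.rand(8); A = np.eye(8); select_base_models(acc, A, k=3)
```

A practical implementation would hand the same quadratic objective to a BQP/QUBO solver rather than enumerate subsets, which is only feasible for small candidate pools.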
Related papers
- SPIO: Ensemble and Selective Strategies via LLM-Based Multi-Agent Planning in Automated Data Science [1.1343849658875087]
Large Language Models (LLMs) have revolutionized automated data analytics and machine learning by enabling dynamic reasoning and adaptability. We propose SPIO, a novel framework that orchestrates multi-agent planning across four key modules. In each module, dedicated planning agents independently generate candidate strategies that cascade into subsequent stages, fostering comprehensive exploration.
arXiv Detail & Related papers (2025-03-30T04:45:32Z)
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align Large Language Models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
- Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems [102.36545569092777]
We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights. Experiments demonstrate that Heterogeneous Swarms outperforms 15 role- and/or weight-based baselines by 18.5% on average across 12 tasks.
arXiv Detail & Related papers (2025-02-06T21:27:11Z)
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to represent potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- Decoding-Time Language Model Alignment with Multiple Objectives [116.42095026960598]
Existing methods primarily focus on optimizing LMs for a single reward function, limiting their adaptability to varied objectives.
Here, we propose multi-objective decoding (MOD), a decoding-time algorithm that outputs the next token from a linear combination of predictions.
We show why existing approaches can be sub-optimal even in natural settings and obtain optimality guarantees for our method.
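The summary says MOD forms its output from a linear combination of predictions at decoding time. A minimal sketch of that idea (with assumed shapes and a user-chosen weight vector; this is not the paper's actual algorithm) might look like:

```python
# Minimal sketch: combine per-objective next-token log-probabilities with
# preference weights at decoding time (shapes and weighting are assumptions).
import numpy as np

def combined_next_token(per_objective_logprobs, weights):
    """per_objective_logprobs: list of (vocab_size,) arrays, one per aligned model.
    weights: non-negative preference weights over objectives, summing to 1."""
    mix = sum(w * lp for w, lp in zip(weights, per_objective_logprobs))
    probs = np.exp(mix - mix.max())            # softmax over the combined scores
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```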
arXiv Detail & Related papers (2024-06-27T02:46:30Z)
- Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers [108.72225067368592]
We propose a novel perspective to investigate the design of large language model (LLM)-based prompt optimizers. We identify two pivotal factors in model parameter learning: update direction and update method. We develop GPO, a capable gradient-inspired prompt optimizer.
arXiv Detail & Related papers (2024-02-27T15:05:32Z)
- BOtied: Multi-objective Bayesian optimization with tied multivariate ranks [33.414682601242006]
In this paper, we show a natural connection between non-dominated solutions and the extreme quantile of the joint cumulative distribution function.
Motivated by this link, we propose the Pareto-compliant CDF indicator and the associated acquisition function, BOtied.
Our experiments on a variety of synthetic and real-world problems demonstrate that BOtied outperforms state-of-the-art MOBO acquisition functions.
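For intuition about the stated link between non-dominated solutions and the joint CDF, the sketch below computes an empirical multivariate CDF score for observed objective vectors, assuming every objective is minimized; points with the smallest score are weakly non-dominated. This is only a toy illustration of the connection, not the BOtied indicator or acquisition itself.

```python
# Toy illustration (assuming minimization): the empirical joint CDF value of a
# point is the fraction of observations that weakly dominate it, so the smallest
# values mark weakly non-dominated points. Not the paper's CDF indicator.
import numpy as np

def empirical_joint_cdf(Y):
    """Y: (n, d) array of observed objective vectors, lower is better everywhere."""
    return np.array([np.all(Y <= y, axis=1).mean() for y in Y])

# Points y whose score equals 1 / len(Y) are weakly non-dominated in the sample.
```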
arXiv Detail & Related papers (2023-06-01T04:50:06Z)
- Data-driven Mixed Integer Optimization through Probabilistic Multi-variable Branching [8.03915440701838]
We propose a Pre-trained Mixed Integer Optimization framework (PreMIO) that accelerates online mixed integer program (MIP) solving with offline datasets and machine learning models. Our method is based on a data-driven multi-variable cardinality branching procedure that splits the feasible region using hyperplanes chosen via concentration inequalities.
arXiv Detail & Related papers (2023-05-21T05:11:30Z)
- Agent-based Collaborative Random Search for Hyper-parameter Tuning and Global Function Optimization [0.0]
This paper proposes an agent-based collaborative technique for finding near-optimal values for any arbitrary set of hyperparameters in a machine learning model.
The behavior of the presented model, particularly its sensitivity to changes in its design parameters, is investigated in both machine learning and global function optimization applications.
arXiv Detail & Related papers (2023-03-03T21:10:17Z)
- DivBO: Diversity-aware CASH for Ensemble Learning [26.18421492435029]
We propose DivBO, a diversity-aware framework that injects an explicit search for diversity into CASH problems.
In the framework, we propose to use a diversity surrogate to predict the pair-wise diversity of two unseen configurations.
We show that DivBO achieves the best average ranks (1.82 and 1.73) on both validation and test errors among 10 compared methods.
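One way to picture how such a diversity surrogate could enter the search is an acquisition that adds a diversity bonus to the predicted performance of a candidate configuration relative to the configurations already chosen. The surrogate callables, the weight `beta`, and the function names below are assumptions for illustration, not DivBO's actual acquisition.

```python
# Illustrative diversity-aware scoring: predicted performance plus a bonus for
# predicted pairwise diversity w.r.t. already-selected configurations.
# Surrogate callables and the weight beta are assumptions, not DivBO's code.
import numpy as np

def diversity_aware_score(candidate, selected, perf_surrogate, div_surrogate, beta=0.3):
    """perf_surrogate(cfg) -> predicted validation accuracy of a configuration.
    div_surrogate(cfg_a, cfg_b) -> predicted pairwise diversity in [0, 1]."""
    score = perf_surrogate(candidate)
    if selected:
        score += beta * np.mean([div_surrogate(candidate, s) for s in selected])
    return score  # the next configuration to evaluate maximizes this score
```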
arXiv Detail & Related papers (2023-02-07T04:53:21Z)
- A Deep Neural Networks ensemble workflow from hyperparameter search to inference leveraging GPU clusters [0.0]
AutoML seeks to automatically build ensembles of Deep Neural Networks (DNNs) to achieve high-quality predictions.
We propose a new AutoML workflow that builds a larger library of accurate and diverse individual models, from which ensembles are then constructed.
A new ensemble selection method based on a multi-objective greedy algorithm is proposed to generate accurate ensembles.
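As background for the greedy selection step, a simplified single-objective version of Caruana-style ensemble selection is sketched below; the cited paper extends this to a multi-objective criterion, and the array shapes and defaults here are assumptions.

```python
# Simplified greedy ensemble selection (Caruana-style, single objective):
# repeatedly add the model whose inclusion most improves validation accuracy.
# The cited paper uses a multi-objective variant; shapes here are assumptions.
import numpy as np

def greedy_ensemble(val_probs, y_val, max_size=10):
    """val_probs: (n_models, n_samples, n_classes) validation probabilities."""
    chosen, running_sum = [], np.zeros_like(val_probs[0])
    for _ in range(max_size):
        accs = []
        for m in range(len(val_probs)):
            avg = (running_sum + val_probs[m]) / (len(chosen) + 1)
            accs.append((avg.argmax(axis=1) == y_val).mean())
        best = int(np.argmax(accs))
        chosen.append(best)                      # selection with replacement
        running_sum += val_probs[best]
    return chosen
```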
arXiv Detail & Related papers (2022-08-30T08:04:19Z)
- Consolidated learning -- a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV [4.370097023410272]
This paper proposes a new formulation of the tuning problem, called consolidated learning.
In such settings, we are interested in the total optimization time rather than tuning for a single task.
We demonstrate the effectiveness of this approach through an empirical study of the XGBoost algorithm and a collection of predictive tasks extracted from the MIMIC-IV medical database.
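One plausible reading of this setup is a portfolio-style tuner: configurations that performed well on earlier tasks from the same domain are replayed first on a new task before any fresh search is attempted. The sketch below encodes only that reading; the portfolio, sampler, and budget are illustrative assumptions rather than the paper's procedure.

```python
# Illustrative portfolio-first tuning: replay configurations that worked on
# similar past tasks before sampling fresh ones. An interpretation only; the
# portfolio, sampler, and budget are assumptions, not the paper's procedure.
def portfolio_first_tuning(objective, portfolio, fresh_sampler, budget=30):
    trials = [(cfg, objective(cfg)) for cfg in portfolio[:budget]]
    while len(trials) < budget:
        cfg = fresh_sampler()
        trials.append((cfg, objective(cfg)))
    return min(trials, key=lambda t: t[1])   # best (config, loss) found
```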
arXiv Detail & Related papers (2022-01-27T21:38:53Z)
- Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.