CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and
Salvageable Failure
- URL: http://arxiv.org/abs/2307.00286v1
- Date: Sat, 1 Jul 2023 09:47:59 GMT
- Title: CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and
Salvageable Failure
- Authors: Lennart Purucker, Joeran Beel
- Abstract summary: Auto-Sklearn 1 uses only low-quality validation data for post hoc ensembling.
We compared CMA-ES, a state-of-the-art gradient-free numerical optimizer, to GES on the 71 classification datasets from the AutoML benchmark for AutoGluon.
For the metric balanced accuracy, CMA-ES does not overfit and outperforms GES significantly.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many state-of-the-art automated machine learning (AutoML) systems use greedy
ensemble selection (GES) by Caruana et al. (2004) to ensemble models found
during model selection post hoc, thereby boosting predictive performance and
likely following Auto-Sklearn 1's insight that alternatives, like stacking or
gradient-free numerical optimization, overfit. Overfitting in Auto-Sklearn 1 is
much more likely than in other AutoML systems because it uses only low-quality
validation data for post hoc ensembling. Therefore, we were motivated to
analyze whether Auto-Sklearn 1's insight holds true for systems with
higher-quality validation data. Consequently, we compared the performance of
covariance matrix adaptation evolution strategy (CMA-ES), state-of-the-art
gradient-free numerical optimization, to GES on the 71 classification datasets
from the AutoML benchmark for AutoGluon. We found that Auto-Sklearn's insight
depends on the chosen metric. For the metric ROC AUC, CMA-ES overfits
drastically and is outperformed by GES -- statistically significantly for
multi-class classification. For the metric balanced accuracy, CMA-ES does not
overfit and outperforms GES significantly. Motivated by the successful
application of CMA-ES for balanced accuracy, we explored methods to stop CMA-ES
from overfitting for ROC AUC. We propose a method to normalize the weights
produced by CMA-ES, inspired by GES, that avoids overfitting for CMA-ES and
makes CMA-ES perform better than or similar to GES for ROC AUC.
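The GES-inspired weight normalization described above can be sketched in a few lines. The snippet below is an illustrative guess at such a scheme (clipping negative weights and rescaling to a convex combination, as GES's counting scheme implicitly produces), not the paper's exact procedure; `normalize_weights` and `ensemble_predict` are hypothetical helper names.

```python
import numpy as np

def normalize_weights(raw_weights):
    """GES-inspired normalization (illustrative sketch, not the paper's
    exact method): clip negative weights to zero and rescale so the
    weights form a convex combination over the base models."""
    w = np.clip(np.asarray(raw_weights, dtype=float), 0.0, None)
    total = w.sum()
    if total == 0.0:  # degenerate case: fall back to a uniform ensemble
        return np.full_like(w, 1.0 / len(w))
    return w / total

def ensemble_predict(weights, base_predictions):
    """Weighted average of base-model probability predictions.
    base_predictions: array of shape (n_models, n_samples, n_classes)."""
    w = normalize_weights(weights)
    return np.tensordot(w, base_predictions, axes=1)
```

In a CMA-ES loop (e.g. with the `ask`/`tell` interface of an evolution-strategy library), each candidate weight vector would be normalized this way before scoring on validation data, constraining the search to convex combinations.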
Related papers
- Self-DenseMobileNet: A Robust Framework for Lung Nodule Classification using Self-ONN and Stacking-based Meta-Classifier [1.2300841481611335]
Self-DenseMobileNet is designed to enhance the classification of nodules and non-nodules in chest radiographs (CXRs).
Our framework integrates advanced image standardization and enhancement techniques to optimize the input quality.
When tested on an external dataset, the framework maintained strong generalizability with an accuracy of 89.40%.
arXiv Detail & Related papers (2024-10-16T14:04:06Z)
- MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose an adaptable personalized car-following framework - MetaFollower.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various CF events.
We additionally combine Long Short-Term Memory (LSTM) and Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
arXiv Detail & Related papers (2024-06-23T15:30:40Z)
- Ransomware detection using stacked autoencoder for feature selection [0.0]
The study meticulously analyzes the autoencoder's learned weights and activations to identify essential features for distinguishing ransomware families from other malware.
The proposed model achieves an exceptional 99% accuracy in ransomware classification, surpassing the Extreme Gradient Boosting (XGBoost) algorithm.
arXiv Detail & Related papers (2024-02-17T17:31:48Z)
- Online Hyperparameter Optimization for Class-Incremental Learning [99.70569355681174]
Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase.
An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should keep stable to retain old knowledge and keep plastic to absorb new knowledge.
We propose an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori.
arXiv Detail & Related papers (2023-01-11T17:58:51Z)
- Outlier-Insensitive Kalman Filtering Using NUV Priors [24.413595920205907]
In practice, observations are corrupted by outliers, severely impairing the Kalman filter's (KF's) performance.
In this work, an outlier-insensitive KF is proposed, in which robustness is achieved by modeling each potential outlier as a normally distributed random variable with unknown variance (NUV).
The NUV variances are estimated online, using both expectation-maximization (EM) and alternating maximization (AM).
arXiv Detail & Related papers (2022-10-12T11:00:13Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
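The over-sampling idea underlying AutoSMOTE can be illustrated with a minimal classic-SMOTE sketch: each synthetic minority sample is a random interpolation between a minority point and one of its k nearest minority neighbors. This is a simplification (AutoSMOTE itself learns the sampling decisions jointly with hierarchical RL), and `smote_sample` is a hypothetical name.

```python
import numpy as np

def smote_sample(minority, n_new, k=5, rng=None):
    """Minimal SMOTE-style over-sampling sketch: generate n_new synthetic
    points by interpolating between minority samples and their k nearest
    minority-class neighbors."""
    rng = np.random.default_rng(rng)
    X = np.asarray(minority, dtype=float)
    # pairwise distances within the minority class; exclude self-matches
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        j = neighbors[i, rng.integers(min(k, len(X) - 1))]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)
```

Because each synthetic point lies on a segment between two existing minority samples, the generated data stays inside the convex hull of the minority class.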
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
- Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence [66.83161885378192]
Area under ROC (AUROC) and precision-recall curves (AUPRC) are common metrics for evaluating classification performance for imbalanced problems.
We propose a technical method to optimize AUPRC for deep learning.
arXiv Detail & Related papers (2021-04-18T06:22:21Z)
- Robusta: Robust AutoML for Feature Selection via Reinforcement Learning [24.24652530951966]
We propose the first robust AutoML framework, Robusta, which is based on reinforcement learning (RL).
We show that the framework is able to improve the model robustness by up to 22% while maintaining competitive accuracy on benign samples.
arXiv Detail & Related papers (2021-01-15T03:12:29Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Equipment Failure Analysis for Oil and Gas Industry with an Ensemble Predictive Model [0.0]
We show that the proposed solution performs much better when trained with the sequential minimal optimization (SMO) algorithm.
The classification performance of this predictive model is considerably better than that of the SVM, both with and without SMO training.
arXiv Detail & Related papers (2020-12-30T04:14:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.