Algorithm Selection with Probing Trajectories: Benchmarking the Choice of Classifier Model
- URL: http://arxiv.org/abs/2501.11414v1
- Date: Mon, 20 Jan 2025 11:28:45 GMT
- Title: Algorithm Selection with Probing Trajectories: Benchmarking the Choice of Classifier Model
- Authors: Quentin Renau, Emma Hart
- Abstract summary: We conduct a benchmark study using 17 different classifiers and three types of trajectory on a classification task over the BBOB benchmark suite.
We find that the choice of classifier has a significant impact, showing that feature-based and interval-based models are the best choices.
- Score: 0.20718016474717196
- Abstract: Recent approaches to training algorithm selectors in the black-box optimisation domain have advocated for the use of training data that is algorithm-centric, in order to encapsulate information about how an algorithm performs on an instance rather than relying on information derived from features of the instance itself. Probing trajectories, which consist of a sequence of objective performance values per function evaluation obtained from a short run of an algorithm, have recently shown particular promise in training accurate selectors. However, training models on this type of data requires an appropriately chosen classifier, given the sequential nature of the data, and there are currently no clear guidelines for choosing the most appropriate classifier for algorithm selection using time-series data from the plethora of models available. To address this, we conduct a large benchmark study using 17 different classifiers and three types of trajectory on a classification task over the BBOB benchmark suite, with both leave-one-instance-out and leave-one-problem-out cross-validation. In contrast to previous studies using tabular data, we find that the choice of classifier has a significant impact, showing that feature-based and interval-based models are the best choices.
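As a concrete illustration of the setup the abstract describes, here is a minimal sketch assuming synthetic data and scikit-learn: each probing trajectory is reduced to a few summary-statistic features, and a feature-based classifier is scored with leave-one-instance-out cross-validation, where all runs on a given instance form one test fold. The trajectory values, labels, and instance ids are fabricated stand-ins; the paper's 17 classifiers and BBOB data are not reproduced here.

```python
# Minimal sketch (not the authors' code): algorithm selection as
# classification over probing trajectories, evaluated with
# leave-one-instance-out cross-validation. All data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_runs, traj_len = 200, 50                 # hypothetical: 200 short runs, 50 evals each
trajectories = rng.random((n_runs, traj_len)).cumsum(axis=1)  # fake objective values
labels = rng.integers(0, 3, n_runs)        # best algorithm out of 3 (fake)
instances = rng.integers(0, 24, n_runs)    # BBOB-style instance ids

# Feature-based representation: summary statistics per trajectory.
def featurise(t):
    return np.array([t.min(), t.mean(), t.std(), t[-1] - t[0]])

X = np.apply_along_axis(featurise, 1, trajectories)

# Leave-one-instance-out: all runs on one instance form the test fold.
scores = cross_val_score(RandomForestClassifier(random_state=0), X, labels,
                         cv=LeaveOneGroupOut(), groups=instances)
print(f"mean LOIO accuracy: {scores.mean():.3f}")
```

Dedicated feature- or interval-based time-series classifiers (e.g. from libraries such as aeon or sktime) would slot in where the random forest sits; swapping the groups from instance ids to problem ids gives the leave-one-problem-out variant.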
Related papers
- TSDS: Data Selection for Task-Specific Model Finetuning [39.19448080265558]
The efficacy of task-specific finetuning largely depends on the selection of appropriate training data.
We present TSDS (Task-Specific Data Selection), a framework to select data for task-specific model finetuning.
We show that instruction tuning using data selected by our method with a 1% selection ratio often outperforms using the full dataset.
arXiv Detail & Related papers (2024-10-15T05:54:17Z)
- Interpretable classifiers for tabular data via discretization and feature selection [4.195816579137846]
We introduce a method for computing immediately human-interpretable yet accurate classifiers from tabular data.
We demonstrate the approach via 12 experiments, obtaining results with accuracies comparable to ones obtained via random forests, XGBoost, and existing results for the same datasets in the literature.
arXiv Detail & Related papers (2024-02-08T13:58:16Z)
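The discretise-then-select recipe lends itself to a short sketch. The following is only an illustration of the general idea with scikit-learn components (a quantile discretiser, mutual-information feature selection, a depth-limited tree), not the paper's actual algorithm; the dataset and hyperparameters are arbitrary.

```python
# Illustrative sketch: discretise features, keep a handful of informative
# ones, then fit a small (hence readable) tree. Not the paper's exact method.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = make_pipeline(
    KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="quantile"),
    SelectKBest(mutual_info_classif, k=5),               # keep 5 features
    DecisionTreeClassifier(max_depth=3, random_state=0), # small = readable
)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
print(export_text(clf[-1]))  # the tree itself is the interpretable classifier
```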
- DsDm: Model-Aware Dataset Selection with Datamodels [81.01744199870043]
Standard practice is to filter for examples that match human notions of data quality.
We find that selecting according to similarity with "high quality" data sources may not increase (and can even hurt) performance compared to randomly selecting data.
Our framework avoids handpicked notions of data quality, and instead models explicitly how the learning process uses train datapoints to predict on the target tasks.
arXiv Detail & Related papers (2024-01-23T17:22:00Z)
- Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that repeatedly alternates time-consuming model training with batch data selection.
FreeSel bypasses this heavy batch selection process, achieving a significant improvement in efficiency and running 530x faster than existing active learning methods (a training-free selection sketch follows below).
arXiv Detail & Related papers (2023-09-29T15:50:14Z)
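For intuition, a generic training-free strategy in the same spirit, though explicitly not FreeSel's actual algorithm, picks a diverse subset by farthest-point sampling in a frozen feature space. The features below are random stand-ins for embeddings from a general-purpose pretrained model.

```python
# Not FreeSel itself: a generic, training-free selection sketch that picks
# diverse samples by farthest-point sampling over fixed features.
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(10_000, 128))     # stand-in for frozen model features
budget = 100                                # annotation budget

selected = [int(rng.integers(len(feats)))]  # random seed point
d = np.linalg.norm(feats - feats[selected[0]], axis=1)
for _ in range(budget - 1):
    nxt = int(d.argmax())                   # farthest from everything chosen
    selected.append(nxt)
    d = np.minimum(d, np.linalg.norm(feats - feats[nxt], axis=1))
print(f"chose {len(selected)} samples in a single pass over fixed features")
```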
- Active Learning for Non-Parametric Choice Models [9.737139416043949]
We study the problem of actively learning a non-parametric choice model based on consumers' decisions.
We present a negative result showing that such choice models may not be identifiable.
arXiv Detail & Related papers (2022-08-05T18:26:33Z)
- Automated Algorithm Selection: from Feature-Based to Feature-Free Approaches [0.5801044612920815]
We propose a novel technique for algorithm selection, applicable to optimisation in which there is implicit sequential information encapsulated in the data.
We train two types of recurrent neural networks to predict a packing heuristic in online bin-packing, selecting from four well-known heuristics (a minimal RNN sketch follows below).
arXiv Detail & Related papers (2022-03-24T23:59:50Z)
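A minimal sketch of the feature-free idea: hand the raw sequence to a recurrent network that outputs a choice among candidate algorithms. The architecture, shapes, and data below are illustrative assumptions (PyTorch), not the paper's exact setup.

```python
# Sketch of feature-free selection: an LSTM reads raw sequential data and
# emits logits over candidate algorithms. Everything here is illustrative.
import torch
import torch.nn as nn

class SelectorRNN(nn.Module):
    def __init__(self, n_algorithms=4, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_algorithms)

    def forward(self, seq):                  # seq: (batch, time, 1)
        _, (h, _) = self.lstm(seq)
        return self.head(h[-1])              # logits over candidate algorithms

model = SelectorRNN()
x = torch.randn(8, 50, 1)                    # 8 fake sequences, 50 steps each
y = torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                              # one illustrative training step
```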
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Experiments indicate that the proposed algorithm is more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- Pattern Sampling for Shapelet-based Time Series Classification [4.94950858749529]
Subsequence-based time series classification algorithms provide accurate and interpretable models.
These algorithms are based on exhaustive search for highly discriminative subsequences.
Pattern sampling has been proposed as an effective alternative to mitigate the pattern explosion phenomenon (illustrated in the sketch below).
arXiv Detail & Related papers (2021-02-16T23:35:10Z)
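The following sketch illustrates the shapelet idea with random pattern sampling in place of exhaustive search: sample candidate subsequences, then represent each series by its minimum distance to every candidate. The data is synthetic and the sampling scheme is a simplification of what the paper proposes.

```python
# Random shapelet sampling sketch: candidate subsequences become features
# via each series' minimum sliding-window distance to them.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))            # 60 fake series of length 100
y = rng.integers(0, 2, 60)

def sample_shapelets(series, n=30, length=12):
    idx = rng.integers(0, len(series), n)
    starts = rng.integers(0, series.shape[1] - length, n)
    return np.stack([series[i, s:s + length] for i, s in zip(idx, starts)])

def min_dists(series, shapelets):
    L = shapelets.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(series, L, axis=1)
    # distance of every window to every shapelet; keep the minimum per series
    d = np.linalg.norm(windows[:, :, None, :] - shapelets[None, None], axis=-1)
    return d.min(axis=1)

shapelets = sample_shapelets(X)
clf = LogisticRegression(max_iter=1000).fit(min_dists(X, shapelets), y)
```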
- Online Active Model Selection for Pre-trained Classifiers [72.84853880948894]
We design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round.
Our algorithm can be used for online prediction tasks for both adversarial and stochastic streams (a simplified sketch follows below).
arXiv Detail & Related papers (2020-10-19T19:53:15Z)
- Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis [75.64261155172856]
Survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
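To illustrate the ingredient rather than Run2Survive itself: censored runs (those cut off at the time budget) can be fed to a parametric survival model, here a Weibull fit via the lifelines library (assumed installed), whose runtime estimates could then feed a decision rule.

```python
# Censored-runtime modelling sketch, not Run2Survive itself: fit a Weibull
# runtime distribution where unfinished runs count as censored observations.
import pandas as pd
from lifelines import WeibullFitter

runs = pd.DataFrame({
    "runtime":  [3.1, 10.0, 7.4, 10.0, 1.2, 5.6],   # seconds; fake data
    "finished": [1,   0,    1,   0,    1,   1],      # 0 = hit cutoff (censored)
})
wf = WeibullFitter().fit(runs["runtime"], event_observed=runs["finished"])
print("median runtime estimate:", wf.median_survival_time_)
```

Fitting one such model per algorithm and comparing, say, expected or median runtimes is the kind of distributional comparison a decision-theoretic selector can build on.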
- Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection for sequence prediction.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
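A generic Bayesian-optimisation loop conveys the core mechanic, a GP surrogate plus an expected-improvement acquisition, though the paper's deep-kernel, joint multi-step formulation is considerably more involved; the objective below is a stand-in for validating a model configuration.

```python
# Generic BO loop sketch: GP surrogate + expected improvement over a 1-D
# configuration space. The objective is a synthetic stand-in.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def score(cfg):                      # stand-in for validating a model config
    return -(cfg - 0.3) ** 2

X = list(np.random.default_rng(0).random(3))
y = [score(x) for x in X]
grid = np.linspace(0, 1, 200)
for _ in range(10):
    gp = GaussianProcessRegressor().fit(np.array(X)[:, None], y)
    mu, sd = gp.predict(grid[:, None], return_std=True)
    imp = mu - max(y)
    z = imp / (sd + 1e-9)
    ei = np.where(sd > 0, imp * norm.cdf(z) + sd * norm.pdf(z), 0.0)
    x_next = grid[int(ei.argmax())]          # most promising config to try
    X.append(x_next)
    y.append(score(x_next))
print("best config found:", X[int(np.argmax(y))])
```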