RF+clust for Leave-One-Problem-Out Performance Prediction
- URL: http://arxiv.org/abs/2301.09524v2
- Date: Tue, 24 Jan 2023 09:38:54 GMT
- Title: RF+clust for Leave-One-Problem-Out Performance Prediction
- Authors: Ana Nikolikj, Carola Doerr, Tome Eftimov
- Abstract summary: We study leave-one-problem-out (LOPO) performance prediction.
We analyze whether standard random forest (RF) model predictions can be improved by calibrating them with a weighted average of performance values.
- Score: 0.9281671380673306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Per-instance automated algorithm configuration and selection have gained
significant momentum in evolutionary computation in recent years. Two crucial,
sometimes implicit, ingredients for these automated machine learning (AutoML)
methods are 1) feature-based representations of the problem instances and 2)
performance prediction methods that take the features as input to estimate how
well a specific algorithm instance will perform on a given problem instance.
Unsurprisingly, common machine learning models fail to make predictions for
instances whose feature-based representation is underrepresented or not covered
in the training data, resulting in poor generalization ability of the models
for problems not seen during training. In this work, we study
leave-one-problem-out (LOPO) performance prediction. We analyze whether
standard random forest (RF) model predictions can be improved by calibrating
them with a weighted average of performance values obtained by the algorithm on
problem instances that are sufficiently close to the problem for which a
performance prediction is sought, measured by cosine similarity in feature
space. While our RF+clust approach obtains more accurate performance prediction
for several problems, its predictive power crucially depends on the chosen
similarity threshold as well as on the feature portfolio for which the cosine
similarity is measured, thereby opening a new angle for feature selection in a
zero-shot learning setting, as LOPO is termed in machine learning.
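The calibration idea described in the abstract can be sketched in a few lines: train a random forest on all problems except the held-out one, then blend its prediction with a similarity-weighted average of the performance values observed on training problems whose feature vectors are cosine-similar to the held-out problem. The similarity threshold, the RF hyperparameters, and the 50/50 blend below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rf_clust_predict(X_train, y_train, x_test, sim_threshold=0.9):
    """Sketch of the RF+clust idea for one LOPO fold.

    X_train: landscape-feature vectors of the training problems
    y_train: observed algorithm performance on those problems
    x_test:  feature vector of the held-out problem
    """
    # Standard RF performance model fitted on the training problems.
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(X_train, y_train)
    rf_pred = rf.predict(x_test.reshape(1, -1))[0]

    # Cosine similarity between the held-out problem and each training problem.
    norms = np.linalg.norm(X_train, axis=1) * np.linalg.norm(x_test)
    sims = (X_train @ x_test) / norms

    # Keep only problems that are "sufficiently close" in feature space.
    close = sims >= sim_threshold
    if not close.any():
        # No similar problems: fall back to the plain RF prediction.
        return rf_pred

    # Similarity-weighted average of the performances on close problems.
    clust_pred = np.average(y_train[close], weights=sims[close])

    # Calibrate the RF prediction with the cluster estimate
    # (an equal-weight blend is one possible combination rule).
    return 0.5 * (rf_pred + clust_pred)
```

As the abstract notes, the prediction quality of such a scheme hinges on the choice of `sim_threshold` and on which landscape features enter the cosine similarity.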
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that existing machine unlearning techniques do not hold up in this more challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Assessing the Generalizability of a Performance Predictive Model [0.6070952062639761]
We propose a workflow to estimate the generalizability of a predictive model for algorithm performance.
The results show that generalizability patterns in the landscape feature space are reflected in the performance space.
arXiv Detail & Related papers (2023-05-31T12:50:44Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Tokenization Consistency Matters for Generative Models on Extractive NLP
Tasks [54.306234256074255]
We identify the issue of tokenization inconsistency that is commonly neglected in training generative models.
This issue damages the extractive nature of these tasks after the input and output are tokenized inconsistently.
We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets.
arXiv Detail & Related papers (2022-12-19T23:33:21Z) - An Advantage Using Feature Selection with a Quantum Annealer [0.0]
Feature selection is a technique in statistical prediction modeling that identifies features in a record with a strong statistical connection to the target variable.
This paper tests this intuition against classical methods by utilizing open-source data sets and evaluating the efficacy of each trained statistical model.
arXiv Detail & Related papers (2022-11-17T18:32:26Z) - Explainable Landscape Analysis in Automated Algorithm Performance
Prediction [0.0]
We investigate the expressiveness of problem landscape features utilized by different supervised machine learning models in automated algorithm performance prediction.
The experimental results point out that the selection of the supervised ML method is crucial, since different supervised ML regression models utilize the problem landscape features differently.
arXiv Detail & Related papers (2022-03-22T15:54:17Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z) - Quantum-Assisted Feature Selection for Vehicle Price Prediction Modeling [0.0]
We study metrics for encoding the search as a binary model, such as Generalized Mean Information Coefficient and Pearson Correlation Coefficient.
We achieve accuracy scores of 0.9 for finding optimal subsets on synthetic data using a new metric that we define.
Our findings show that by leveraging quantum-assisted routines we find solutions that increase the quality of predictive model output.
arXiv Detail & Related papers (2021-04-08T20:48:44Z) - Landscape-Aware Fixed-Budget Performance Regression and Algorithm
Selection for Modular CMA-ES Variants [1.0965065178451106]
We show that it is possible to achieve high-quality performance predictions with off-the-shelf supervised learning approaches.
We test this approach on a portfolio of very similar algorithms, which we choose from the family of modular CMA-ES algorithms.
arXiv Detail & Related papers (2020-06-17T13:34:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.