Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions
- URL: http://arxiv.org/abs/2102.00736v1
- Date: Mon, 1 Feb 2021 10:04:28 GMT
- Title: Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions
- Authors: Quentin Renau, Johann Dreo, Carola Doerr and Benjamin Doerr
- Abstract summary: We show that a surprisingly small number of features -- often fewer than four -- can suffice to achieve 98% accuracy.
We show that the classification accuracy transfers to settings in which several instances are involved in training and testing.
- Score: 4.932130498861987
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facilitated by recent advances in Machine Learning (ML), the
automated design of optimization heuristics is currently shaking up
evolutionary computation (EC). Whereas hand-picked guidelines for choosing the
most suitable heuristic long dominated research activities in the field,
automatically trained heuristics are now seen to outperform human-derived
choices even for well-researched optimization tasks. ML-based EC is therefore
no longer a futuristic vision, but an integral part of our community.
A key criticism that ML-based heuristics are often faced with is their
potential lack of explainability, which may hinder future developments. This
applies in particular to supervised learning techniques which extrapolate
algorithms' performance based on exploratory landscape analysis (ELA). In such
applications, it is not uncommon to use dozens of problem features to build the
models underlying the specific algorithm selection or configuration task. Our
goal in this work is to analyze whether this many features are indeed needed.
Using the classification of the BBOB test functions as a testbed, we show that
a surprisingly small number of features -- often fewer than four -- can
suffice to achieve 98% accuracy. Interestingly, the number of features
required to meet
this threshold is found to decrease with the problem dimension. We show that
the classification accuracy transfers to settings in which several instances
are involved in training and testing. In the leave-one-instance-out setting,
however, classification accuracy drops significantly, and the
transformation-invariance of the features becomes a decisive success factor.
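The setup the abstract describes is easy to prototype. The sketch below is a
minimal illustration, not the authors' code: it assumes the ELA feature matrix
has already been computed (e.g., with the flacco or pflacco packages),
substitutes random placeholder data for it, and greedily adds whichever
feature most improves cross-validated accuracy until the 98% threshold is met
or no candidate helps. Greedy forward selection is one plausible selector
here, not necessarily the paper's.
```python
# Greedy forward feature selection for classifying BBOB functions from ELA
# features (sketch). X and y below are random placeholders standing in for
# a precomputed ELA feature matrix and the 24 BBOB function labels.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(240, 50))        # placeholder: 240 samples x 50 ELA features
y = np.repeat(np.arange(24), 10)      # placeholder: 24 BBOB function labels

selected, remaining, best_acc = [], list(range(X.shape[1])), 0.0
while remaining and best_acc < 0.98:
    # score every candidate feature when added to the current subset
    scores = {
        j: cross_val_score(
            RandomForestClassifier(n_estimators=50, random_state=0),
            X[:, selected + [j]], y, cv=5).mean()
        for j in remaining
    }
    j_best = max(scores, key=scores.get)
    if scores[j_best] <= best_acc:    # no candidate improves accuracy: stop
        break
    best_acc = scores[j_best]
    selected.append(j_best)
    remaining.remove(j_best)

print(f"{len(selected)} features reach {best_acc:.1%} CV accuracy: {selected}")
```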
Related papers
- Machine learning meets the CHSH scenario [0.0]
We focus on assessing the usefulness and effectiveness of the machine learning (ML) approach.
We consider a wide selection of approaches, ranging from simple data science models to dense neural networks.
We conclude that while it is relatively easy to achieve good performance on average, it is hard to train a model that performs well on the "hard" cases.
arXiv Detail & Related papers (2024-07-19T15:16:31Z)
- LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
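As a rough illustration of the LLM-Select idea above, the sketch below asks a
model to rate each candidate feature's relevance to a prediction target and
keeps the top-rated ones. `ask_llm` is a hypothetical stand-in for any
chat-completion client, and the prompt wording is illustrative, not the
paper's.
```python
# LLM-based feature scoring (sketch): one relevance query per feature.
from typing import Callable

def llm_feature_scores(features: list[str], target: str,
                       ask_llm: Callable[[str], str]) -> dict[str, float]:
    scores = {}
    for name in features:
        prompt = (f"On a scale from 0 to 1, how important is the feature "
                  f"'{name}' for predicting '{target}'? Answer with a number only.")
        try:
            scores[name] = float(ask_llm(prompt).strip())
        except ValueError:
            scores[name] = 0.0  # unparseable answer: treat as uninformative
    return scores

# Example with a dummy model that rates every feature 0.5:
scores = llm_feature_scores(["age", "zip_code"], "hospital readmission",
                            ask_llm=lambda prompt: "0.5")
top_k = sorted(scores, key=scores.get, reverse=True)[:1]
print(top_k)
```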
- Leaving the Nest: Going Beyond Local Loss Functions for Predict-Then-Optimize [57.22851616806617]
We show that our method achieves state-of-the-art results in four domains from the literature.
Our approach outperforms the best existing method by nearly 200% when the localness assumption is broken.
arXiv Detail & Related papers (2023-05-26T11:17:45Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- DoE2Vec: Deep-learning Based Features for Exploratory Landscape Analysis [0.0]
We propose DoE2Vec, a variational autoencoder (VAE)-based methodology to learn optimization landscape characteristics.
Unlike the classical exploratory landscape analysis (ELA) method, our approach does not require any feature engineering.
For validation, we inspect the quality of latent reconstructions and analyze the latent representations using different experiments.
arXiv Detail & Related papers (2023-03-31T09:38:44Z)
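To make the DoE2Vec mechanism above concrete: evaluate every function on a
shared design of experiments and train an autoencoder on the resulting
objective-value vectors, so the latent code acts as a learned landscape
feature vector. In this schematic sketch a plain autoencoder and a toy
function family stand in for the paper's VAE and BBOB setup.
```python
# Autoencoder over design-of-experiments samples (sketch of the DoE2Vec idea).
import torch
import torch.nn as nn

torch.manual_seed(0)
n_points, dim, latent = 64, 2, 8
doe = torch.rand(n_points, dim)          # shared design of experiments

def sample_function_vector(a, b):
    # toy family of landscapes; each (a, b) plays the role of one function
    return torch.sin(a * doe[:, 0]) + b * doe[:, 1] ** 2

X = torch.stack([sample_function_vector(a, b)
                 for a in torch.linspace(1, 5, 20)
                 for b in torch.linspace(-1, 1, 10)])
X = (X - X.mean()) / X.std()             # normalize objective values

encoder = nn.Sequential(nn.Linear(n_points, 32), nn.ReLU(), nn.Linear(32, latent))
decoder = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, n_points))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for epoch in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X)), X)
    loss.backward()
    opt.step()

features = encoder(X).detach()           # learned landscape features, no ELA
print(features.shape)                    # (200, 8)
```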
- FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions [7.674715791336311]
We propose a new, flexible and ultra-efficient approach to perform feature selection in a sparse function-on-function regression problem.
We show how to extend it to the scalar-on-function framework.
We present an application to brain fMRI data from the AOMIC PIOP1 study.
arXiv Detail & Related papers (2023-03-26T19:41:17Z)
- RF+clust for Leave-One-Problem-Out Performance Prediction [0.9281671380673306]
We study leave-one-problem-out (LOPO) performance prediction.
We analyze whether standard random forest (RF) model predictions can be improved by calibrating them with a weighted average of performance values.
arXiv Detail & Related papers (2023-01-23T16:14:59Z)
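One plausible reading of the calibration above, sketched below: blend the
random-forest prediction for an unseen problem with a distance-weighted
average of the known performance values of its nearest training problems in
feature space. The inverse-distance weights and the `alpha` blend factor are
assumptions for illustration, not the paper's exact rule.
```python
# RF prediction calibrated with neighbor performance values (sketch).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rf_plus_clust_predict(rf, X_train, y_train, x_new, k=3, alpha=0.5):
    rf_pred = rf.predict(x_new.reshape(1, -1))[0]
    dists = np.linalg.norm(X_train - x_new, axis=1)
    nearest = np.argsort(dists)[:k]
    w = 1.0 / (dists[nearest] + 1e-12)        # inverse-distance weights
    neighbor_avg = np.average(y_train[nearest], weights=w)
    return alpha * rf_pred + (1 - alpha) * neighbor_avg

rng = np.random.default_rng(1)
X_train = rng.normal(size=(30, 8))            # ELA features of known problems
y_train = rng.normal(size=30)                 # observed algorithm performance
rf = RandomForestRegressor(random_state=0).fit(X_train, y_train)
print(rf_plus_clust_predict(rf, X_train, y_train, rng.normal(size=8)))
```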
- Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability of the user to click on an item, has become increasingly significant in recommender systems.
Recent deep learning models with the ability to automatically extract user interest from behavior have achieved great success.
We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
- Few-shot Quality-Diversity Optimization [50.337225556491774]
Quality-Diversity (QD) optimization has been shown to be an effective tool for dealing with deceptive minima and sparse rewards in Reinforcement Learning.
We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population which, when used to initialize QD methods in unseen environments, allows for few-shot adaptation.
Experiments carried out in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations required for QD optimization in these environments.
arXiv Detail & Related papers (2021-09-14T17:12:20Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Feature Selection for Huge Data via Minipatch Learning [0.0]
We propose Stable Minipatch Selection (STAMPS) and Adaptive STAMPS.
STAMPS are meta-algorithms that build ensembles of selection events of base feature selectors trained on tiny, (possibly adaptive) random subsets of both the observations and features of the data.
Our approaches are general and can be employed with a variety of existing feature selection strategies and machine learning techniques.
arXiv Detail & Related papers (2020-10-16T17:41:08Z)
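The minipatch scheme above is simple to emulate. The hedged sketch below
repeatedly runs a base selector (lasso here; the meta-algorithm is
selector-agnostic) on tiny random subsets of rows and columns and ranks
features by how often they are selected; the authors' exact stability rules
and adaptive sampling are not reproduced.
```python
# Minipatch-style ensemble feature selection (sketch).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d = 500, 200
X = rng.normal(size=(n, d))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=n)  # 2 informative features

counts = np.zeros(d)       # selection events per feature
appears = np.zeros(d)      # times each feature appeared in a minipatch
for _ in range(300):
    rows = rng.choice(n, size=50, replace=False)      # tiny observation subset
    cols = rng.choice(d, size=20, replace=False)      # tiny feature subset
    coef = Lasso(alpha=0.1).fit(X[np.ix_(rows, cols)], y[rows]).coef_
    appears[cols] += 1
    counts[cols[np.abs(coef) > 1e-8]] += 1

freq = counts / np.maximum(appears, 1)                # selection frequency
print(np.argsort(freq)[::-1][:5])                     # top-ranked features
```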