Fast Genetic Algorithm for feature selection -- A qualitative approximation approach
- URL: http://arxiv.org/abs/2404.03996v1
- Date: Fri, 5 Apr 2024 10:15:24 GMT
- Title: Fast Genetic Algorithm for feature selection -- A qualitative approximation approach
- Authors: Mohammed Ghaith Altarabichi, Sławomir Nowaczyk, Sepideh Pashami, Peyman Sheikholharam Mashhadi
- Abstract summary: We propose a two-stage surrogate-assisted evolutionary approach to address the computational issues arising from using a Genetic Algorithm (GA) for feature selection.
We show that CHCQX converges faster to feature subset solutions of significantly higher accuracy, particularly for large datasets with over 100K instances.
- Score: 5.279268784803583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Evolutionary Algorithms (EAs) are often challenging to apply in real-world settings since evolutionary computations involve a large number of evaluations of a typically expensive fitness function. For example, an evaluation could involve training a new machine learning model. An approximation (also known as a meta-model or surrogate) of the true function can be used in such applications to alleviate the computation cost. In this paper, we propose a two-stage surrogate-assisted evolutionary approach to address the computational issues arising from using a Genetic Algorithm (GA) for feature selection in a wrapper setting for large datasets. We define 'Approximation Usefulness' to capture the necessary conditions to ensure correctness of the EA computations when an approximation is used. Based on this definition, we propose a procedure to construct a lightweight qualitative meta-model by the active selection of data instances. We then use the meta-model to carry out the feature selection task. We apply this procedure to the GA-based algorithm CHC (Cross generational elitist selection, Heterogeneous recombination and Cataclysmic mutation) to create a Qualitative approXimations variant, CHCQX. We show that CHCQX converges faster to feature subset solutions of significantly higher accuracy (as compared to CHC), particularly for large datasets with over 100K instances. We also demonstrate the applicability of the thinking behind our approach more broadly to Swarm Intelligence (SI), another branch of the Evolutionary Computation (EC) paradigm, with results for PSOQX, a qualitative approximation adaptation of the Particle Swarm Optimization (PSO) method. A GitHub repository with the complete implementation is available.
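The core recipe is straightforward to prototype: run the GA's wrapper evaluations against a small surrogate of the training data, and reserve full-data evaluation for the final candidates. The sketch below is a minimal illustration of that two-stage flow under stated assumptions, not the authors' CHCQX implementation (that is in their GitHub repository); it substitutes a plain random subsample for their active instance selection, and the dataset, classifier, and GA operators are placeholder choices.

```python
# Minimal two-stage surrogate-assisted GA for wrapper feature selection.
# NOT the paper's CHCQX code: the surrogate here is a random subsample,
# whereas the paper builds a qualitative meta-model by actively selecting instances.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=20_000, n_features=40, n_informative=8,
                           random_state=0)  # stand-in for a large dataset

def subset_accuracy(mask, Xs, ys):
    """Wrapper fitness: 3-fold CV accuracy of a cheap model on the selected features."""
    if not mask.any():
        return 0.0
    clf = DecisionTreeClassifier(max_depth=5, random_state=0)
    return cross_val_score(clf, Xs[:, mask], ys, cv=3).mean()

# Stage 1 surrogate data: a small subsample stands in for the full training set.
sub = rng.choice(len(X), size=1_000, replace=False)
X_sub, y_sub = X[sub], y[sub]

def evolve(n_gen=15, pop_size=30, p_mut=0.05):
    """Plain GA over binary feature masks, scored on the surrogate data only."""
    pop = rng.random((pop_size, X.shape[1])) < 0.5
    for _ in range(n_gen):
        scores = np.array([subset_accuracy(ind, X_sub, y_sub) for ind in pop])
        # Binary tournament selection.
        idx = [max(rng.choice(pop_size, 2), key=lambda i: scores[i])
               for _ in range(pop_size)]
        parents = pop[idx]
        # Uniform crossover with the neighbouring parent, then bit-flip mutation.
        mix = rng.random(parents.shape) < 0.5
        children = np.where(mix, parents, np.roll(parents, 1, axis=0))
        children ^= rng.random(children.shape) < p_mut
        pop = children
    return pop

# Stage 2: re-score the evolved subsets on the full data and keep the best one.
best = max(evolve(), key=lambda ind: subset_accuracy(ind, X, y))
print("selected feature indices:", np.flatnonzero(best))
```

The point of the paper's 'Approximation Usefulness' condition is, roughly, that the stage-1 surrogate only needs to rank candidate feature subsets consistently with the full data rather than reproduce their exact accuracies, which is what makes the cheap evaluations safe to use inside the GA.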
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Simulation-based optimization of a production system topology -- a neural network-assisted genetic algorithm [0.0]
A novel approach is presented for the topology optimization of production systems using a genetic algorithm (GA).
An extension to the GA is presented in which a neural network functions as a surrogate model for simulation.
Both approaches are effective at finding the optimal solution in industrial settings.
arXiv Detail & Related papers (2024-02-02T15:52:10Z)
- Benchmarking Differential Evolution on a Quantum Simulator [0.0]
Differential Evolution (DE) can be used to compute the minima of benchmark functions such as the Rastrigin and Rosenbrock functions (their standard definitions are reproduced after this list).
This work studies the results of applying DE to these functions, with candidate individuals generated on classical Turing-modeled computation.
arXiv Detail & Related papers (2023-11-06T14:27:00Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes algorithms for conditional stochastic optimization in the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z)
- Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
arXiv Detail & Related papers (2022-02-28T12:26:47Z)
- Surrogate-Assisted Genetic Algorithm for Wrapper Feature Selection [4.89253144446913]
We propose a novel multi-stage feature selection framework utilizing multiple levels of approximations, or surrogates.
Our experiments show that SAGA can arrive at near-optimal solutions three times faster than a wrapper GA, on average.
arXiv Detail & Related papers (2021-11-17T12:33:18Z)
- A Nature-Inspired Feature Selection Approach based on Hypercomplex Information [4.733222697135021]
We introduce a meta-heuristic optimization framework for hypercomplex-based feature selection.
The proposed hypercomplex feature selection is tested with several meta-heuristic algorithms and hypercomplex representations.
The good results achieved by the proposed approach make it a promising tool for feature selection research.
arXiv Detail & Related papers (2021-01-14T15:05:13Z)
- AdaLead: A simple and robust adaptive greedy search algorithm for sequence design [55.41644538483948]
We develop an easy-to-direct, scalable, and robust evolutionary greedy algorithm (AdaLead).
AdaLead is a remarkably strong benchmark that out-competes more complex state-of-the-art approaches in a variety of biologically motivated sequence design challenges.
arXiv Detail & Related papers (2020-10-05T16:40:38Z)
- GeneCAI: Genetic Evolution for Acquiring Compact AI [36.04715576228068]
Deep Neural Networks (DNNs) are evolving towards more complex architectures to achieve higher inference accuracy.
Model compression techniques can be leveraged to efficiently deploy such compute-intensive architectures on resource-limited mobile devices.
This paper introduces GeneCAI, a novel optimization method that automatically learns how to tune per-layer compression hyperparameters.
arXiv Detail & Related papers (2020-04-08T20:56:37Z)
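For reference, the two benchmark objectives named in the Differential Evolution entry above have standard n-dimensional forms, with global minima at the origin and at (1, ..., 1) respectively. They are reproduced here for convenience as textbook definitions, not taken from that paper's abstract:

```latex
% Standard test functions commonly used to benchmark Differential Evolution
f_{\mathrm{Rastrigin}}(\mathbf{x})  = 10\,n + \sum_{i=1}^{n}\left[x_i^{2} - 10\cos(2\pi x_i)\right]
f_{\mathrm{Rosenbrock}}(\mathbf{x}) = \sum_{i=1}^{n-1}\left[100\,(x_{i+1}-x_i^{2})^{2} + (1-x_i)^{2}\right]
```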
This list is automatically generated from the titles and abstracts of the papers on this site.