Compact NSGA-II for Multi-objective Feature Selection
- URL: http://arxiv.org/abs/2402.12625v1
- Date: Tue, 20 Feb 2024 01:10:12 GMT
- Title: Compact NSGA-II for Multi-objective Feature Selection
- Authors: Sevil Zanjani Miyandoab, Shahryar Rahnamayan, Azam Asilian Bidgoli
- Abstract summary: We define feature selection as a multi-objective binary optimization task with the objectives of maximizing classification accuracy and minimizing the number of selected features.
In order to select optimal features, we have proposed a binary Compact NSGA-II (CNSGA-II) algorithm.
To the best of our knowledge, this is the first compact multi-objective algorithm proposed for feature selection.
- Score: 0.24578723416255746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature selection is an expensive, challenging task in machine
learning and data mining, aimed at removing irrelevant and redundant features.
It improves classification accuracy and reduces the computational budget and
memory requirements of classification, or of any other post-processing task
conducted after feature selection. In this regard, we define feature selection
as a multi-objective binary optimization task with the objectives of maximizing
classification accuracy and minimizing the number of selected features. In
order to select optimal features, we have proposed a binary Compact NSGA-II
(CNSGA-II) algorithm. Compactness represents the population as a probability
distribution, making evolutionary algorithms not only more memory-efficient
but also less costly in terms of fitness evaluations. Instead
of holding two populations during the optimization process, our proposed method
uses several Probability Vectors (PVs) to generate new individuals. Each PV
efficiently explores a region of the search space to find non-dominated
solutions instead of generating candidate solutions from a small population as
is the common approach in most evolutionary algorithms. To the best of our
knowledge, this is the first compact multi-objective algorithm proposed for
feature selection. The reported results for expensive optimization cases with
a limited evaluation budget on five datasets show that CNSGA-II performs more
efficiently than the well-known NSGA-II in terms of the hypervolume (HV)
performance metric while requiring less memory. The proposed method and
experimental
results are explained and analyzed in detail.
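To make the compact mechanism concrete, the following is a minimal Python sketch of a single probability-vector (PV) update for binary feature selection, in the spirit of the classic compact GA. The bi-objective fitness, the `classifier_error` helper, the k-NN wrapper, and the lexicographic tie-break are illustrative assumptions, not the paper's exact design; the actual CNSGA-II maintains several PVs and NSGA-II-style non-dominated sorting, which this sketch omits.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def classifier_error(X_sel, y):
    # Illustrative wrapper objective: mean (1 - accuracy) under 3-fold CV.
    return 1.0 - cross_val_score(KNeighborsClassifier(), X_sel, y, cv=3).mean()

def fitness(mask, X, y):
    # Bi-objective fitness, both terms minimized:
    # (classification error, number of selected features).
    if mask.sum() == 0:
        return (1.0, 0)  # an empty subset cannot classify; worst error
    return (classifier_error(X[:, mask.astype(bool)], y), int(mask.sum()))

def dominates(a, b):
    # Pareto dominance for minimization: no worse everywhere, better somewhere.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def compact_step(pv, X, y, lr=0.05):
    # One compact iteration on a single PV: sample two candidate feature masks
    # from the PV, keep the Pareto winner, and shift the PV toward its bits,
    # so the PV stands in for an explicit population.
    x1 = (rng.random(pv.size) < pv).astype(int)
    x2 = (rng.random(pv.size) < pv).astype(int)
    f1, f2 = fitness(x1, X, y), fitness(x2, X, y)
    # Lexicographic tie-break when neither sample dominates (an assumption;
    # the paper resolves this with NSGA-II-style ranking instead).
    if dominates(f2, f1) or (not dominates(f1, f2) and f2 < f1):
        winner, loser, f_w = x2, x1, f2
    else:
        winner, loser, f_w = x1, x2, f1
    diff = winner != loser
    pv[diff] += lr * (winner[diff] - pv[diff])
    np.clip(pv, 1.0 / pv.size, 1.0 - 1.0 / pv.size, out=pv)  # keep PV off 0/1
    return pv, winner, f_w

# Usage sketch (hypothetical dataset loader):
# X, y = load_your_dataset()
# pv = np.full(X.shape[1], 0.5)          # start at maximum uncertainty
# for _ in range(200):                   # budget of 400 fitness evaluations
#     pv, best_mask, best_fit = compact_step(pv, X, y)
```

A full optimizer would run several such PVs side by side, archive every non-dominated (error, size) pair they discover, and report the archive as the approximated Pareto front.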
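The hypervolume (HV) metric used for the comparison also admits a short illustration. For two minimization objectives, HV is the area dominated by the front and bounded by a reference point, computed with a simple sweep; the reference point and the front values below are made up for illustration.

```python
def hypervolume_2d(front, ref):
    # Hypervolume of a 2-D Pareto front under minimization: the area of the
    # region dominated by the (non-dominated) front and bounded by the
    # reference point `ref`, which every front member must dominate.
    # Sweep in ascending order of the first objective, summing rectangles.
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(front):
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

# Made-up front of (error rate, number of features) pairs; objectives are
# usually normalized before computing HV in practice.
front = [(0.05, 12), (0.08, 5), (0.12, 3)]
print(hypervolume_2d(front, ref=(1.0, 20.0)))  # larger HV = better front
```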
Related papers
- Large-scale Multi-objective Feature Selection: A Multi-phase Search Space Shrinking Approach [0.27624021966289597]
Feature selection is a crucial step in machine learning, especially for high-dimensional datasets.
This paper proposes a novel large-scale multi-objective evolutionary algorithm based on the search space shrinking, termed LMSSS.
The effectiveness of the proposed algorithm is demonstrated through comprehensive experiments on 15 large-scale datasets.
arXiv Detail & Related papers (2024-10-13T23:06:10Z) - Multi-objective Binary Coordinate Search for Feature Selection [0.24578723416255746]
We propose the binary multi-objective coordinate search (MOCS) algorithm to solve large-scale feature selection problems.
Results indicate the significant superiority of our method over NSGA-II on five real-world large-scale datasets.
arXiv Detail & Related papers (2024-02-20T00:50:26Z) - KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection [48.66703222700795]
We resort to a novel kernel strategy to identify the most informative point clouds to acquire labels.
To accommodate both one-stage (i.e., SECOND) and two-stage detectors, we incorporate the classification entropy tangent to trade off detection performance against the total number of bounding boxes selected for annotation.
Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art method.
arXiv Detail & Related papers (2023-07-16T04:27:03Z) - SFE: A Simple, Fast and Efficient Feature Selection Algorithm for
High-Dimensional Data [8.190527783858096]
The SFE algorithm performs its search process using a search agent and two operators: non-selection and selection.
The efficiency and effectiveness of the SFE and the SFE-PSO for feature selection are compared on 40 high-dimensional datasets.
arXiv Detail & Related papers (2023-03-17T12:28:17Z) - Feature selection algorithm based on incremental mutual information and
cockroach swarm optimization [12.297966427336124]
We propose an improved swarm intelligence optimization method based on incremental mutual information (IMIICSO).
The method extracts decision-table reduction knowledge to guide the global search of the swarm algorithm.
The accuracy of the feature subsets selected by the improved cockroach swarm algorithm is better than, or nearly equal to, that of the original swarm intelligence optimization algorithm.
arXiv Detail & Related papers (2023-02-21T08:51:05Z) - Massively Parallel Genetic Optimization through Asynchronous Propagation
of Populations [50.591267188664666]
Propulate is an evolutionary optimization algorithm and software package for global optimization.
We provide an MPI-based implementation of our algorithm, which features variants of selection, mutation, crossover, and migration.
We find that Propulate is up to three orders of magnitude faster without sacrificing solution accuracy.
arXiv Detail & Related papers (2023-01-20T18:17:34Z) - Tree ensemble kernels for Bayesian optimization with known constraints
over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
The proposed algorithm is shown to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions while achieving comparable error bounds in both theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - BOP-Elites, a Bayesian Optimisation algorithm for Quality-Diversity
search [0.0]
We propose the Bayesian optimisation of Elites (BOP-Elites) algorithm.
By considering user-defined regions of the feature space as 'niches', the task is to find the optimal solution in each niche.
The resulting algorithm is highly effective at identifying which parts of the search space belong to each niche and at finding the optimal solution within it.
arXiv Detail & Related papers (2020-05-08T23:49:13Z) - Stepwise Model Selection for Sequence Prediction via Deep Kernel
Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.