Related papers: Multivariate feature ranking of gene expression data

URL: http://arxiv.org/abs/2111.02357v1
Date: Wed, 3 Nov 2021 17:19:53 GMT
Title: Multivariate feature ranking of gene expression data
Authors: Fernando Jiménez, Gracia Sánchez, José Palma, Luis Miralles-Pechuán and Juan Botía
Abstract summary: We propose two new multivariate feature ranking methods based on pairwise correlation and pairwise consistency. We statistically prove that the proposed methods outperform the state-of-the-art feature ranking methods Clustering Variation, Chi Squared, Correlation, Information Gain, ReliefF and Significance.
Score: 62.997667081978825
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Gene expression datasets are usually high-dimensional and therefore require efficient and effective methods for identifying the relative importance of their attributes. Because the search space of possible solutions is huge, feature selection methods based on attribute subset evaluation tend not to be applicable, so feature ranking methods are used in these scenarios. Most of the feature ranking methods described in the literature are univariate, so they do not detect interactions between factors. In this paper we propose two new multivariate feature ranking methods based on pairwise correlation and pairwise consistency, which we have applied to three gene expression classification problems. We statistically prove that the proposed methods outperform the state-of-the-art feature ranking methods Clustering Variation, Chi Squared, Correlation, Information Gain, ReliefF and Significance, as well as attribute subset evaluation feature selection methods based on correlation and consistency with a multi-objective evolutionary search strategy.
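
Illustrative sketch (not taken from the paper): the abstract does not spell out how the pairwise correlation ranking is computed, so the Python snippet below only shows one plausible way a ranking can score features in pairs rather than one at a time. It assumes a CFS-style merit (Hall's correlation-based heuristic) evaluated on every pair of features and averaged per feature. The function name, the averaging step and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def pairwise_correlation_ranking(X, y):
    """Rank features by their mean CFS-style merit over all pairs they belong to.

    Assumption for illustration only: this is NOT the authors' algorithm.
    Constant columns would produce NaN correlations and are not handled here.
    """
    n_features = X.shape[1]
    # Absolute Pearson correlations among all features and the class column.
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    r_ff = corr[:n_features, :n_features]  # feature-feature correlations
    r_cf = corr[:n_features, n_features]   # feature-class correlations
    # CFS merit for a subset of size k=2: (r_ic + r_jc) / sqrt(2 + 2 * r_ij).
    merit = (r_cf[:, None] + r_cf[None, :]) / np.sqrt(2.0 + 2.0 * r_ff)
    np.fill_diagonal(merit, 0.0)  # a feature is not paired with itself
    scores = merit.sum(axis=1) / (n_features - 1)
    order = np.argsort(-scores)   # best feature first
    return order, scores


# Synthetic data: the class depends only on features 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

order, scores = pairwise_correlation_ranking(X, y)
print("ranking (best first):", order)
print("scores:", np.round(scores, 3))
```

Because the merit penalises pairs of mutually correlated features, a redundant feature scores lower here than it would under a purely univariate correlation ranking. The cost grows quadratically with the number of features, which matters for gene expression data with thousands of attributes.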

Related papers

Break the Tie: Learning Cluster-Customized Category Relationships for Categorical Data Clustering [51.11677202873771]
Categorical attributes with qualitative values are ubiquitous in cluster analysis of real datasets. Unlike the Euclidean distance of numerical attributes, the categorical attributes lack well-defined relationships of their possible values. This paper breaks the intrinsic relationship tie of attribute categories and learns customized distance metrics suitable for flexibly revealing various cluster distributions.
arXiv Detail & Related papers (2025-11-12T06:57:24Z)
Permutation-based multi-objective evolutionary feature selection for high-dimensional data [43.18726655647964]
We propose a novel feature selection method for high-dimensional data, based on the well-known permutation feature importance approach. The proposed method employs a multi-objective evolutionary algorithm to search for candidate feature subsets. The effectiveness of our method has been validated on a set of 24 publicly available high-dimensional datasets.
arXiv Detail & Related papers (2025-01-24T08:11:28Z)
Supervised Pattern Recognition Involving Skewed Feature Densities [49.48516314472825]
The classification potential of the Euclidean distance and a dissimilarity index based on the coincidence similarity index are compared. The accuracy of classifying the intersection point between the densities of two adjacent groups is taken into account.
arXiv Detail & Related papers (2024-09-02T12:45:18Z)
Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses. Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
Feature Selection via Robust Weighted Score for High Dimensional Binary Class-Imbalanced Gene Expression Data [1.2891210250935148]
A robust weighted score for unbalanced data (ROWSU) is proposed for selecting the most discriminative features for high-dimensional gene expression binary classification with a class-imbalance problem. The performance of the proposed ROWSU method is evaluated on 6 gene expression datasets.
arXiv Detail & Related papers (2024-01-23T11:22:03Z)
Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only. We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
On the utility of power spectral techniques with feature selection techniques for effective mental task classification in noninvasive BCI [19.19039983741124]
This paper proposes an approach to select relevant and non-redundant spectral features for mental task classification. The findings demonstrate substantial improvements in the performance of the learning model for mental task classification.
arXiv Detail & Related papers (2021-11-16T00:27:53Z)
An Evolutionary Correlation-aware Feature Selection Method for Classification Problems [3.2550305883611244]
In this paper, an estimation of distribution algorithm is proposed to meet three goals. Firstly, as an extension of EDA, the proposed method generates only two individuals in each iteration that compete based on a fitness function. Secondly, we provide a guiding technique for determining the number of features for individuals in each iteration. As the main contribution of the paper, in addition to considering the importance of each feature alone, the proposed method can consider the interaction between features.
arXiv Detail & Related papers (2021-10-16T20:20:43Z)
Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations. We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
A Novel Community Detection Based Genetic Algorithm for Feature Selection [3.8848561367220276]
The authors propose a genetic algorithm based on community detection, which functions in three steps. Nine benchmark classification problems were analyzed in terms of the performance of the presented approach.
arXiv Detail & Related papers (2020-08-08T15:39:30Z)
Deep Learning feature selection to unhide demographic recommender systems factors [63.732639864601914]
The matrix factorization model generates factors which do not incorporate semantic knowledge. DeepUnHide is able to extract demographic information from the user and item factors in collaborative filtering recommender systems.
arXiv Detail & Related papers (2020-06-17T17:36:48Z)
On-the-Fly Joint Feature Selection and Classification [16.84451472788859]
We propose a framework to perform joint feature selection and classification on-the-fly. We derive the optimum solution of the associated optimization problem and analyze its structure. We evaluate the performance of the proposed algorithms on several public datasets.
arXiv Detail & Related papers (2020-04-21T19:19:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.