Related papers: A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification

URL: http://arxiv.org/abs/2003.12386v1
Date: Thu, 26 Mar 2020 13:43:25 GMT
Title: A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification
Authors: Seyedeh Faezeh Farahbakhshian, Milad Taleby Ahvanooey
Abstract summary: We present a new technique for gene selection using a discernibility matrix of fuzzy-rough sets. The proposed technique takes into account the similarity of those instances that have the same and different class labels to improve the gene selection results. Experimental results demonstrate that this technique provides better efficiency compared to the state-of-the-art approaches.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In statistics and machine learning, feature selection is the process of picking a subset of relevant attributes for use in a predictive model. Recently, rough set-based feature selection techniques, which employ feature dependency to perform the selection process, have drawn attention. In bioinformatics applications, classification of tumors based on gene expression is used to determine proper treatment and prognosis of the disease. Microarray gene expression data is high-dimensional, includes many superfluous genes, and offers comparatively few training instances. Since exact supervised classification of gene expression instances in such high-dimensional problems is very complex, the selection of appropriate genes is a crucial task for tumor classification. In this study, we present a new technique for gene selection using a discernibility matrix of fuzzy-rough sets. The proposed technique takes into account the similarity of instances that have the same class label as well as those that have different class labels to improve the gene selection results, while previous state-of-the-art approaches only address the similarity of instances with different class labels. To meet that requirement, we extend the Johnson reducer technique to the fuzzy case. Experimental results demonstrate that this technique provides better efficiency compared to the state-of-the-art approaches.
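The abstract does not give the algorithm's details, so the following is only a rough sketch of what a Johnson-style greedy reducer over fuzzy discernibility degrees might look like. Everything specific here is an assumption made for illustration: the function name `fuzzy_johnson_select`, the range-normalized similarity measure, the max-based coverage rule, and the `same_class_weight` penalty for same-class pairs are hypothetical and are not taken from the paper.

```python
import numpy as np


def fuzzy_johnson_select(X, y, max_genes=None, same_class_weight=0.0):
    """Greedy, Johnson-style gene selection over fuzzy discernibility degrees.

    Illustrative sketch only: the similarity measure and the scoring rule
    are assumptions, not the paper's formulation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_genes = X.shape

    # Assumed fuzzy similarity per gene: 1 - |x_i - x_j| / range.
    # Discernibility is its complement, computed for every pair i < j.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant genes then discern nothing
    i_idx, j_idx = np.triu_indices(n_samples, k=1)
    discern = np.clip(np.abs(X[i_idx] - X[j_idx]) / span, 0.0, 1.0)

    different = y[i_idx] != y[j_idx]
    between = discern[different]  # pairs with different class labels
    within = discern[~different]  # pairs with the same class label

    # Hypothetical penalty: mean discernibility a gene shows inside a class.
    penalty = within.mean(axis=0) if within.size else np.zeros(n_genes)

    # Best coverage any gene can reach for each different-class pair.
    target = between.max(axis=1)
    covered = np.zeros(between.shape[0])
    selected = []
    limit = n_genes if max_genes is None else min(max_genes, n_genes)

    while len(selected) < limit and np.any(covered < target - 1e-12):
        # Gain: discernibility each gene adds beyond what is already covered.
        gain = np.maximum(between - covered[:, None], 0.0).sum(axis=0)
        score = gain - same_class_weight * penalty * between.shape[0]
        if selected:
            score[selected] = -np.inf  # never pick a gene twice
        best = int(np.argmax(score))
        if gain[best] <= 0:
            break  # the top-scoring gene adds no discernibility
        selected.append(best)
        covered = np.maximum(covered, between[:, best])
    return selected
```

With `same_class_weight=0.0` the sketch considers only pairs with different class labels, which is the behaviour the abstract attributes to earlier approaches; a positive weight is one possible way to also account for same-class pairs. The function returns column indices of `X` in the order they were chosen.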

Related papers

GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRN) is essential to understand and predict the effects of genetic perturbations. In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data. We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z)
BOLIMES: Boruta and LIME optiMized fEature Selection for Gene Expression Classification [0.0937465283958018]
BOLIMES is a novel feature selection algorithm designed to enhance gene expression classification. It combines exhaustive feature selection with interpretability-driven refinement, offering a powerful solution for high-dimensional gene expression analysis.
arXiv Detail & Related papers (2025-02-18T17:33:41Z)
An Evolutional Neural Network Framework for Classification of Microarray Data [0.0]
This research aims to apply a hybrid model of Genetic Algorithm and Neural Network to address the problem of selecting a subset of informative genes. Experimental results show that the proposed method achieves high accuracy with a minimal number of selected genes in comparison with other machine learning algorithms.
arXiv Detail & Related papers (2024-11-20T13:48:40Z)
Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer (BPGT) to improve genetic mutation prediction performance. BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules. BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
A Comparative Analysis of Gene Expression Profiling by Statistical and Machine Learning Approaches [1.8954222800767324]
We discuss the biological and the methodological limitations of machine learning models to classify cancer samples. Gene rankings are obtained from explainability methods adapted to these models. We observe that the information learned by black-box neural networks is related to the notion of differential expression.
arXiv Detail & Related papers (2024-02-01T18:17:36Z)
Feature Selection via Robust Weighted Score for High Dimensional Binary Class-Imbalanced Gene Expression Data [1.2891210250935148]
A robust weighted score for unbalanced data (ROWSU) is proposed for selecting the most discriminative features in high-dimensional gene expression binary classification with class imbalance. The performance of the proposed ROWSU method is evaluated on 6 gene expression datasets.
arXiv Detail & Related papers (2024-01-23T11:22:03Z)
Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells. During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation. This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
rfPhen2Gen: A machine learning based association study of brain imaging phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs. SNPs within the known Alzheimer disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest. Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
Multivariate feature ranking of gene expression data [62.997667081978825]
We propose two new multivariate feature ranking methods based on pairwise correlation and pairwise consistency. We statistically prove that the proposed methods outperform the state-of-the-art feature ranking methods Clustering Variation, Chi Squared, Correlation, Information Gain, ReliefF and Significance.
arXiv Detail & Related papers (2021-11-03T17:19:53Z)
Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers. Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm. Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
Unsupervised Feature Selection for Tumor Profiles using Autoencoders and Kernel Methods [1.9078991171384014]
This work aims to learn meaningful and low dimensional representations of tumor samples and find tumor subtype clusters. The proposed method named Latent Kernel Feature Selection (LKFS) is an unsupervised approach for gene selection in tumor gene expression profiles.
arXiv Detail & Related papers (2020-07-12T21:59:05Z)
Latent regularization for feature selection using kernel methods in tumor classification [1.9078991171384014]
Feature selection is a useful approach to select the key genes that help to classify tumors. We propose a feature selection method based on Multiple Kernel Learning that results in a reduced subset of genes and a custom kernel. An improvement of the generalization capacity is obtained and assessed by the tumor classification performance on new unseen test samples.
arXiv Detail & Related papers (2020-04-10T00:46:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.