A Study of Feature Selection and Extraction Algorithms for Cancer
Subtype Prediction
- URL: http://arxiv.org/abs/2109.14648v1
- Date: Wed, 29 Sep 2021 18:11:24 GMT
- Title: A Study of Feature Selection and Extraction Algorithms for Cancer
Subtype Prediction
- Authors: Vaibhav Sinha, Siladitya Dash, Nazma Naskar, and Sk Md Mosaddek
Hossain
- Abstract summary: We show that the existing feature selection methods are computationally expensive when applied individually.
We apply these algorithms sequentially which helps in lowering the computational cost and improving the predictive performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work, we study and analyze different feature selection algorithms
that can be used to classify cancer subtypes in case of highly varying
high-dimensional data. We apply three different feature selection methods on
five different types of cancers having two separate omics each. We show that
the existing feature selection methods are computationally expensive when
applied individually. Instead, we apply these algorithms sequentially which
helps in lowering the computational cost and improving the predictive
performance. We further show that reducing the number of features using some
dimension reduction techniques can improve the performance of machine learning
models in some cases. We support our findings through comprehensive data
analysis and visualization.
Related papers
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z) - Different Algorithms (Might) Uncover Different Patterns: A Brain-Age
Prediction Case Study [8.597209503064128]
We investigate whether established hypotheses in brain-age prediction from EEG research validate across algorithms.
Few of our models achieved state-of-the-art performance on the specific data-set we utilized.
arXiv Detail & Related papers (2024-02-08T19:55:07Z) - A Contrast Based Feature Selection Algorithm for High-dimensional Data
set in Machine Learning [9.596923373834093]
We propose a novel filter feature selection method, ContrastFS, which selects discriminative features based on the discrepancies features shown between different classes.
We validate effectiveness and efficiency of our approach on several widely studied benchmark datasets, results show that the new method performs favorably with negligible computation.
arXiv Detail & Related papers (2024-01-15T05:32:35Z) - Dual-stage optimizer for systematic overestimation adjustment applied to
multi-objective genetic algorithms for biomarker selection [0.18648070031379424]
Biomarker identification with feature selection methods can be addressed as a multi-objective problem with trade-offs between predictive ability and parsimony in the number of features.
We propose DOSA-MO, a novel multi-objective optimization wrapper algorithm that learns how the original estimation, its variance, and the feature set size of the solutions predict the overestimation.
arXiv Detail & Related papers (2023-12-27T16:13:14Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Expressivity of Parameterized and Data-driven Representations in Quality
Diversity Search [111.06379262544911]
We compare the output diversity of a quality diversity evolutionary search performed in two different search spaces.
A learned model is better at interpolating between known data points than at extrapolating or expanding towards unseen examples.
arXiv Detail & Related papers (2021-05-10T10:27:43Z) - Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z) - Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimizes the variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modernally precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z) - Analysis of ensemble feature selection for correlated high-dimensional
RNA-Seq cancer data [0.24366811507669126]
This study compares two approaches for the discovery of relevant variables.
The most informative features are identified using a four feature selection algorithms.
Unfortunately, models built on feature sets obtained from the ensemble of feature selection algorithms were no better than for models developed on feature sets obtained from individual algorithms.
arXiv Detail & Related papers (2020-04-28T20:38:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.