Feature reduction for machine learning on molecular features: The GeneScore
- URL: http://arxiv.org/abs/2101.05546v1
- Date: Thu, 14 Jan 2021 10:58:39 GMT
- Title: Feature reduction for machine learning on molecular features: The GeneScore
- Authors: Alexander Denker, Anastasia Steshina, Theresa Grooss, Frank Ueckert, Sylvia Nürnberg
- Abstract summary: The GeneScore is a concept of feature reduction for Machine Learning analysis of biomedical data.
We show that the GeneScore is superior to a binary matrix in the classification of cancer entities.
- Score: 58.720142291102135
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present the GeneScore, a concept of feature reduction for Machine Learning
analysis of biomedical data. Using expert knowledge, the GeneScore integrates
different molecular data types into a single score. We show that the GeneScore
is superior to a binary matrix in the classification of cancer entities from
SNV, Indel, CNV, gene fusion and gene expression data. The GeneScore is a
straightforward way to facilitate state-of-the-art analysis, while making use
of the available scientific knowledge on the nature of molecular data features
used.
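The contrast the abstract draws between a binary matrix and a single per-gene score can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the gene names, the sample, the helper functions, and above all the weights, which are not the expert-derived scoring rules of the paper (the abstract does not state them).

```python
# Molecular data types named in the abstract.
ALTERATION_TYPES = ["snv", "indel", "cnv", "fusion", "expression"]

# ASSUMPTION: illustrative weights only, NOT the paper's expert-knowledge rules.
ASSUMED_WEIGHTS = {"snv": 1.0, "indel": 1.0, "cnv": 0.5, "fusion": 2.0, "expression": 0.5}


def binary_features(sample, genes):
    """Binary matrix row: one 0/1 column per (gene, alteration type) pair."""
    return [
        1 if alt in sample.get(gene, set()) else 0
        for gene in genes
        for alt in ALTERATION_TYPES
    ]


def score_features(sample, genes, weights=ASSUMED_WEIGHTS):
    """Score row: one number per gene, the summed weights of its observed alterations."""
    return [sum(weights[alt] for alt in sample.get(gene, set())) for gene in genes]


# Hypothetical sample: maps each altered gene to the alteration types observed in it.
genes = ["TP53", "KRAS", "ALK"]
sample = {"TP53": {"snv", "cnv"}, "ALK": {"fusion"}}

binary = binary_features(sample, genes)  # 3 genes x 5 types = 15 columns
scores = score_features(sample, genes)   # 3 columns, one per gene
print(len(binary), len(scores))
```

In this sketch the feature count drops from one column per gene and data type to one column per gene, which is the kind of reduction the abstract describes; how the actual GeneScore weighs each data type is defined in the paper, not here.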
Related papers
- Precision Cancer Classification and Biomarker Identification from mRNA Gene Expression via Dimensionality Reduction and Explainable AI [0.9423257767158634]
This research presents a comprehensive pipeline designed to accurately identify 33 distinct cancer types and their corresponding gene sets.
It incorporates a combination of normalization and feature selection techniques to reduce dataset dimensionality effectively.
We leverage Explainable AI to elucidate the biological significance of the identified cancer-specific genes.
arXiv Detail & Related papers (2024-10-08T18:56:31Z)
- Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
- GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases [5.831842925038342]
We present GeneAgent, a first-of-its-kind language agent featuring self-verification capability.
It autonomously interacts with various biological databases to improve accuracy and reduce hallucination occurrences.
Benchmarking on 1,106 gene sets from different sources, GeneAgent consistently outperforms standard GPT-4 by a significant margin.
arXiv Detail & Related papers (2024-05-25T12:35:15Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Fuzzy Gene Selection and Cancer Classification Based on Deep Learning Model [1.3072222152900117]
We developed a new fuzzy gene selection technique (FGS) to identify informative genes to facilitate cancer classification.
With our FGS-enhanced method, the cancer classification model achieved 96.5%, 96.2%, 96%, and 95.9% for accuracy, precision, recall, and F1-score, respectively.
Across the six datasets used, the proposed model demonstrates its capacity to classify cancer effectively.
arXiv Detail & Related papers (2023-05-04T21:52:57Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z)
- SimpleChrome: Encoding of Combinatorial Effects for Predicting Gene Expression [8.326669256957352]
We present SimpleChrome, a deep learning model that learns the histone modification representations of genes.
The features learned from the model allow us to better understand the latent effects of cross-gene interactions and direct gene regulation on the target gene expression.
arXiv Detail & Related papers (2020-12-15T23:30:36Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)