Gene-SGAN: a method for discovering disease subtypes with imaging and
genetic signatures via multi-view weakly-supervised deep clustering
- URL: http://arxiv.org/abs/2301.10772v1
- Date: Wed, 25 Jan 2023 10:08:30 GMT
- Authors: Zhijian Yang, Junhao Wen, Ahmed Abdulkadir, Yuhan Cui, Guray Erus,
Elizabeth Mamourian, Randa Melhem, Dhivya Srinivasan, Sindhuja T.
Govindarajan, Jiong Chen, Mohamad Habes, Colin L. Masters, Paul Maruff,
Jurgen Fripp, Luigi Ferrucci, Marilyn S. Albert, Sterling C. Johnson, John C.
Morris, Pamela LaMontagne, Daniel S. Marcus, Tammie L. S. Benzinger, David A.
Wolk, Li Shen, Jingxuan Bao, Susan M. Resnick, Haochang Shou, Ilya M.
Nasrallah, Christos Davatzikos
- Abstract summary: Gene-SGAN is a multi-view, weakly-supervised deep clustering method.
It dissects disease heterogeneity by jointly considering phenotypic and genetic data.
Gene-SGAN is broadly applicable to disease subtyping and endophenotype discovery.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Disease heterogeneity has been a critical challenge for precision diagnosis
and treatment, especially in neurologic and neuropsychiatric diseases. Many
diseases can display multiple distinct brain phenotypes across individuals,
potentially reflecting disease subtypes that can be captured using MRI and
machine learning methods. However, biological interpretability and treatment
relevance are limited if the derived subtypes are not associated with genetic
drivers or susceptibility factors. Herein, we describe Gene-SGAN - a
multi-view, weakly-supervised deep clustering method - which dissects disease
heterogeneity by jointly considering phenotypic and genetic data, thereby
conferring genetic correlations to the disease subtypes and associated
endophenotypic signatures. We first validate the generalizability,
interpretability, and robustness of Gene-SGAN in semi-synthetic experiments. We
then demonstrate its application to real multi-site datasets from 28,858
individuals, deriving subtypes of Alzheimer's disease and brain endophenotypes
associated with hypertension, from MRI and SNP data. Derived brain phenotypes
displayed significant differences in neuroanatomical patterns, genetic
determinants, biological and clinical biomarkers, indicating potentially
distinct underlying neuropathologic processes, genetic drivers, and
susceptibility factors. Overall, Gene-SGAN is broadly applicable to disease
subtyping and endophenotype discovery, and is herein tested on disease-related,
genetically-driven neuroimaging phenotypes.
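The idea of clustering patients using imaging and genetic data jointly can be illustrated with a toy example. The sketch below is not the Gene-SGAN algorithm, which the abstract describes only as a multi-view, weakly-supervised deep clustering method. It swaps in plain k-means on concatenated, standardized views as a much simpler stand-in. The simulated data, the function names (`simulate`, `joint_cluster`), the effect sizes, and the allele frequencies are all assumptions made for illustration.

```python
import numpy as np


def zscore(x):
    """Standardize each column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)


def kmeans(x, k, n_iter=50, seed=0):
    """Basic Lloyd's k-means; returns one integer label per row."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def simulate(n=400, seed=0):
    """Hypothetical patients with two subtypes (assumed toy data).

    Each subtype has its own imaging pattern (lower values in a
    different half of 10 regional features) and its own allele
    frequency at 5 SNPs coded as 0/1/2 minor-allele counts.
    """
    rng = np.random.default_rng(seed)
    subtype = rng.integers(0, 2, size=n)
    imaging = rng.normal(size=(n, 10))
    imaging[subtype == 0, :5] -= 1.5
    imaging[subtype == 1, 5:] -= 1.5
    freq = np.where(subtype[:, None] == 0, 0.6, 0.2)
    snps = rng.binomial(2, np.broadcast_to(freq, (n, 5))).astype(float)
    return imaging, snps, subtype


def joint_cluster(imaging, snps, k=2, seed=0):
    """Cluster on both views at once by concatenating standardized features."""
    views = np.hstack([zscore(imaging), zscore(snps)])
    return kmeans(views, k, seed=seed)


if __name__ == "__main__":
    imaging, snps, subtype = simulate()
    labels = joint_cluster(imaging, snps, k=2)
    # Cluster labels are arbitrary, so score agreement up to label swapping.
    agreement = max(np.mean(labels == subtype), np.mean(labels != subtype))
    print(f"agreement with simulated subtypes: {agreement:.2f}")
```

Concatenating views treats every feature equally and uses no reference population, so it illustrates only the "multi-view" part of the description, not the weak supervision or the deep generative modeling.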
Related papers
- G2PDiffusion: Genotype-to-Phenotype Prediction with Diffusion Models [108.94237816552024]
This paper introduces G2PDiffusion, the first-of-its-kind diffusion model designed for genotype-to-phenotype generation across multiple species.
We use images to represent morphological phenotypes across species and redefine phenotype prediction as conditional image generation.
arXiv Detail & Related papers (2025-02-07T06:16:31Z)
- Survey and Improvement Strategies for Gene Prioritization with Large Language Models [61.24568051916653]
Large language models (LLMs) have performed well in medical exams, but their effectiveness in diagnosing rare genetic diseases has not been assessed.
We used multi-agent and Human Phenotype Ontology (HPO) classification to categorize patients based on phenotypes and solvability levels.
At baseline, GPT-4 outperformed other LLMs, achieving nearly 30% accuracy in ranking causal genes correctly.
arXiv Detail & Related papers (2025-01-30T23:03:03Z)
- Identifying latent disease factors differently expressed in patient subgroups using group factor analysis [54.67330718129736]
We propose a novel approach to uncover subgroup-specific and subgroup-common latent factors.
The proposed approach, sparse Group Factor Analysis (GFA) with regularised horseshoe priors, was implemented with probabilistic programming.
arXiv Detail & Related papers (2024-10-10T13:12:14Z)
- Interpreting artificial neural networks to detect genome-wide association signals for complex traits [0.0]
We trained artificial neural networks to predict complex traits using both simulated and real genotype-phenotype datasets.
We detected multiple loci associated with schizophrenia.
arXiv Detail & Related papers (2024-07-26T15:20:42Z)
- Dimensional Neuroimaging Endophenotypes: Neurobiological Representations of Disease Heterogeneity Through Machine Learning [11.653182438505558]
We first present a systematic literature overview of studies using machine learning and multimodal MRI to unravel disease heterogeneity in various neuropsychiatric and neurodegenerative disorders.
We then summarize relevant machine learning methodologies and discuss an emerging paradigm which we call dimensional neuroimaging endophenotype (DNE).
DNE dissects the neurobiological heterogeneity of neuropsychiatric and neurodegenerative disorders into a low dimensional yet informative, quantitative brain phenotypic representation.
arXiv Detail & Related papers (2024-01-17T16:31:48Z)
- GestaltMML: Enhancing Rare Genetic Disease Diagnosis through Multimodal Machine Learning Combining Facial Images and Clinical Texts [8.805728428427457]
We introduce a multimodal machine learning (MML) approach solely based on the Transformer architecture.
It integrates facial images, demographic information (age, sex, ethnicity), and clinical notes to improve prediction accuracy.
arXiv Detail & Related papers (2023-12-23T18:40:25Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles (UPE).
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Pathology Steered Stratification Network for Subtype Identification in Alzheimer's Disease [7.594681424335177]
Alzheimer's disease (AD) is a heterogeneous, multitemporal neurodegenerative disorder characterized by beta-amyloid, pathologic tau, and neurodegeneration.
We propose a novel pathology steered stratification network (PSSN) that incorporates established domain knowledge in AD pathology through a reaction-diffusion model.
arXiv Detail & Related papers (2022-10-12T02:52:00Z)
- rfPhen2Gen: A machine learning based association study of brain imaging phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging quantitative traits (QTs).
SNPs within the known Alzheimer's disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
- MAGIC: Multi-scale Heterogeneity Analysis and Clustering for Brain Diseases [3.955454029331185]
We introduce a novel method, MAGIC, to uncover disease heterogeneity by leveraging multi-scale clustering.
We validate MAGIC using simulated heterogeneous neuroanatomical data and demonstrate its clinical potential by exploring the heterogeneity of Alzheimer's disease (AD).
Our results indicate two main subtypes of AD with distinct atrophy patterns that consist of both fine-scale atrophy in the hippocampus as well as large-scale atrophy in cortical regions.
arXiv Detail & Related papers (2020-07-01T23:42:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.