Related papers: Gene-SGAN: a method for discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering

Gene-SGAN: a method for discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering

URL: http://arxiv.org/abs/2301.10772v1
Date: Wed, 25 Jan 2023 10:08:30 GMT
Title: Gene-SGAN: a method for discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering
Authors: Zhijian Yang, Junhao Wen, Ahmed Abdulkadir, Yuhan Cui, Guray Erus, Elizabeth Mamourian, Randa Melhem, Dhivya Srinivasan, Sindhuja T. Govindarajan, Jiong Chen, Mohamad Habes, Colin L. Masters, Paul Maruff, Jurgen Fripp, Luigi Ferrucci, Marilyn S. Albert, Sterling C. Johnson, John C. Morris, Pamela LaMontagne, Daniel S. Marcus, Tammie L. S. Benzinger, David A. Wolk, Li Shen, Jingxuan Bao, Susan M. Resnick, Haochang Shou, Ilya M. Nasrallah, Christos Davatzikos
Abstract summary: Gene-SGAN is a multi-view, weakly-supervised deep clustering method. It dissects disease heterogeneity by jointly considering phenotypic and genetic data. Gene-SGAN is broadly applicable to disease subtyping and endophenotype discovery.
Score: 6.79528256151419
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Disease heterogeneity has been a critical challenge for precision diagnosis and treatment, especially in neurologic and neuropsychiatric diseases. Many diseases can display multiple distinct brain phenotypes across individuals, potentially reflecting disease subtypes that can be captured using MRI and machine learning methods. However, biological interpretability and treatment relevance are limited if the derived subtypes are not associated with genetic drivers or susceptibility factors. Herein, we describe Gene-SGAN - a multi-view, weakly-supervised deep clustering method - which dissects disease heterogeneity by jointly considering phenotypic and genetic data, thereby conferring genetic correlations to the disease subtypes and associated endophenotypic signatures. We first validate the generalizability, interpretability, and robustness of Gene-SGAN in semi-synthetic experiments. We then demonstrate its application to real multi-site datasets from 28,858 individuals, deriving subtypes of Alzheimer's disease and brain endophenotypes associated with hypertension, from MRI and SNP data. Derived brain phenotypes displayed significant differences in neuroanatomical patterns, genetic determinants, biological and clinical biomarkers, indicating potentially distinct underlying neuropathologic processes, genetic drivers, and susceptibility factors. Overall, Gene-SGAN is broadly applicable to disease subtyping and endophenotype discovery, and is herein tested on disease-related, genetically-driven neuroimaging phenotypes.

Related papers

GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRN) is essential to understand and predict the effects of genetic perturbations.<n>In this work, we leverage pre-trained large language model and DNA sequence model to extract features from gene descriptions and DNA sequence data.<n>We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z)
G2PDiffusion: Cross-Species Genotype-to-Phenotype Prediction via Evolutionary Diffusion [108.94237816552024]
We propose the first genotype-to-phenotype diffusion model (G2PDiffusion) that generates morphological images from DNA. The model contains three novel components: 1) a MSA retrieval engine that identifies conserved and co-evolutionary patterns; 2) an environment-aware MSA conditional encoder that effectively models complex genotype-environment interactions; and 3) an adaptive phenomic alignment module to improve genotype-phenotype consistency.
arXiv Detail & Related papers (2025-02-07T06:16:31Z)
Survey and Improvement Strategies for Gene Prioritization with Large Language Models [61.24568051916653]
Large language models (LLMs) have performed well in medical exams, but their effectiveness in diagnosing rare genetic diseases has not been assessed. We used multi-agent and Human Phenotype Ontology (HPO) classification to categorized patients based on phenotypes and solvability levels. At baseline, GPT-4 outperformed other LLMs, achieving near 30% accuracy in ranking causal genes correctly.
arXiv Detail & Related papers (2025-01-30T23:03:03Z)
Identifying latent disease factors differently expressed in patient subgroups using group factor analysis [54.67330718129736]
We propose a novel approach to uncover subgroup-specific and subgroup-common latent factors. The proposed approach, sparse Group Factor Analysis (GFA) with regularised horseshoe priors, was implemented with probabilistic programming.
arXiv Detail & Related papers (2024-10-10T13:12:14Z)
Interpreting artificial neural networks to detect genome-wide association signals for complex traits [0.0]
Investigating the genetic architecture of complex diseases is challenging due to the highly polygenic and interactive landscape of genetic and environmental factors. We trained artificial neural networks for predicting complex traits using both simulated and real genotype/phenotype datasets.
arXiv Detail & Related papers (2024-07-26T15:20:42Z)
Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances. BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules. BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
Dimensional Neuroimaging Endophenotypes: Neurobiological Representations of Disease Heterogeneity Through Machine Learning [11.653182438505558]
We first present a systematic literature overview of studies using machine learning and multimodal MRI to unravel disease heterogeneity in various neuropsychiatric and neurodegenerative disorders. We then summarize relevant machine learning methodologies and discuss an emerging paradigm which we call dimensional neuroimaging endophenotype (DNE) DNE dissects the neurobiological heterogeneity of neuropsychiatric and neurodegenerative disorders into a low dimensional yet informative, quantitative brain phenotypic representation.
arXiv Detail & Related papers (2024-01-17T16:31:48Z)
GestaltMML: Enhancing Rare Genetic Disease Diagnosis through Multimodal Machine Learning Combining Facial Images and Clinical Texts [8.805728428427457]
We introduce a multimodal machine learning (MML) approach solely based on the Transformer architecture. It integrates facial images, demographic information (age, sex, ethnicity), and clinical notes to improve prediction accuracy.
arXiv Detail & Related papers (2023-12-23T18:40:25Z)
Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases. Gene expression can play a fundamental role in the early detection of cancer. This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Un Phenotype Ensembles. It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner. These phenotypes are then analyzed via (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
Few-Shot Meta Learning for Recognizing Facial Phenotypes of Genetic Disorders [55.41644538483948]
Automated classification and similarity retrieval aid physicians in decision-making to diagnose possible genetic conditions as early as possible. Previous work has addressed the problem as a classification problem and used deep learning methods. In this study, we used a facial recognition model trained on a large corpus of healthy individuals as a pre-task and transferred it to facial phenotype recognition.
arXiv Detail & Related papers (2022-10-23T11:52:57Z)
Pathology Steered Stratification Network for Subtype Identification in Alzheimer's Disease [7.594681424335177]
Alzheimers disease (AD) is a heterogeneous, multitemporal neurodegenerative disorder characterized by beta-amyloid, pathologic tau, and neurodegeneration. We propose a novel pathology steered stratification network (PSSN) that incorporates established domain knowledge in AD pathology through a reaction-diffusion model.
arXiv Detail & Related papers (2022-10-12T02:52:00Z)
rfPhen2Gen: A machine learning based association study of brain imaging phenotypes to genotypes [71.1144397510333]
We learned machine learning models to predict SNPs using 56 brain imaging QTs. SNPs within the known Alzheimer disease (AD) risk gene APOE had lowest RMSE for lasso and random forest. Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
MAGIC: Multi-scale Heterogeneity Analysis and Clustering for Brain Diseases [3.955454029331185]
We introduce a novel method, MAGIC, to uncover disease heterogeneity by leveraging multi-scale clustering. We validate MAGIC using simulated heterogeneous neuroanatomical data and demonstrate its clinical potential by exploring the heterogeneity of Alzheimers Disease (AD) Our results indicate two main subtypes of AD with distinct atrophy patterns that consist of both fine-scale atrophy in the hippocampus as well as large-scale atrophy in cortical regions.
arXiv Detail & Related papers (2020-07-01T23:42:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.