Tackling the dimensions in imaging genetics with CLUB-PLS
- URL: http://arxiv.org/abs/2309.07352v2
- Date: Wed, 20 Sep 2023 01:45:56 GMT
- Title: Tackling the dimensions in imaging genetics with CLUB-PLS
- Authors: Andre Altmann, Ana C Lawry Aguila, Neda Jahanshad, Paul M Thompson,
Marco Lorenzi
- Abstract summary: Cluster-Bootstrap PLS (CLUB-PLS) uses cluster bootstrap to provide robust statistics for single input features in both domains.
CLUB-PLS was applied to investigate the genetic basis of surface area and cortical thickness in a sample of 33,000 subjects from the UK Biobank.
- Score: 4.829285448503734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A major challenge in imaging genetics and similar fields is to link
high-dimensional data in one domain, e.g., genetic data, to high-dimensional
data in a second domain, e.g., brain imaging data. The standard approach in the
area is mass univariate analysis across genetic factors and imaging
phenotypes. That entails executing one genome-wide association study (GWAS) for
each pre-defined imaging measure. Although this approach has been tremendously
successful, one shortcoming is that phenotypes must be pre-defined.
Consequently, effects that are not confined to pre-selected regions of interest
or that reflect larger brain-wide patterns can easily be missed. In this work
we introduce a Partial Least Squares (PLS)-based framework, which we term
Cluster-Bootstrap PLS (CLUB-PLS), that can work with large input dimensions in
both domains as well as with large sample sizes. One key factor of the
framework is to use cluster bootstrap to provide robust statistics for single
input features in both domains. We applied CLUB-PLS to investigate the
genetic basis of surface area and cortical thickness in a sample of 33,000
subjects from the UK Biobank. We found 107 genome-wide significant
locus-phenotype pairs that are linked to 386 different genes. The vast
majority of these loci could be technically validated: using classic GWAS or
Genome-Wide Inferred Statistics (GWIS), we found that 85 locus-phenotype
pairs exceeded the genome-wide suggestive threshold (P<1e-05).
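The idea of pairing PLS with bootstrap resampling to obtain per-feature statistics can be illustrated with a minimal sketch. This is not the authors' CLUB-PLS implementation: it uses a plain subject-level bootstrap and a simple mean/standard-deviation ratio per weight, and it does not reproduce the cluster step of the method. The function names (`first_pls_weights`, `bootstrap_pls`), the toy data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def first_pls_weights(X, Y):
    """Return the first pair of PLS weight vectors (u for X, v for Y)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The leading singular vectors of the cross-covariance matrix
    # maximise the covariance between Xc @ u and Yc @ v.
    U, _, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, 0], Vt[0, :]


def bootstrap_pls(X, Y, n_boot=200, seed=0):
    """Bootstrap subjects and return a stability ratio per feature weight."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    u_ref, _ = first_pls_weights(X, Y)
    us = np.empty((n_boot, X.shape[1]))
    vs = np.empty((n_boot, Y.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        u, v = first_pls_weights(X[idx], Y[idx])
        # The sign of a singular vector pair is arbitrary, so align each
        # bootstrap solution with the full-sample solution.
        if u @ u_ref < 0:
            u, v = -u, -v
        us[b], vs[b] = u, v
    # Bootstrap mean divided by bootstrap standard deviation, per feature.
    z_u = us.mean(axis=0) / us.std(axis=0, ddof=1)
    z_v = vs.mean(axis=0) / vs.std(axis=0, ddof=1)
    return z_u, z_v


# Toy data (hypothetical): 1000 subjects, 20 "genetic" and 10 "imaging"
# features, with a single planted link between feature 0 in each domain.
rng = np.random.default_rng(42)
X = rng.standard_normal((1000, 20))
Y = rng.standard_normal((1000, 10))
Y[:, 0] += X[:, 0]

z_u, z_v = bootstrap_pls(X, Y)
print("most stable genetic feature:", int(np.argmax(np.abs(z_u))))
print("most stable imaging feature:", int(np.argmax(np.abs(z_v))))
```

In this toy setting the planted feature pair should stand out with the largest absolute ratio in both domains, while unrelated features fluctuate around zero across bootstrap samples.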
Related papers
- Genetic InfoMax: Exploring Mutual Information Maximization in
High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z)
- Deep Visual-Genetic Biometrics for Taxonomic Classification of Rare
Species [1.9819034119774483]
We propose aligned visual-genetic inference spaces with the aim to implicitly encode cross-domain associations for improved performance.
We experimentally demonstrate the efficacy of the concept via application to microscopic imagery of 30k+ planktic foraminifer shells.
Visual-genetic alignment can significantly benefit visual-only recognition of the rarest species.
arXiv Detail & Related papers (2023-05-11T10:04:27Z)
- Gene Teams are on the Field: Evaluation of Variants in Gene-Networks
Using High Dimensional Modelling [0.0]
In medical genetics, each genetic variant is evaluated as an independent entity regarding its clinical importance.
In most complex diseases, variant combinations in specific gene networks, rather than the presence of a particular single variant, predominate.
We propose a high dimensional modelling based method to analyse all the variants in a gene network together.
arXiv Detail & Related papers (2023-01-27T15:02:23Z)
- Unsupervised ensemble-based phenotyping helps enhance the
discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via GWAS, retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Stacking Ensemble Learning in Deep Domain Adaptation for Ophthalmic
Image Classification [61.656149405657246]
Domain adaptation is effective in image classification tasks where obtaining sufficient label data is challenging.
We propose a novel method, named SELDA, for stacking ensemble learning via extending three domain adaptation methods.
Experimental results on the Age-Related Eye Disease Study (AREDS) benchmark ophthalmic dataset demonstrate the effectiveness of the proposed model.
arXiv Detail & Related papers (2022-09-27T14:19:00Z)
- rfPhen2Gen: A machine learning based association study of brain imaging
phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs.
SNPs within the known Alzheimer disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
- Multi-modal Self-supervised Pre-training for Regulatory Genome Across
Cell Types [75.65676405302105]
We propose a simple yet effective approach for pre-training genome data in a multi-modal and self-supervised manner, which we call GeneBERT.
We pre-train our model on the ATAC-seq dataset with 17 million genome sequences.
arXiv Detail & Related papers (2021-10-11T12:48:44Z)
- A deep learning classifier for local ancestry inference [63.8376359764052]
Local ancestry inference identifies the ancestry of each segment of an individual's genome.
We develop a new LAI tool using a deep convolutional neural network with an encoder-decoder architecture.
We show that our model is able to learn admixture as a zero-shot task, yielding ancestry assignments that are nearly as accurate as those from the existing gold standard tool, RFMix.
arXiv Detail & Related papers (2020-11-04T00:42:01Z)
- Mycorrhiza: Genotype Assignment using Phylogenetic Networks [2.286041284499166]
We introduce Mycorrhiza, a machine learning approach for the genotype assignment problem.
Our algorithm makes use of phylogenetic networks to engineer features that encode the evolutionary relationships among samples.
Mycorrhiza yields particularly significant gains on datasets with a large average fixation index (FST) or deviation from the Hardy-Weinberg equilibrium.
arXiv Detail & Related papers (2020-10-14T02:36:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.