A deep learning classifier for local ancestry inference
- URL: http://arxiv.org/abs/2011.02081v1
- Date: Wed, 4 Nov 2020 00:42:01 GMT
- Title: A deep learning classifier for local ancestry inference
- Authors: Matthew Aguirre, Jan Sokol, Guhan Venkataraman, Alexander Ioannidis
- Abstract summary: Local ancestry inference identifies the ancestry of each segment of an individual's genome.
We develop a new LAI tool using a deep convolutional neural network with an encoder-decoder architecture.
We show that our model is able to learn admixture as a zero-shot task, yielding ancestry assignments that are nearly as accurate as those from the existing gold standard tool, RFMix.
- Score: 63.8376359764052
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local ancestry inference (LAI) identifies the ancestry of each segment of an
individual's genome and is an important step in medical and population genetic
studies of diverse cohorts. Several techniques have been used for LAI,
including Hidden Markov Models and Random Forests. Here, we formulate the LAI
task as an image segmentation problem and develop a new LAI tool using a deep
convolutional neural network with an encoder-decoder architecture. We train our
model using complete genome sequences from 982 unadmixed individuals from each
of five continental ancestry groups, and we evaluate it using simulated admixed
data derived from an additional 279 individuals selected from the same
populations. We show that our model is able to learn admixture as a zero-shot
task, yielding ancestry assignments that are nearly as accurate as those from
the existing gold standard tool, RFMix.
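The evaluation setup described in the abstract (simulated admixed genomes built from unadmixed individuals, scored against known per-segment ancestry) can be illustrated with a small sketch. This is not the authors' simulation procedure: the function names, the uniform random breakpoints, and the toy 0/1 allele encoding are all assumptions made for illustration.

```python
import random


def simulate_admixed_haplotype(sources, n_breakpoints, rng):
    """Build a mosaic haplotype from unadmixed source haplotypes.

    sources: dict mapping an ancestry label to a list of haplotypes,
             each haplotype being an equal-length list of 0/1 alleles.
    Returns (alleles, labels), where labels[i] is the true ancestry at site i.
    Hypothetical helper; breakpoints are drawn uniformly, which is a
    simplification and not a recombination model.
    """
    ancestries = sorted(sources)
    n_sites = len(sources[ancestries[0]][0])
    # Pick distinct interior cut points, then form consecutive segments.
    cuts = sorted(rng.sample(range(1, n_sites), n_breakpoints))
    bounds = [0] + cuts + [n_sites]
    alleles, labels = [], []
    for start, end in zip(bounds, bounds[1:]):
        ancestry = rng.choice(ancestries)
        donor = rng.choice(sources[ancestry])
        alleles.extend(donor[start:end])
        labels.extend([ancestry] * (end - start))
    return alleles, labels


def per_site_accuracy(predicted, truth):
    """Fraction of sites whose predicted ancestry matches the true label."""
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)
```

A local ancestry classifier would take `alleles` as input and emit one label per site; `per_site_accuracy` then compares that output with the simulated ground truth.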
Related papers
- Gene Teams are on the Field: Evaluation of Variants in Gene-Networks
Using High Dimensional Modelling [0.0]
In medical genetics, each genetic variant is evaluated as an independent entity regarding its clinical importance.
In most complex diseases, variant combinations in specific gene networks, rather than the presence of a particular single variant, predominate.
We propose a high dimensional modelling based method to analyse all the variants in a gene network together.
arXiv Detail & Related papers (2023-01-27T15:02:23Z)
- Unsupervised Cross-Domain Feature Extraction for Single Blood Cell Image
Classification [37.90158669637884]
The autoencoder is based on an R-CNN architecture, allowing it to focus on the relevant white blood cell and eliminate artifacts in the image.
We show that thanks to the rich features extracted by the autoencoder trained on only one of the datasets, the random forest classifier performs satisfactorily on the unseen datasets.
Our results suggest the possibility of employing this unsupervised approach in more complicated diagnosis and prognosis tasks without the need to add expensive expert labels to unseen data.
arXiv Detail & Related papers (2022-07-01T15:44:42Z)
- Learning to Untangle Genome Assembly with Graph Convolutional Networks [17.227634756670835]
We introduce a new learning framework to train a graph convolutional network to resolve assembly graphs by finding a correct path through them.
Experimental results show that a model trained on simulated graphs generated solely from a single chromosome is able to resolve all other chromosomes remarkably well.
arXiv Detail & Related papers (2022-06-01T04:14:25Z)
- ContIG: Self-supervised Multimodal Contrastive Learning for Medical
Imaging with Genetics [4.907551775445731]
We propose ContIG, a self-supervised method that can learn from large datasets of unlabeled medical images and genetic data.
Our approach aligns images and several genetic modalities in the feature space using a contrastive loss.
We also perform genome-wide association studies on the features learned by our models, uncovering interesting relationships between images and genetic data.
arXiv Detail & Related papers (2021-11-26T11:06:12Z)
- Deep metric learning improves lab of origin prediction of genetically
engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Comparisons among different stochastic selection of activation layers
for convolutional neural networks for healthcare [77.99636165307996]
We classify biomedical images using ensembles of neural networks.
We select our activations among the following ones: ReLU, leaky ReLU, Parametric ReLU, ELU, Adaptive Piecewise Linear Unit, S-Shaped ReLU, Swish, Mish, Mexican Linear Unit, Parametric Deformable Linear Unit, Soft Root Sign.
arXiv Detail & Related papers (2020-11-24T01:53:39Z)
- Mycorrhiza: Genotype Assignment using Phylogenetic Networks [2.286041284499166]
We introduce Mycorrhiza, a machine learning approach for the genotype assignment problem.
Our algorithm makes use of phylogenetic networks to engineer features that encode the evolutionary relationships among samples.
Mycorrhiza yields particularly significant gains on datasets with a large average fixation index (FST) or deviation from the Hardy-Weinberg equilibrium.
arXiv Detail & Related papers (2020-10-14T02:36:27Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with
Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.