An Association Test Based on Kernel-Based Neural Networks for Complex
Genetic Association Analysis
- URL: http://arxiv.org/abs/2312.06669v1
- Date: Wed, 6 Dec 2023 05:02:28 GMT
- Authors: Tingting Hou, Chang Jiang and Qing Lu
- Abstract summary: We develop a kernel-based neural network model (KNN) that synergizes the strengths of linear mixed models with conventional neural networks. Within this framework, we introduce a MINQUE-based test to assess the joint association of genetic variants with the phenotype, along with two additional tests to evaluate and interpret linear and non-linear/non-additive genetic effects.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of artificial intelligence, especially the progress of deep neural
networks, is expected to revolutionize genetic research and offer unprecedented
potential to decode the complex relationships between genetic variants and
disease phenotypes, which could mark a significant step toward improving our
understanding of disease etiology. While deep neural networks hold great
promise for genetic association analysis, limited research has been focused on
developing neural-network-based tests to dissect complex genotype-phenotype
associations. This complexity arises from the opaque nature of neural networks
and the absence of defined limiting distributions. We have previously developed
a kernel-based neural network model (KNN) that synergizes the strengths of
linear mixed models with conventional neural networks. KNN adopts a
computationally efficient minimum norm quadratic unbiased estimator (MINQUE)
algorithm and uses its network structure to capture the complex relationship between
large-scale sequencing data and a disease phenotype of interest. In the KNN
framework, we introduce a MINQUE-based test to assess the joint association of
genetic variants with the phenotype, which considers non-linear and
non-additive effects and follows a mixture of chi-square distributions. We also
construct two additional tests to evaluate and interpret linear and
non-linear/non-additive genetic effects, including interaction effects. Our
simulations show that our method consistently controls the type I error rate
under various conditions and achieves greater power than a commonly used
sequence kernel association test (SKAT), especially when involving non-linear
and interaction effects. When applied to real data from the UK Biobank, our
approach identified genes associated with hippocampal volume, which can be
further replicated and evaluated for their role in the pathogenesis of
Alzheimer's disease.
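The abstract notes that the joint test statistic follows a mixture of chi-square distributions under the null. As a minimal sketch of how a p-value is obtained from such a law, the snippet below uses plain Monte Carlo sampling; the function name, the example weights, and the test statistic are hypothetical (in practice the mixture weights would come from eigenvalues of the genetic-similarity kernel matrix), and this is not the authors' implementation.

```python
import numpy as np

def mixture_chisq_pvalue(stat, weights, df=1, n_sim=200_000, seed=0):
    """Monte Carlo p-value for a statistic whose null distribution is a
    weighted mixture of chi-squares: sum_k weights[k] * chi2(df).
    Illustrative only; the paper's exact test is not reproduced here."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    # Each row is one draw of the weighted sum of independent chi-squares.
    draws = rng.chisquare(df, size=(n_sim, w.size)) @ w
    # Add-one correction keeps the estimated p-value strictly positive.
    return (np.count_nonzero(draws >= stat) + 1) / (n_sim + 1)

# Hypothetical weights, e.g. eigenvalues of a kernel matrix, and a toy statistic.
p = mixture_chisq_pvalue(stat=12.3, weights=[2.0, 1.0, 0.5])
```

Monte Carlo is the simplest option; moment-matching approximations (e.g. Liu's method) are the usual faster alternative for mixtures of chi-squares.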
Related papers
- Interpreting artificial neural networks to detect genome-wide association signals for complex traits
Investigating the genetic architecture of complex diseases is challenging due to the highly polygenic and interactive landscape of genetic and environmental factors.
We trained artificial neural networks for predicting complex traits using both simulated and real genotype/phenotype datasets.
arXiv Detail & Related papers (2024-07-26T15:20:42Z)
- A Kernel-Based Neural Network Test for High-dimensional Sequencing Data Analysis
We introduce a new kernel-based neural network (KNN) test for complex association analysis of sequencing data.
Based on KNN, a Wald-type test is then introduced to evaluate the joint association of high-dimensional genetic data with a disease phenotype of interest.
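As a generic illustration of a Wald-type joint test (the paper's exact construction, including how the covariance is estimated, is not reproduced here), one can form W = θ̂ᵀ Σ̂⁻¹ θ̂ and refer it to a chi-square distribution; the estimates and covariance below are invented for the example.

```python
import numpy as np
from scipy.stats import chi2

def wald_test(theta_hat, cov_hat):
    """Wald statistic W = theta' Cov^{-1} theta with df = len(theta).
    A generic sketch of a Wald-type joint test, not the paper's method."""
    theta = np.atleast_1d(np.asarray(theta_hat, dtype=float))
    W = float(theta @ np.linalg.solve(np.asarray(cov_hat, dtype=float), theta))
    return W, chi2.sf(W, df=theta.size)

# Hypothetical parameter estimates and their estimated covariance.
W, p = wald_test([0.9, 0.2], [[0.04, 0.01], [0.01, 0.09]])
```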
arXiv Detail & Related papers (2023-12-05T16:06:23Z)
- PhyloGFN: Phylogenetic inference with generative flow networks
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference.
Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies.
We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z)
- A Sieve Quasi-likelihood Ratio Test for Neural Networks with Applications to Genetic Association Studies
We propose a sieve quasi-likelihood ratio test based on a neural network with one hidden layer for testing complex associations.
The validity of the asymptotic distribution is investigated via simulations.
We demonstrate the use of the proposed test by performing a genetic association analysis of sequencing data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).
arXiv Detail & Related papers (2022-12-16T02:54:46Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study
Research on the NTK has focused on typical neural network architectures but remains incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
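The kernel-regression side of this equivalence can be sketched generically: a kernel ridge predictor evaluated with any Gram matrix. The degree-2 polynomial kernel below is only a stand-in loosely echoing the polynomial-net setting; computing the actual NTK of an NN-Hp is beyond this snippet.

```python
import numpy as np

def kernel_regression_predict(K_train, K_cross, y, ridge=1e-8):
    """Kernel ridge predictor: f(X*) = K(X*, X) (K + ridge*I)^{-1} y.
    Generic sketch of the kernel-regression form, not the paper's NTK."""
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + ridge * np.eye(n), y)
    return K_cross @ alpha

# Toy check: a degree-2 polynomial kernel can fit a pure interaction target.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = X[:, 0] * X[:, 1]                          # an interaction target
poly_kernel = lambda A, B: (A @ B.T + 1.0) ** 2
pred = kernel_regression_predict(poly_kernel(X, X), poly_kernel(X, X), y)
```

With a tiny ridge the predictor interpolates the training targets here, since the interaction target lies in the degree-2 feature space.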
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- rfPhen2Gen: A machine learning based association study of brain imaging phenotypes to genotypes
We trained machine learning models to predict SNPs from 56 brain imaging quantitative traits (QTs).
SNPs within the known Alzheimer's disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
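A minimal sketch of this reverse-regression idea (imaging QTs predicting a SNP dosage) is below. The data are fully synthetic and the sizes, alpha, and tree count are illustrative; this is not the rfPhen2Gen pipeline or its ADNI data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

# Synthetic stand-in: 56 imaging QTs predicting one SNP dosage (0/1/2),
# with a signal planted in the first QT.
rng = np.random.default_rng(1)
qts = rng.normal(size=(300, 56))
snp = np.clip(np.round(0.8 * qts[:, 0] + 1.0
                       + rng.normal(scale=0.4, size=300)), 0, 2)

train, test = slice(0, 200), slice(200, 300)
rmse = {}
for name, model in [("lasso", Lasso(alpha=0.05)),
                    ("rf", RandomForestRegressor(n_estimators=100,
                                                 random_state=0))]:
    model.fit(qts[train], snp[train])
    err = model.predict(qts[test]) - snp[test]
    rmse[name] = float(np.sqrt(np.mean(err ** 2)))
```

Comparing the two RMSE values mirrors the lasso-vs-random-forest comparison in the summary, here on toy data only.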
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
- Data-driven emergence of convolutional structure in neural networks
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Deep neural networks with controlled variable selection for the identification of putative causal genetic variants
We propose an interpretable neural network model, stabilized using ensembling, with controlled variable selection for genetic studies.
The merit of the proposed method includes: (1) flexible modelling of the non-linear effect of genetic variants to improve statistical power; (2) multiple knockoffs in the input layer to rigorously control false discovery rate; (3) hierarchical layers to substantially reduce the number of weight parameters and activations to improve computational efficiency.
arXiv Detail & Related papers (2021-09-29T20:57:48Z)
- Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set
We suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology.
Using simplicial complex representations of neural networks, we study the PH diagram distance evolution on the neural network learning process.
Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy.
arXiv Detail & Related papers (2021-05-31T09:17:31Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)