Learning complex dependency structure of gene regulatory networks from
high dimensional micro-array data with Gaussian Bayesian networks
- URL: http://arxiv.org/abs/2106.15365v1
- Date: Mon, 28 Jun 2021 15:04:35 GMT
- Title: Learning complex dependency structure of gene regulatory networks from
high dimensional micro-array data with Gaussian Bayesian networks
- Authors: Catharina Elisabeth Graafland and José Manuel Gutiérrez
- Abstract summary: Gene expression datasets consist of thousands of genes with relatively small sample sizes.
The Glasso algorithm has been proposed to deal with high-dimensional micro-array datasets by forcing sparsity.
Modifications of the default Glasso algorithm have been developed to overcome the problem of complex interaction structure.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gene expression datasets consist of thousands of genes with relatively small
sample sizes (i.e. are large-$p$-small-$n$). Moreover, dependencies of various
orders co-exist in the datasets. In the Undirected probabilistic Graphical
Model (UGM) framework the Glasso algorithm has been proposed to deal with
high-dimensional micro-array datasets by forcing sparsity. Modifications of the
default Glasso algorithm have also been developed to overcome the problem of
complex interaction structure. In this work we advocate the use of a simple score-based
Hill Climbing algorithm (HC) that learns Gaussian Bayesian Networks (BNs)
based on Directed Acyclic Graphs (DAGs). We compare HC with Glasso and its
modifications in the UGM framework on their capability to reconstruct gene
regulatory networks (GRNs) from micro-array data belonging to the Escherichia
coli genome. We exploit the analytical properties of the Joint Probability
Density (JPD) function, on which both directed and undirected probabilistic
graphical models (PGMs) build, to convert DAGs to UGMs.
We conclude that dependencies in complex data are learned best by the HC
algorithm, which represents them most accurately and efficiently while
simultaneously modelling the strong local and the weaker but significant global
connections that coexist in the gene expression dataset. The HC algorithm adapts
intrinsically to the complex dependency structure of the dataset, without
forcing a specific structure in advance. In contrast, Glasso and its
modifications model unnecessary dependencies at the expense of the probabilistic
information in the network and at the cost of a structural bias in the JPD
function that can only be relieved by including many parameters.
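The abstract mentions converting DAGs to UGMs through the joint density. As a minimal sketch of one standard way to do this for linear Gaussian BNs (not necessarily the authors' exact procedure), the precision matrix can be written as $K = (I - B)^T D^{-1} (I - B)$, where $B$ holds the regression coefficients of each node on its parents and $D$ the noise variances. The nonzero off-diagonal entries of $K$ give the undirected edges, which generically coincide with the moral graph of the DAG. The function names, the toy three-node network and its coefficients below are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def precision_from_gaussian_bn(weights, noise_var):
    """Precision matrix K = (I - B)^T D^{-1} (I - B) of a linear Gaussian BN.

    weights[i][j] is the coefficient of parent j in the equation for node i;
    noise_var[i] is the variance of the independent noise term of node i.
    """
    n = len(weights)
    # Rows of (I - B).
    a = [[(1.0 if i == j else 0.0) - weights[i][j] for j in range(n)]
         for i in range(n)]
    return [[sum(a[i][j] * a[i][k] / noise_var[i] for i in range(n))
             for k in range(n)]
            for j in range(n)]


def undirected_edges(precision, tol=1e-12):
    """Edges (j, k), j < k, whose partial dependence K[j][k] is nonzero."""
    n = len(precision)
    return {(j, k) for j, k in combinations(range(n), 2)
            if abs(precision[j][k]) > tol}


def moral_graph_edges(weights):
    """Structural conversion: drop directions and connect co-parents."""
    n = len(weights)
    edges = set()
    for child in range(n):
        parents = [j for j in range(n) if weights[child][j] != 0.0]
        for p in parents:
            edges.add(tuple(sorted((p, child))))
        for p, q in combinations(parents, 2):
            edges.add((p, q))  # parents are listed in ascending order
    return edges


# Hypothetical toy network with a v-structure: node 0 -> node 2 <- node 1.
WEIGHTS = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.8, -0.5, 0.0],
]
NOISE_VAR = [1.0, 1.0, 1.0]

K = precision_from_gaussian_bn(WEIGHTS, NOISE_VAR)
UGM_EDGES = undirected_edges(K)
```

In this toy case the two parents, which are marginally independent in the DAG, become connected in the undirected model. This illustrates why the undirected representation of the same density can need more edges than the DAG.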
Related papers
- Semantically Rich Local Dataset Generation for Explainable AI in Genomics [0.716879432974126]
Black box deep learning models trained on genomic sequences excel at predicting the outcomes of different gene regulatory mechanisms.
We propose using Genetic Programming to generate datasets by evolving perturbations in sequences that contribute to their semantic diversity.
(2024-07-03T10:31:30Z)
- Coordinated Multi-Neighborhood Learning on a Directed Acyclic Graph [6.727984016678534]
Learning the structure of causal directed acyclic graphs (DAGs) is useful in many areas of machine learning and artificial intelligence.
It is challenging to obtain good empirical and theoretical results without strong and often restrictive assumptions.
This paper develops a new constraint-based method for estimating the local structure around multiple user-specified target nodes.
(2024-05-24T08:49:43Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
(2024-02-03T19:00:19Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
(2023-10-18T07:30:08Z)
- Higher Order Gauge Equivariant CNNs on Riemannian Manifolds and Applications [7.322121417864824]
We introduce a higher order generalization of the gauge equivariant convolution, dubbed a gauge equivariant Volterra network (GEVNet).
This allows us to model spatially extended nonlinear interactions within a given field while still maintaining equivariance to global isometries.
In the neuroimaging data experiments, the resulting two-part architecture is used to automatically discriminate between patients with Lewy Body Disease (DLB), Alzheimer's Disease (AD) and Parkinson's Disease (PD) from diffusion magnetic resonance images (dMRI).
(2023-05-26T06:02:31Z)
- Simple and Efficient Heterogeneous Graph Neural Network [55.56564522532328]
Heterogeneous graph neural networks (HGNNs) have powerful capability to embed rich structural and semantic information of a heterogeneous graph into node representations.
Existing HGNNs inherit many mechanisms from graph neural networks (GNNs) over homogeneous graphs, especially the attention mechanism and the multi-layer structure.
This paper conducts an in-depth and detailed study of these mechanisms and proposes Simple and Efficient Heterogeneous Graph Neural Network (SeHGNN).
(2022-07-06T10:01:46Z)
- BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs [77.34726150561087]
We introduce the R package for causal discovery and causal effect estimation from observational data.
Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, the number of variables in the dataset.
We then illustrate the main functions and algorithms on both real and simulated datasets.
(2022-01-28T09:30:32Z)
- GenURL: A General Framework for Unsupervised Representation Learning [58.59752389815001]
Unsupervised representation learning (URL) learns compact embeddings of high-dimensional data without supervision.
We propose a unified similarity-based URL framework, GenURL, which can smoothly adapt to various URL tasks.
Experiments demonstrate that GenURL achieves consistent state-of-the-art performance in self-supervised visual learning, unsupervised knowledge distillation (KD), graph embeddings (GE), and dimension reduction.
(2021-10-27T16:24:39Z)
- Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
(2021-06-26T02:17:23Z)
- Multidimensional Scaling for Gene Sequence Data with Autoencoders [0.0]
We present an autoencoder-based dimensional reduction model which can easily scale to datasets containing millions of gene sequences.
The proposed model is evaluated against DAMDS with a real world fungi gene sequence dataset.
(2021-04-19T02:14:17Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
(2020-12-29T04:08:38Z)