gFlora: a topology-aware method to discover functional co-response groups in soil microbial communities
- URL: http://arxiv.org/abs/2407.03897v2
- Date: Wed, 17 Jul 2024 13:10:20 GMT
- Title: gFlora: a topology-aware method to discover functional co-response groups in soil microbial communities
- Authors: Nan Chen, Merlijn Schram, Doina Bucur
- Abstract summary: We aim to learn the functional co-response group: a group of taxa whose co-response effect shows the total topological abundance of taxa.
We model the soil microbial community as an ecological co-occurrence network with the taxa as nodes.
We design a method called gFlora which notably uses graph convolution over this co-occurrence network to get the co-response effect of the group.
- Score: 2.6884929428864353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We aim to learn the functional co-response group: a group of taxa whose co-response effect (the representative characteristic of the group showing the total topological abundance of taxa) co-responds (associates well statistically) to a functional variable. Different from the state-of-the-art method, we model the soil microbial community as an ecological co-occurrence network with the taxa as nodes (weighted by their abundance) and their relationships (a combination from both spatial and functional ecological aspects) as edges (weighted by the strength of the relationships). Then, we design a method called gFlora which notably uses graph convolution over this co-occurrence network to get the co-response effect of the group, such that the network topology is also considered in the discovery process. We evaluate gFlora on two real-world soil microbiome datasets (bacteria and nematodes) and compare it with the state-of-the-art method. gFlora outperforms this on all evaluation metrics, and discovers new functional evidence for taxa which were so far under-studied. We show that the graph convolution step is crucial to taxa with relatively low abundance (thus removing the bias towards taxa with higher abundance), and the discovered bacteria of different genera are distributed in the co-occurrence network but still tightly connected among themselves, demonstrating that topologically they fill different but collaborative functional roles in the ecological community.
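The abstract describes obtaining a group's co-response effect by graph convolution over the co-occurrence network, then checking how well that effect associates with a functional variable. A minimal sketch of that idea follows. It is not the authors' implementation: the symmetric normalization with self-loops, the single unweighted convolution step, the use of Pearson correlation as the association measure, and all function names and toy numbers are assumptions made for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetrically normalize a weighted adjacency matrix after adding self-loops.

    Assumes non-negative edge weights, so every degree is at least 1.
    """
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def co_response_effect(abundance, adj, group):
    """Per-sample sum of graph-convolved abundance over the taxa in `group`.

    abundance: (n_samples, n_taxa) array of taxon abundances.
    adj:       (n_taxa, n_taxa) weighted co-occurrence adjacency matrix.
    group:     list of taxon indices forming the candidate group.
    """
    topological_abundance = abundance @ normalized_adjacency(adj)
    return topological_abundance[:, group].sum(axis=1)


def association(effect, functional):
    """Pearson correlation between the group effect and a functional variable."""
    return float(np.corrcoef(effect, functional)[0, 1])


# Toy data (hypothetical): 5 soil samples, 4 taxa.
abundance = np.array([
    [5.0, 1.0, 0.0, 2.0],
    [3.0, 2.0, 1.0, 0.0],
    [8.0, 0.0, 2.0, 1.0],
    [1.0, 4.0, 0.0, 3.0],
    [6.0, 1.0, 1.0, 1.0],
])
adj = np.array([
    [0.0, 0.2, 0.9, 0.0],
    [0.2, 0.0, 0.0, 0.5],
    [0.9, 0.0, 0.0, 0.1],
    [0.0, 0.5, 0.1, 0.0],
])
functional = np.array([2.1, 1.4, 3.0, 0.6, 2.3])
group = [0, 2]

effect = co_response_effect(abundance, adj, group)
score = association(effect, functional)
```

Because each taxon's convolved value mixes in the abundance of its network neighbours, a low-abundance taxon that is strongly connected to the rest of the group can still contribute to the group effect, which is the intuition the abstract gives for the convolution step.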
Related papers
- Understanding the Effect of GCN Convolutions in Regression Tasks [8.299692647308323]
Graph Convolutional Networks (GCNs) have become a pivotal method in machine learning for modeling functions over graphs.
This paper provides a formal analysis of the impact of convolution operators on regression tasks over homophilic networks.
arXiv Detail & Related papers (2024-10-26T04:19:52Z)
- UMMAN: Unsupervised Multi-graph Merge Adversarial Network for Disease Prediction Based on Intestinal Flora [0.18641315013048299]
We present a novel architecture, the Unsupervised Multi-graph Merge Adversarial Network (UMMAN).
UMMAN can obtain the embeddings of nodes in the Multi-Graph in an unsupervised scenario, so that it helps learn the multiplex association.
We employ complex relation-types to construct the Original-Graph and disrupt the relationships among nodes to generate corresponding Shuffled-Graph.
arXiv Detail & Related papers (2024-07-31T16:06:43Z)
- Exploiting Hierarchical Interactions for Protein Surface Learning [52.10066114039307]
Intrinsically, potential function sites in protein surfaces are determined by both geometric and chemical features.
In this paper, we present a principled framework based on deep learning techniques, namely the Hierarchical Chemical and Geometric Feature Interaction Network (HCGNet).
Our method outperforms the prior state-of-the-art method by 2.3% in site prediction task and 3.2% in interaction matching task.
arXiv Detail & Related papers (2024-01-17T14:10:40Z)
- Graph-level Protein Representation Learning by Structure Knowledge Refinement [50.775264276189695]
This paper focuses on learning representation on the whole graph level in an unsupervised manner.
We propose a novel framework called Structure Knowledge Refinement (SKR) which uses data structure to determine the probability of whether a pair is positive or negative.
arXiv Detail & Related papers (2024-01-05T09:05:33Z)
- On Discprecncies between Perturbation Evaluations of Graph Neural Network Attributions [49.8110352174327]
We assess attribution methods from a perspective not previously explored in the graph domain: retraining.
The core idea is to retrain the network on important (or not important) relationships as identified by the attributions.
We run our analysis on four state-of-the-art GNN attribution methods and five synthetic and real-world graph classification datasets.
arXiv Detail & Related papers (2024-01-01T02:03:35Z)
- PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference.
Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies.
We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z)
- Cross-Validation for Training and Testing Co-occurrence Network Inference Algorithms [1.8638865257327277]
Co-occurrence network inference algorithms help us understand the complex associations of micro-organisms, especially bacteria.
Previous methods for evaluating the quality of the inferred network include using external data and checking network consistency across sub-samples.
We propose a novel cross-validation method to evaluate co-occurrence network inference algorithms, and new methods for applying existing algorithms to predict on test data.
arXiv Detail & Related papers (2023-09-26T19:43:15Z)
- Graph Neural Networks for Microbial Genome Recovery [64.91162205624848]
We propose to use Graph Neural Networks (GNNs) to leverage the assembly graph when learning contig representations for metagenomic binning.
Our method, VaeG-Bin, combines variational autoencoders for learning latent representations of the individual contigs, with GNNs for refining these representations by taking into account the neighborhood structure of the contigs in the assembly graph.
arXiv Detail & Related papers (2022-04-26T12:49:51Z)
- RepBin: Constraint-based Graph Representation Learning for Metagenomic Binning [12.561034842067889]
We present a new formulation using a graph where the nodes are subsequences and edges represent homophily information.
We develop new algorithms for graph representation learning that preserve both homophily relations and heterophily constraints.
Our approach, called RepBin, outperforms a wide variety of competing methods.
arXiv Detail & Related papers (2021-12-22T07:01:01Z)
- PhD dissertation to infer multiple networks from microbial data [0.0]
A microbial network is a weighted graph that is constructed from a sample-taxa count matrix.
The nodes in this graph represent microbial taxa and the edges represent pairwise associations amongst these taxa.
arXiv Detail & Related papers (2020-10-12T19:16:26Z)
- Understanding Negative Sampling in Graph Representation Learning [87.35038268508414]
We show that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance.
We propose MCNS, which approximates the positive distribution with self-contrast approximation and accelerates negative sampling by Metropolis-Hastings.
We evaluate our method on 5 datasets that cover extensive downstream graph learning tasks, including link prediction, node classification and personalized recommendation.
arXiv Detail & Related papers (2020-05-20T06:25:21Z)