Cross-Validation for Training and Testing Co-occurrence Network
Inference Algorithms
- URL: http://arxiv.org/abs/2309.15225v1
- Date: Tue, 26 Sep 2023 19:43:15 GMT
- Title: Cross-Validation for Training and Testing Co-occurrence Network
Inference Algorithms
- Authors: Daniel Agyapong, Jeffrey Ryan Propster, Jane Marks, Toby Dylan Hocking
- Abstract summary: Co-occurrence network inference algorithms help us understand the complex associations of micro-organisms, especially bacteria.
Previous methods for evaluating the quality of the inferred network include using external data, and network consistency across sub-samples.
We propose a novel cross-validation method to evaluate co-occurrence network inference algorithms, and new methods for applying existing algorithms to predict on test data.
- Score: 1.8638865257327277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Microorganisms are found in almost every environment, including the soil,
water, air, and inside other organisms, like animals and plants. While some
microorganisms cause diseases, most of them help in biological processes such
as decomposition, fermentation and nutrient cycling. A lot of research has gone
into studying microbial communities in various environments and how their
interactions and relationships can provide insights into various diseases.
Co-occurrence network inference algorithms help us understand the complex
associations of micro-organisms, especially bacteria. Existing network
inference algorithms employ techniques such as correlation, regularized linear
regression, and conditional dependence, which have different hyper-parameters
that determine the sparsity of the network. Previous methods for evaluating the
quality of the inferred network include using external data, and network
consistency across sub-samples, both of which have several drawbacks that limit
their applicability in real microbiome composition data sets. We propose a
novel cross-validation method to evaluate co-occurrence network inference
algorithms, and new methods for applying existing algorithms to predict on test
data. Our empirical study shows that the proposed method is useful for
hyper-parameter selection (training) and comparing the quality of the inferred
networks between different algorithms (testing).
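To make the idea of using cross-validation for hyper-parameter selection concrete, the following is a minimal sketch and not the authors' method. It assumes a simple correlation-thresholding network in which the threshold controls sparsity, and a hypothetical prediction rule in which each taxon is predicted on held-out samples by averaging one-variable linear fits from its network neighbours. The function names (`pearson`, `fit_line`, `fold_error`, `select_threshold`) and the synthetic data are illustrative assumptions.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx

def fold_error(train, test, threshold):
    """Mean squared test error when each taxon is predicted from its neighbours."""
    n_taxa = len(train[0])
    cols = [[row[j] for row in train] for j in range(n_taxa)]
    errors = []
    for j in range(n_taxa):
        # Network inferred on the training fold: edge if |correlation| >= threshold.
        neighbours = [i for i in range(n_taxa)
                      if i != j and abs(pearson(cols[i], cols[j])) >= threshold]
        lines = [(i, *fit_line(cols[i], cols[j])) for i in neighbours]
        baseline = mean(cols[j])  # fallback when the taxon has no neighbours
        for row in test:
            if lines:
                pred = mean(a * row[i] + b for i, a, b in lines)
            else:
                pred = baseline
            errors.append((row[j] - pred) ** 2)
    return mean(errors)

def select_threshold(data, thresholds, k=3, seed=0):
    """Return the threshold with the lowest mean k-fold test error, plus all scores."""
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)
    folds = [order[f::k] for f in range(k)]
    scores = {}
    for t in thresholds:
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [data[i] for i in order if i not in held]
            test = [data[i] for i in held_out]
            fold_scores.append(fold_error(train, test, t))
        scores[t] = mean(fold_scores)
    return min(scores, key=scores.get), scores

# Synthetic data (assumption): taxon 1 tracks taxon 0, taxon 2 is independent noise.
rng = random.Random(1)
data = []
for _ in range(60):
    x0 = rng.gauss(0, 1)
    data.append([x0, 2 * x0 + rng.gauss(0, 0.1), rng.gauss(0, 1)])

# A threshold above 1 can never be reached, so it yields an empty network.
thresholds = [0.0, 0.5, 1.1]
best, scores = select_threshold(data, thresholds)
print(best, scores)
```

The same loop could in principle compare different inference algorithms by swapping the network-construction step while holding the folds and the error measure fixed, which mirrors the training and testing uses described in the abstract.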
Related papers
- Hierarchical Sparse Bayesian Multitask Model with Scalable Inference for Microbiome Analysis [1.361248247831476]
This paper proposes a hierarchical Bayesian multitask learning model that is applicable to the general multi-task binary classification learning problem.
We derive a computationally efficient inference algorithm based on variational inference to approximate the posterior distribution.
We demonstrate the potential of the new approach on various synthetic datasets and for predicting human health status based on microbiome profile.
arXiv Detail & Related papers (2025-02-04T18:23:22Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Multi-Dimensional Ability Diagnosis for Machine Learning Algorithms [88.93372675846123]
We propose a task-agnostic evaluation framework Camilla for evaluating machine learning algorithms.
We use cognitive diagnosis assumptions and neural networks to learn the complex interactions among algorithms, samples and the skills of each sample.
In our experiments, Camilla outperforms state-of-the-art baselines on the metric reliability, rank consistency and rank stability.
arXiv Detail & Related papers (2023-07-14T03:15:56Z)
- Three-dimensional microstructure generation using generative adversarial neural networks in the context of continuum micromechanics [77.34726150561087]
This work proposes a generative adversarial network tailored towards three-dimensional microstructure generation.
The lightweight algorithm is able to learn the underlying properties of the material from a single microCT-scan without the need of explicit descriptors.
arXiv Detail & Related papers (2022-05-31T13:26:51Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Deep neural networks approach to microbial colony detection -- a comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset.
The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z)
- Latent Network Estimation and Variable Selection for Compositional Data via Variational EM [0.0]
We develop a novel method to simultaneously estimate network interactions and associations.
We show the practical utility of our model via an application to microbiome data.
arXiv Detail & Related papers (2020-10-25T21:52:39Z)
- Mycorrhiza: Genotype Assignment using Phylogenetic Networks [2.286041284499166]
We introduce Mycorrhiza, a machine learning approach for the genotype assignment problem.
Our algorithm makes use of phylogenetic networks to engineer features that encode the evolutionary relationships among samples.
Mycorrhiza yields particularly significant gains on datasets with a large average fixation index (FST) or deviation from the Hardy-Weinberg equilibrium.
arXiv Detail & Related papers (2020-10-14T02:36:27Z)
- Identifying efficient controls of complex interaction networks using genetic algorithms [0.0]
We propose a new solution for a problem known as network controllability.
We tailor our solution for applications in computational drug repurposing.
We show how our algorithm identifies a number of potentially efficient drugs for breast, ovarian, and pancreatic cancer.
arXiv Detail & Related papers (2020-07-09T14:56:54Z)
- DCMD: Distance-based Classification Using Mixture Distributions on Microbiome Data [10.171660468645603]
We present an innovative approach for distance-based classification using mixture distributions (DCMD).
This approach models the inherent uncertainty in sparse counts by estimating a mixture distribution for the sample data.
Results are compared against a number of existing machine learning and distance-based approaches.
arXiv Detail & Related papers (2020-03-29T23:30:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.