Inflammatory Bowel Disease Biomarkers of Human Gut Microbiota Selected
via Ensemble Feature Selection Methods
- URL: http://arxiv.org/abs/2001.03019v1
- Date: Wed, 8 Jan 2020 13:17:26 GMT
- Title: Inflammatory Bowel Disease Biomarkers of Human Gut Microbiota Selected
via Ensemble Feature Selection Methods
- Authors: Hilal Hacilar, O.Ufuk Nalbantoglu, Oya Aran, Burcu Bakir-Gungor
- Abstract summary: Some gut microorganisms are considered essential regulators of the immune system, while others can cause diseases such as Inflammatory Bowel Diseases (IBD), diabetes, and cancer.
IBD is a gut-related disorder in which deviations from the healthy gut microbiome are considered to be associated with the disease.
This study utilizes both supervised and unsupervised machine learning algorithms to generate a classification model that aids IBD diagnosis.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The tremendous boost in the next generation sequencing and in the omics
technologies makes it possible to characterize human gut microbiome (the
collective genomes of the microbial community that reside in our
gastrointestinal tract). While some of these microorganisms are considered as
essential regulators of our immune system, some others can cause several
diseases such as Inflammatory Bowel Diseases (IBD), diabetes, and cancer. IBD
is a gut-related disorder in which deviations from the healthy gut microbiome
are considered to be associated with the disease. Although existing studies attempt to
unveil the composition of the gut microbiome in relation to IBD, a
comprehensive picture is far from being complete. Due to the complexity of
metagenomic studies, the applications of the state of the art machine learning
techniques became popular to address a wide range of questions in the field of
metagenomic data analysis. In this regard, using IBD associated metagenomics
dataset, this study utilizes both supervised and unsupervised machine learning
algorithms, i) to generate a classification model that aids IBD diagnosis, ii)
to discover IBD associated biomarkers, iii) to find subgroups of IBD patients
using k-means and hierarchical clustering. To deal with the high dimensionality
of features, we applied robust feature selection algorithms such as Conditional
Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF),
minimum Redundancy Maximum Relevance (mRMR) and Extreme Gradient Boosting (XGBoost). In
our experiments with 10-fold cross-validation, XGBoost had a considerable
effect in terms of minimizing the microbiota used for the diagnosis of IBD and
thus reducing the cost and time. We observed that compared to the single
classifiers, ensemble methods such as kNN and logitboost resulted in better
performance measures for the classification of IBD.
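
The abstract names mRMR as one of the feature selection filters. A minimal sketch of the greedy mRMR idea follows, assuming abundances have already been discretized (here to 0/1) and using the common "relevance minus mean redundancy" scoring variant. This is not the authors' implementation; the function names, the taxon names, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_score(name, selected, features, relevance):
    """Relevance to the label minus mean redundancy with already selected features."""
    if not selected:
        return relevance[name]
    redundancy = sum(
        mutual_information(features[name], features[s]) for s in selected
    ) / len(selected)
    return relevance[name] - redundancy


def mrmr(features, labels, k):
    """Greedily select k feature names.

    features: dict mapping a feature name to its list of discretized values.
    Ties are broken alphabetically so the result is deterministic.
    """
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected = []
    candidates = set(features)
    while candidates and len(selected) < k:
        best = max(
            sorted(candidates),
            key=lambda name: mrmr_score(name, selected, features, relevance),
        )
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (invented): 8 samples, label 1 = IBD, 0 = healthy.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "taxon_a": [1, 1, 1, 0, 0, 0, 0, 0],  # strongly associated with the label
    "taxon_b": [1, 1, 1, 0, 0, 0, 0, 0],  # exact copy of taxon_a (redundant)
    "taxon_c": [1, 1, 0, 1, 0, 0, 1, 0],  # weaker, but adds new information
}
selected = mrmr(features, labels, k=2)
print(selected)
```

In this toy case the redundant copy `taxon_b` is penalized once `taxon_a` has been chosen, so the weaker but complementary `taxon_c` is picked second. Real metagenomic abundance tables are continuous and far wider, so a binning step or a continuous mutual-information estimator would be needed first.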
Related papers
- PathoLM: Identifying pathogenicity from the DNA sequence through the Genome Foundation Model [9.285895422810704]
PathoLM is a cutting-edge pathogen language model optimized for the identification of pathogenicity in bacterial and viral sequences.
We developed a comprehensive data set comprising approximately 30 species of viruses and bacteria, including ESKAPEE pathogens.
In comparative assessments, PathoLM dramatically outperforms existing models like DciPatho, demonstrating robust zero-shot and few-shot capabilities.
arXiv Detail & Related papers (2024-06-19T00:53:48Z) - MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z) - Exploring Gene Regulatory Interaction Networks and predicting
therapeutic molecules for Hypopharyngeal Cancer and EGFR-mutated lung
adenocarcinoma [5.178086150698542]
In this study, we select EGFR-mutated lung adenocarcinoma and Hypopharyngeal cancer by finding the metastases in hypopharyngeal cancer.
Our research findings have suggested common therapeutic molecules for the selected diseases based on 10 hub genes.
arXiv Detail & Related papers (2024-02-27T11:29:36Z) - Single-Cell Deep Clustering Method Assisted by Exogenous Gene
Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z) - Application of data engineering approaches to address challenges in
microbiome data for optimal medical decision-making [0.0]
The prototype employed in the study addresses the issues inherent to microbiome datasets and could be highly beneficial for providing personalized medicine.
arXiv Detail & Related papers (2023-06-30T05:36:39Z) - Functional Integrative Bayesian Analysis of High-dimensional
Multiplatform Genomic Data [0.8029049649310213]
We propose a framework called Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data (fiBAG).
fiBAG allows simultaneous identification of upstream functional evidence of proteogenomic biomarkers.
We demonstrate the profitability of fiBAG via a pan-cancer analysis of 14 cancer types.
arXiv Detail & Related papers (2022-12-29T03:31:45Z) - Deep neural networks approach to microbial colony detection -- a
comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset.
The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z) - Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Neural networks for Anatomical Therapeutic Chemical (ATC) [83.73971067918333]
We propose combining multiple multi-label classifiers trained on distinct sets of features, including sets extracted from a Bidirectional Long Short-Term Memory Network (BiLSTM).
Experiments demonstrate the power of this approach, which is shown to outperform the best methods reported in the literature.
arXiv Detail & Related papers (2021-01-22T19:49:47Z) - The scalable Birth-Death MCMC Algorithm for Mixed Graphical Model
Learning with Application to Genomic Data Integration [0.0]
We propose a novel mixed graphical model approach to analyze multi-omic data of different types.
We find that our method is superior in terms of both computational efficiency and the accuracy of the model selection results.
arXiv Detail & Related papers (2020-05-08T16:34:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.