MIIDL: a Python package for microbial biomarkers identification powered
by interpretable deep learning
- URL: http://arxiv.org/abs/2109.12204v1
- Date: Fri, 24 Sep 2021 21:30:10 GMT
- Title: MIIDL: a Python package for microbial biomarkers identification powered
by interpretable deep learning
- Authors: Jian Jiang
- Abstract summary: We present MIIDL, a Python package for the identification of microbial biomarkers based on interpretable deep learning.
MIIDL applies convolutional neural networks, a variety of interpretability algorithms and a range of pre-processing methods to provide a one-stop, robust pipeline for microbial biomarker identification from high-dimensional and sparse data sets.
- Score: 5.749346757892117
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting microbial biomarkers that predict disease phenotypes and
clinical outcomes is crucial for early-stage disease screening and diagnosis.
Most methods for biomarker identification are linear, which is limiting
because biological processes are rarely fully linear. Machine learning offers
a promising alternative. However, identifying microbial biomarkers in an
interpretable, data-driven and robust manner remains challenging. We present
MIIDL, a Python package for the identification of microbial biomarkers based
on interpretable deep learning. MIIDL applies convolutional neural networks,
a variety of interpretability algorithms and a range of pre-processing methods
to provide a one-stop, robust pipeline for microbial biomarker identification
from high-dimensional and sparse data sets.
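The kind of pipeline the abstract describes (pre-processing sparse abundance data, scoring samples with a model, then attributing the score to individual taxa) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not MIIDL's actual API: the function names, the toy count table, and the fixed logistic scorer `toy_model` (a stand-in for a trained convolutional network) are all hypothetical, and occlusion is just one generic interpretability technique, not necessarily one the package uses.

```python
import math

def prevalence_filter(table, taxa, min_prevalence=0.5):
    """Drop taxa that are nonzero in fewer than min_prevalence of the samples."""
    n_samples = len(table)
    keep = [
        j for j in range(len(taxa))
        if sum(1 for row in table if row[j] > 0) / n_samples >= min_prevalence
    ]
    return [[row[j] for j in keep] for row in table], [taxa[j] for j in keep]

def total_sum_scale(table):
    """Convert raw counts to per-sample relative abundances."""
    scaled = []
    for row in table:
        total = sum(row)
        scaled.append([v / total if total else 0.0 for v in row])
    return scaled

def toy_model(x, weights=(4.0, 1.0, 0.0), bias=-1.0):
    """Hypothetical stand-in for a trained classifier: a fixed logistic scorer."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def occlusion_importance(model, table):
    """Mean absolute change in model output when each feature is set to zero."""
    n_features = len(table[0])
    scores = [0.0] * n_features
    for row in table:
        base = model(row)
        for j in range(n_features):
            occluded = list(row)
            occluded[j] = 0.0
            scores[j] += abs(base - model(occluded))
    return [s / len(table) for s in scores]

# Toy count table (samples x taxa); all values are made up for illustration.
taxa = ["taxon_A", "taxon_B", "taxon_C", "taxon_rare"]
counts = [
    [60, 30, 10, 0],
    [50, 20, 30, 0],
    [40, 40, 20, 0],
    [70, 10, 19, 1],
]

filtered, kept = prevalence_filter(counts, taxa, min_prevalence=0.5)
abund = total_sum_scale(filtered)
importance = occlusion_importance(toy_model, abund)

# Rank the surviving taxa from most to least influential on the model output.
ranking = [name for _, name in sorted(zip(importance, kept), reverse=True)]
print(kept)
print(ranking)
```

In this toy setup the rarely observed taxon is removed before modelling, and the ranking simply reflects the hand-picked weights of the stand-in scorer. With a real trained network, the same occlusion loop would rank taxa by how strongly the learned model depends on them.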
Related papers
- Revolutionizing Biomarker Discovery: Leveraging Generative AI for Bio-Knowledge-Embedded Continuous Space Exploration [20.419747013569268]
We propose a new biomarker identification framework with two important modules: training data preparation and embedding-optimization-generation.
The first module uses a multi-agent system to automatically collect pairs of biomarker subsets and their corresponding prediction accuracy as training data.
The second module employs an encoder-evaluator-decoder learning paradigm to compress the knowledge of the collected data into a continuous space.
(2024-09-23T23:36:30Z)
- Prompting Whole Slide Image Based Genetic Biomarker Prediction [13.764676578911526]
We propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques.
We leverage large language models to generate medical prompts that serve as prior knowledge in extracting instances associated with genetic biomarkers.
We adopt a coarse-to-fine approach to mine biomarker information within the tumor microenvironment.
(2024-06-26T11:05:46Z)
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
(2024-06-12T15:22:56Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
(2024-02-21T11:27:31Z)
- scBeacon: single-cell biomarker extraction via identifying paired cell clusters across biological conditions with contrastive siamese networks [0.9591674293850556]
scBeacon is a framework built upon a deep contrastive siamese network.
scBeacon adeptly identifies matched cell populations across varied conditions.
Comprehensive evaluations validate scBeacon's superiority over existing single-cell differential gene analysis tools.
(2023-11-05T08:27:24Z)
- ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
(2023-11-01T14:44:01Z)
- Lymphocyte Classification in Hyperspectral Images of Ovarian Cancer Tissue Biopsy Samples [94.37521840642141]
We present a machine learning pipeline to segment white blood cell pixels in hyperspectral images of biopsy cores.
These cells are clinically important for diagnosis, but some prior work has struggled to incorporate them due to difficulty obtaining precise pixel labels.
(2022-03-23T00:58:27Z)
- Deep neural networks approach to microbial colony detection -- a comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset.
The achieved results may serve as a benchmark for future experiments.
(2021-08-23T12:06:00Z)
- Preventing dataset shift from breaking machine-learning biomarkers [0.6138671548064355]
A good biomarker is one that gives reliable detection of the corresponding condition.
Biomarkers are often extracted from a cohort that differs from the target population.
Such a mismatch, known as a dataset shift, can undermine the application of the biomarker to new individuals.
(2021-07-21T08:54:23Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
(2021-01-27T19:28:04Z)
- Deep learning approach to describe and classify fungi microscopic images [4.759323753598067]
We apply a machine learning approach based on deep neural networks and Fisher Vector to classify microscopic images of various fungi species.
Our approach has the potential to make the last stage of biochemical identification redundant, shortening the identification process by 2-3 days, and reducing the cost of the diagnosis.
(2020-05-24T15:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.