Related papers: MIIDL: a Python package for microbial biomarkers identification powered by interpretable deep learning

URL: http://arxiv.org/abs/2109.12204v1
Date: Fri, 24 Sep 2021 21:30:10 GMT
Title: MIIDL: a Python package for microbial biomarkers identification powered by interpretable deep learning
Authors: Jian Jiang
Abstract summary: We present MIIDL, a Python package for the identification of microbial biomarkers based on interpretable deep learning. MIIDL innovatively applies convolutional neural networks, a variety of interpretability algorithms and plenty of pre-processing methods to provide a one-stop and robust pipeline for microbial biomarkers identification from high-dimensional and sparse data sets.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Detecting microbial biomarkers used to predict disease phenotypes and clinical outcomes is crucial for early-stage disease screening and diagnosis. Most methods for biomarker identification are linear, which is limiting because biological processes are rarely fully linear. The introduction of machine learning to this field offers a promising solution. However, identifying microbial biomarkers in an interpretable, data-driven and robust manner remains challenging. We present MIIDL, a Python package for the identification of microbial biomarkers based on interpretable deep learning. MIIDL innovatively applies convolutional neural networks, a variety of interpretability algorithms and plenty of pre-processing methods to provide a one-stop and robust pipeline for microbial biomarkers identification from high-dimensional and sparse data sets.

Related papers

Large Language Models for Bioinformatics [58.892165394487414]
This survey focuses on the evolution, classification, and distinguishing features of bioinformatics-specific language models (BioLMs). We explore the wide-ranging applications of BioLMs in critical areas such as disease diagnosis, drug discovery, and vaccine development. We identify key challenges and limitations inherent in BioLMs, including data privacy and security concerns, interpretability issues, biases in training data and model outputs, and domain adaptation complexities.
arXiv Detail & Related papers (2025-01-10T01:43:05Z)
Graph-Based Biomarker Discovery and Interpretation for Alzheimer's Disease [1.859931123372708]
Early diagnosis and discovery of therapeutic drug targets are crucial objectives for the effective management of Alzheimer's Disease (AD). Recent blood tests have shown promise in diagnosing AD and highlighting possible biomarkers that can be used as drug targets for AD management. Here, we introduce BRAIN, a novel machine learning framework to jointly optimize the diagnostic accuracy and biomarker discovery processes.
arXiv Detail & Related papers (2024-11-27T22:45:19Z)
How quantum computing can enhance biomarker discovery for multi-factorial diseases [0.14511217610551727]
Quantum algorithms, particularly in machine learning, are mapped to key applications in biomarker discovery. The opportunities and challenges associated with the algorithms and applications are discussed. An outlook is provided concerning open research challenges.
arXiv Detail & Related papers (2024-11-15T16:50:05Z)
Revolutionizing Biomarker Discovery: Leveraging Generative AI for Bio-Knowledge-Embedded Continuous Space Exploration [20.419747013569268]
We propose a new biomarker identification framework with two important modules: training data preparation and embedding-optimization-generation. The first module uses a multi-agent system to automatically collect pairs of biomarker subsets and their corresponding prediction accuracy as training data. The second module employs an encoder-evaluator-decoder learning paradigm to compress the knowledge of the collected data into a continuous space.
arXiv Detail & Related papers (2024-09-23T23:36:30Z)
MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease. We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks. These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems. Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
scBeacon: single-cell biomarker extraction via identifying paired cell clusters across biological conditions with contrastive siamese networks [0.9591674293850556]
scBeacon is a framework built upon a deep contrastive siamese network. scBeacon adeptly identifies matched cell populations across varied conditions. Comprehensive evaluations validate scBeacon's superiority over existing single-cell differential gene analysis tools.
arXiv Detail & Related papers (2023-11-05T08:27:24Z)
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology. We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective. Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z)
Lymphocyte Classification in Hyperspectral Images of Ovarian Cancer Tissue Biopsy Samples [94.37521840642141]
We present a machine learning pipeline to segment white blood cell pixels in hyperspectral images of biopsy cores. These cells are clinically important for diagnosis, but some prior work has struggled to incorporate them due to difficulty obtaining precise pixel labels.
arXiv Detail & Related papers (2022-03-23T00:58:27Z)
Deep neural networks approach to microbial colony detection -- a comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset. The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z)
Preventing dataset shift from breaking machine-learning biomarkers [0.6138671548064355]
A good biomarker is one that gives reliable detection of the corresponding condition. Biomarkers are often extracted from a cohort that differs from the target population. Such a mismatch, known as a dataset shift, can undermine the application of the biomarker to new individuals.
arXiv Detail & Related papers (2021-07-21T08:54:23Z)
Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We propose a new approach for dealing with high-dimensional binary classification problems that combines ideas from regularization and ensembling. We demonstrate the good performance of our method in terms of prediction accuracy and identification of key biomarkers using several medical datasets involving common diseases such as cancer, multiple sclerosis and psoriasis.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers. We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)