Prompting Whole Slide Image Based Genetic Biomarker Prediction
- URL: http://arxiv.org/abs/2407.09540v1
- Date: Wed, 26 Jun 2024 11:05:46 GMT
- Title: Prompting Whole Slide Image Based Genetic Biomarker Prediction
- Authors: Ling Zhang, Boxiang Yun, Xingran Xie, Qingli Li, Xinxing Li, Yan Wang,
- Abstract summary: We propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques.
We leverage large language models to generate medical prompts that serve as prior knowledge in extracting instances associated with genetic biomarkers.
We adopt a coarse-to-fine approach to mine biomarker information within the tumor microenvironment.
- Score: 13.764676578911526
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prediction of genetic biomarkers, e.g., microsatellite instability and BRAF in colorectal cancer is crucial for clinical decision making. In this paper, we propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques. Our work aims at addressing the following challenges: (1) extracting foreground instances related to genetic biomarkers from gigapixel WSIs, and (2) the interaction among the fine-grained pathological components in WSIs.Specifically, we leverage large language models to generate medical prompts that serve as prior knowledge in extracting instances associated with genetic biomarkers. We adopt a coarse-to-fine approach to mine biomarker information within the tumor microenvironment. This involves extracting instances related to genetic biomarkers using coarse medical prior knowledge, grouping pathology instances into fine-grained pathological components and mining their interactions. Experimental results on two colorectal cancer datasets show the superiority of our method, achieving 91.49% in AUC for MSI classification. The analysis further shows the clinical interpretability of our method. Code is publicly available at https://github.com/DeepMed-Lab-ECNU/PromptBio.
Related papers
- A Bioinformatic Approach Validated Utilizing Machine Learning Algorithms to Identify Relevant Biomarkers and Crucial Pathways in Gallbladder Cancer [2.3087284629747766]
Gallbladder cancer (GBC) is the most frequent cause of disease among biliary tract neoplasms.
Few recent studies have explored the roles of biomarkers in GBC.
We used machine learning (ML) and bioinformatics techniques to identify biomarkers in GBC.
arXiv Detail & Related papers (2024-10-18T12:51:19Z) - Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z) - Genetic InfoMax: Exploring Mutual Information Maximization in
High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z) - Applying BioBERT to Extract Germline Gene-Disease Associations for Building a Knowledge Graph from the Biomedical Literature [0.0]
This paper presents SimpleGermKG, an automatic knowledge graph construction approach that connects germline genes and diseases.
For the extraction of genes and diseases, we employ BioBERT, a pre-trained BERT model on biomedical corpora.
For semantic relationships between articles, genes, and diseases, we implemented a part-whole relation approach.
Our knowledge graph contains 297 genes, 130 diseases, and 46,747 triples.
arXiv Detail & Related papers (2023-09-11T18:05:12Z) - Studying Limits of Explainability by Integrated Gradients for Gene
Expression Models [3.220287168504093]
We show that ranking features by importance is not enough to robustly identify biomarkers.
As it is difficult to evaluate whether biomarkers reflect relevant causes without known ground truth, we simulate gene expression data by proposing a hierarchical model.
arXiv Detail & Related papers (2023-03-19T19:54:15Z) - Machine Learning Methods for Cancer Classification Using Gene Expression
Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z) - BioGPT: Generative Pre-trained Transformer for Biomedical Text
Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
We get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
arXiv Detail & Related papers (2022-10-19T07:17:39Z) - Biomarker Gene Identification for Breast Cancer Classification [2.403531305046943]
The present work uses interpretable predictions made by the deep neural network employed for subtype classification to identify biomarkers.
The proposed algorithm led to the discovery of a set of 43 differentially expressed gene signatures.
arXiv Detail & Related papers (2021-11-10T06:38:50Z) - Exploring Genetic-histologic Relationships in Breast Cancer [28.91314299138311]
This work uses deep learning to predict genomic biomarkers from breast cancer histopathology images.
We outperform the existing works with a minimum improvement of 0.02 and a maximum of 0.13 AUROC scores across all tasks.
arXiv Detail & Related papers (2021-03-15T00:53:47Z) - Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.