Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping
- URL: http://arxiv.org/abs/2403.11073v1
- Date: Sun, 17 Mar 2024 03:38:50 GMT
- Title: Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping
- Authors: Haoxi Zhang, Xinxu Zhang, Yuanxin Lin, Maiqi Wang, Yi Lai, Yu Wang, Linfeng Yu, Yufeng Xu, Ran Cheng, Edward Szczerbicki
- Abstract summary: Tokensome is a novel vision-language model based on chromosome tokenization for explainable and cognitive karyotyping.
Tokensome elevates the method from the conventional visual perception layer to the cognitive decision-making layer.
- Score: 7.101960527853822
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic karyotype analysis is often defined as a visual perception task focused solely on chromosomal object-level modeling. This definition has led most existing methods to overlook componential and holistic information, significantly constraining model performance. Moreover, the lack of interpretability in current technologies hinders clinical adoption. In this paper, we introduce Tokensome, a novel vision-language model based on chromosome tokenization for explainable and cognitive karyotyping. Tokensome elevates the method from the conventional visual perception layer to the cognitive decision-making layer. This elevation enables the integration of domain knowledge and cognitive reasoning via knowledge graphs and LLMs, markedly enhancing the model's explainability and facilitating abnormality detection.
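For intuition, here is a minimal, hypothetical PyTorch sketch of the chromosome-tokenization idea the abstract describes; the encoder architecture, the `kg_facts` string, and the LLM hand-off are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch: one embedding ("chromosome token") per segmented chromosome.
import torch
import torch.nn as nn

class ChromosomeEncoder(nn.Module):
    """Maps a cropped chromosome image to a single 'chromosome token'."""
    def __init__(self, dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, crops):        # crops: (N, 1, H, W)
        return self.backbone(crops)  # (N, dim) -- one token per chromosome

encoder = ChromosomeEncoder()
crops = torch.randn(46, 1, 96, 96)   # 46 segmented chromosome crops
tokens = encoder(crops)              # sequence of chromosome tokens

# The token sequence, plus knowledge-graph facts serialized as text, would then
# condition an LLM that emits the karyotype and its rationale (assumed step).
kg_facts = "chr21 trisomy -> Down syndrome; ..."  # illustrative placeholder
print(tokens.shape)  # torch.Size([46, 256])
```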
Related papers
- Analyzing the Effect of $k$-Space Features in MRI Classification Models [0.0]
We have developed an explainable AI methodology tailored for medical imaging.
We employ a Convolutional Neural Network (CNN) that analyzes MRI scans across both image and frequency domains.
This approach not only enhances early training efficiency but also deepens our understanding of how additional features impact the model predictions.
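As a hedged illustration of the dual-domain idea (not the paper's code), a CNN can consume both the MRI slice and its log-magnitude k-space computed with an FFT:

```python
# Minimal sketch of a dual-domain (image + k-space) CNN classifier.
import torch
import torch.nn as nn

class DualDomainCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.image_branch = branch()
        self.kspace_branch = branch()
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                       # x: (B, 1, H, W) MRI slice
        kspace = torch.fft.fft2(x)
        k_mag = torch.log1p(torch.abs(kspace))  # log-magnitude spectrum
        feats = torch.cat([self.image_branch(x),
                           self.kspace_branch(k_mag)], dim=1)
        return self.head(feats)

model = DualDomainCNN()
logits = model(torch.randn(4, 1, 128, 128))
print(logits.shape)  # torch.Size([4, 2])
```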
arXiv Detail & Related papers (2024-09-20T15:43:26Z)
- Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification [8.382606243533942]
We introduce a simple yet effective framework, Explicd, towards Explainable language-informed criteria-based diagnosis.
By leveraging a pretrained vision-language model, Explicd injects these criteria into the embedding space as knowledge anchors.
The final diagnostic outcome is determined based on the similarity scores between the encoded visual concepts and the textual criteria embeddings.
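A minimal sketch, assuming CLIP-style embeddings, of how similarity between encoded visual concepts and textual criteria embeddings can yield a diagnostic score; the function name and tensor shapes are illustrative:

```python
# Score visual concepts against textual diagnostic criteria ("knowledge anchors").
import torch
import torch.nn.functional as F

def diagnose(visual_concepts, criteria_embeddings):
    """visual_concepts: (C, D) encoded concepts for one image.
    criteria_embeddings: (K, D) text embeddings of diagnostic criteria.
    Returns one score per criterion via cosine similarity."""
    v = F.normalize(visual_concepts, dim=-1)
    t = F.normalize(criteria_embeddings, dim=-1)
    sim = v @ t.T                 # (C, K) cosine similarities
    return sim.max(dim=0).values  # best-matching concept per criterion

vis = torch.randn(8, 512)   # 8 visual concepts from a VLM encoder (assumed)
crit = torch.randn(3, 512)  # 3 textual criteria, e.g. one per class
print(diagnose(vis, crit))  # higher score = criterion better met
```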
arXiv Detail & Related papers (2024-06-08T23:23:28Z)
- Knowledge-enhanced Visual-Language Pretraining for Computational Pathology [68.6831438330526]
We consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources.
We curate a pathology knowledge tree that consists of 50,470 informative attributes for 4,718 diseases requiring pathology diagnosis from 32 human tissues.
arXiv Detail & Related papers (2024-04-15T17:11:25Z)
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
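The global image-text alignment that frameworks like MLIP build on is typically a symmetric InfoNCE objective; the sketch below is a generic CLIP-style version under that assumption, not MLIP's exact loss:

```python
# Generic symmetric image-text contrastive (InfoNCE) loss.
import torch
import torch.nn.functional as F

def info_nce(img_emb, txt_emb, temperature=0.07):
    """img_emb, txt_emb: (B, D) paired embeddings; matched pairs share an index."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.T / temperature  # (B, B) similarity matrix
    targets = torch.arange(img.size(0))
    # Symmetric loss: image->text and text->image directions
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

loss = info_nce(torch.randn(16, 256), torch.randn(16, 256))
print(loss.item())
```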
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Multi-modal vision-language model for generalizable annotation-free pathology localization and clinical diagnosis [18.958171123895866]
Defining pathologies automatically from medical images aids the understanding of the emergence and progression of diseases.
Existing deep learning models heavily rely on expert annotations and lack generalization capabilities in open clinical environments.
We present AFLoc, a vision-language model for annotation-free pathology localization.
We demonstrate that AFLoc surpasses state-of-the-art methods in pathology localization and classification, and even outperforms the human benchmark in locating 5 different pathologies.
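A hedged sketch of how annotation-free localization can work, assuming a shared image-text embedding space: each image patch is scored against a pathology's text embedding, and the similarities form a coarse heatmap (shapes and names are illustrative, not AFLoc's actual code):

```python
# Patch-vs-text similarity as a coarse localization heatmap.
import torch
import torch.nn.functional as F

def localize(patch_emb, text_emb, grid=(14, 14)):
    """patch_emb: (P, D) patch embeddings from a vision encoder;
    text_emb: (D,) embedding of a pathology phrase, e.g. 'pleural effusion'."""
    p = F.normalize(patch_emb, dim=-1)
    t = F.normalize(text_emb, dim=0)
    sim = p @ t              # (P,) similarity per patch
    return sim.reshape(grid)  # coarse localization heatmap

heat = localize(torch.randn(196, 512), torch.randn(512))
print(heat.shape)  # torch.Size([14, 14])
```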
arXiv Detail & Related papers (2024-01-04T03:09:39Z)
- Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
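A minimal concept-bottleneck sketch under stated assumptions: concept scores come from image-concept similarity in a shared VLM embedding space, and an interpretable linear head maps them to classes; the concept strings are placeholders, not GPT-4 output:

```python
# Concept bottleneck: explicit concept scores -> linear, inspectable classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

concepts = ["irregular border", "asymmetry", "blue-white veil"]  # hypothetical

class ConceptBottleneckClassifier(nn.Module):
    def __init__(self, n_concepts, n_classes):
        super().__init__()
        self.head = nn.Linear(n_concepts, n_classes)  # interpretable weights

    def forward(self, img_emb, concept_emb):
        img = F.normalize(img_emb, dim=-1)
        con = F.normalize(concept_emb, dim=-1)
        scores = img @ con.T  # (B, n_concepts) explicit concept activations
        return self.head(scores), scores

model = ConceptBottleneckClassifier(len(concepts), n_classes=2)
logits, scores = model(torch.randn(4, 512), torch.randn(len(concepts), 512))
print(scores.shape, logits.shape)  # concept scores are the explanation
```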
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
- A Novel Neural-symbolic System under Statistical Relational Learning [50.747658038910565]
We propose a general bi-level probabilistic graphical reasoning framework called GBPGR.
In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models.
Our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.
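As a rough illustration of the bi-level idea (symbolic reasoning refining neural predictions), the sketch below down-weights predictions that violate a hand-written implication rule; the rule and thresholds are invented for illustration and are not GBPGR's actual machinery:

```python
# Symbolic post-hoc correction of neural class probabilities.
import torch

def symbolic_refine(probs, implications):
    """probs: (B, C) neural class probabilities.
    implications: list of (premise, conclusion) class-index pairs encoding
    rules like 'if class p is predicted, class c must be plausible too'."""
    probs = probs.clone()
    for p_idx, c_idx in implications:
        violated = (probs[:, p_idx] > 0.5) & (probs[:, c_idx] < 0.1)
        probs[violated, p_idx] *= 0.5  # soften predictions breaking the rule
    return probs / probs.sum(dim=1, keepdim=True)  # renormalize rows

refined = symbolic_refine(torch.softmax(torch.randn(4, 5), dim=1), [(0, 1)])
print(refined.sum(dim=1))  # each row still sums to 1
```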
arXiv Detail & Related papers (2023-09-16T09:15:37Z)
- NeuroExplainer: Fine-Grained Attention Decoding to Uncover Cortical Development Patterns of Preterm Infants [73.85768093666582]
We propose an explainable geometric deep network dubbed NeuroExplainer.
NeuroExplainer is used to uncover altered infant cortical development patterns associated with preterm birth.
arXiv Detail & Related papers (2023-01-01T12:48:12Z)
- Deep Learning Generates Synthetic Cancer Histology for Explainability and Education [37.13457398561086]
Conditional generative adversarial networks (cGANs) are AI models that generate synthetic images.
We describe the use of a cGAN for explaining models trained to classify molecularly-subtyped tumors.
We show that clear, intuitive cGAN visualizations can reinforce and improve human understanding of histologic manifestations of tumor biology.
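For reference, a minimal class-conditional generator in the cGAN style the summary describes; the architecture and sizes are illustrative assumptions, not the paper's model:

```python
# Class-conditional generator: noise + label embedding -> synthetic image.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, z_dim=64, n_classes=2, img_size=32):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)  # class conditioning
        self.net = nn.Sequential(
            nn.Linear(2 * z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_size * img_size), nn.Tanh(),
        )
        self.img_size = img_size

    def forward(self, z, labels):
        h = torch.cat([z, self.embed(labels)], dim=1)
        return self.net(h).view(-1, 1, self.img_size, self.img_size)

gen = ConditionalGenerator()
fake = gen(torch.randn(8, 64), torch.randint(0, 2, (8,)))
print(fake.shape)  # torch.Size([8, 1, 32, 32]) -- one image per class label
```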
arXiv Detail & Related papers (2022-11-12T00:14:57Z)
- Biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data [0.0]
We provide a structured literature analysis focused on Deep Learning (DL) models used to support inference in cancer biology.
The work focuses on how existing models address the need for better dialogue with prior knowledge, biological plausibility and interpretability.
arXiv Detail & Related papers (2022-07-02T12:11:35Z)
- Deep Collaborative Multi-Modal Learning for Unsupervised Kinship Estimation [53.62256887837659]
Kinship verification is a long-standing research challenge in computer vision.
We propose a novel deep collaborative multi-modal learning (DCML) to integrate the underlying information presented in facial properties.
Our DCML method consistently outperforms several state-of-the-art kinship verification methods.
arXiv Detail & Related papers (2021-09-07T01:34:51Z)