PathAsst: A Generative Foundation AI Assistant Towards Artificial
General Intelligence of Pathology
- URL: http://arxiv.org/abs/2305.15072v2
- Date: Mon, 19 Feb 2024 07:02:15 GMT
- Title: PathAsst: A Generative Foundation AI Assistant Towards Artificial
General Intelligence of Pathology
- Authors: Yuxuan Sun, Chenglu Zhu, Sunyi Zheng, Kai Zhang, Lin Sun, Zhongyi
Shui, Yunlong Zhang, Honglin Li, Lin Yang
- Abstract summary: We present PathAsst, a multimodal generative foundation AI assistant to revolutionize diagnostic and predictive analytics in pathology.
The development of PathAsst involves three pivotal steps: data acquisition, CLIP model adaptation, and the training of PathAsst's multimodal generative capabilities.
The experimental results of PathAsst show the potential of harnessing an AI-powered generative foundation model to improve pathology diagnosis and treatment processes.
- Score: 15.419350834457136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As advances in large language models (LLMs) and multimodal techniques
continue to mature, the development of general-purpose multimodal large
language models (MLLMs) has surged, offering significant applications in
interpreting natural images. However, the field of pathology has largely
remained untapped, particularly in gathering high-quality data and designing
comprehensive model frameworks. To bridge the gap in pathology MLLMs, we
present PathAsst, a multimodal generative foundation AI assistant to
revolutionize diagnostic and predictive analytics in pathology. The development
of PathAsst involves three pivotal steps: data acquisition, CLIP model
adaptation, and the training of PathAsst's multimodal generative capabilities.
Firstly, we collect over 207K high-quality pathology image-text pairs from
authoritative sources. Leveraging the advanced power of ChatGPT, we generate
over 180K instruction-following samples. Furthermore, we devise additional
instruction-following data specifically tailored for invoking eight
pathology-specific sub-models we prepared, allowing PathAsst to effectively
collaborate with these models, enhancing its diagnostic ability. Secondly, by
leveraging the collected data, we construct PathCLIP, a pathology-dedicated
CLIP, to enhance PathAsst's capabilities in interpreting pathology images.
Finally, we integrate PathCLIP with the Vicuna-13b and utilize
pathology-specific instruction-tuning data to enhance the multimodal generation
capacity of PathAsst and bolster its synergistic interactions with sub-models.
The experimental results of PathAsst show the potential of harnessing an
AI-powered generative foundation model to improve pathology diagnosis and
treatment processes.
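The abstract does not spell out how PathCLIP is trained. As a rough illustration only, the sketch below shows the symmetric contrastive objective commonly associated with CLIP-style models, applied to a toy batch of matched image-text embeddings. The function names, the temperature value of 0.07, and the toy vectors are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss.

    image_embs[i] and text_embs[i] are assumed to form a matched pair.
    The temperature is a hypothetical default, not a value from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_transposed)
    )


# Toy 2-D embeddings: pairs that line up versus pairs that are swapped.
matched = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is close to zero when each image embedding aligns with its own caption and large when the captions are swapped, which is the behavior such an objective is meant to reward.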
Related papers
- UNICORN: A Deep Learning Model for Integrating Multi-Stain Data in Histopathology [2.9389205138207277]
UNICORN is a multi-modal transformer capable of processing multi-stain histopathology for atherosclerosis severity class prediction.
The architecture comprises a two-stage, end-to-end trainable model with specialized modules utilizing transformer self-attention blocks.
UNICORN achieved a classification accuracy of 0.67, outperforming other state-of-the-art models.
arXiv Detail & Related papers (2024-09-26T12:13:52Z)
- PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology [7.87900104748629]
We have meticulously compiled a dataset of approximately 45,000 cases, covering over 6 different tasks.
We have fine-tuned multimodal large models, specifically LLaVA, Qwen-VL, InternLM, with this dataset to enhance instruction-based performance.
arXiv Detail & Related papers (2024-08-13T17:05:06Z)
- PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images [13.362177469092963]
We introduce Pathology Weight Averaging (PathoWAve) to tackle the domain shift phenomenon in histopathology image analysis.
Our results on the Camelyon17 WILDS dataset demonstrate PathoWAve's superiority over previously proposed methods.
arXiv Detail & Related papers (2024-06-21T23:25:44Z)
- Knowledge-enhanced Visual-Language Pretraining for Computational Pathology [68.6831438330526]
We consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources.
We curate a pathology knowledge tree that consists of 50,470 informative attributes for 4,718 diseases requiring pathology diagnosis from 32 human tissues.
arXiv Detail & Related papers (2024-04-15T17:11:25Z)
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture.
We conduct multimodal survival analysis on Whole Slide Images and Multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
- Domain-specific optimization and diverse evaluation of self-supervised models for histopathology [9.450129206898115]
Task-specific deep learning models in histopathology offer promising opportunities for improving diagnosis, clinical research, and precision medicine.
We describe the development and evaluation of foundation models for histopathology via self-supervised learning (SSL).
arXiv Detail & Related papers (2023-10-20T03:38:07Z)
- PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z)
- Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges [58.32937972322058]
"Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image (MedAI 2021)" competitions.
We present a comprehensive summary and analyze each contribution, highlight the strengths of the best-performing methods, and discuss the prospects for translating such methods into clinical practice.
arXiv Detail & Related papers (2023-07-30T16:08:45Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- Learning Binary Semantic Embedding for Histology Image Classification and Retrieval [56.34863511025423]
We propose a novel method for Learning Binary Semantic Embedding (LBSE).
Based on the efficient and effective embedding, classification and retrieval are performed to provide interpretable computer-assisted diagnosis for histology images.
Experiments conducted on three benchmark datasets validate the superiority of LBSE under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:36:44Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)