BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once
- URL: http://arxiv.org/abs/2405.12971v3
- Date: Tue, 4 Jun 2024 18:16:52 GMT
- Title: BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once
- Authors: Theodore Zhao, Yu Gu, Jianwei Yang, Naoto Usuyama, Ho Hin Lee, Tristan Naumann, Jianfeng Gao, Angela Crabtree, Jacob Abel, Christine Moung-Wen, Brian Piening, Carlo Bifulco, Mu Wei, Hoifung Poon, Sheng Wang
- Abstract summary: Holistic image analysis comprises subtasks such as segmentation, detection, and recognition of relevant objects.
Here, we propose BiomedParse, a biomedical foundation model for image parsing that can jointly conduct segmentation, detection, and recognition for 82 object types across 9 imaging modalities.
Through joint learning, we can improve accuracy for individual tasks and enable novel applications such as segmenting all relevant objects in an image through a text prompt.
- Score: 58.41069132627823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biomedical image analysis is fundamental for biomedical discovery in cell biology, pathology, radiology, and many other biomedical domains. Holistic image analysis comprises interdependent subtasks such as segmentation, detection, and recognition of relevant objects. Here, we propose BiomedParse, a biomedical foundation model for image parsing that can jointly conduct segmentation, detection, and recognition for 82 object types across 9 imaging modalities. Through joint learning, we can improve accuracy for individual tasks and enable novel applications such as segmenting all relevant objects in an image through a text prompt, rather than requiring users to laboriously specify the bounding box for each object. We leveraged readily available natural-language labels or descriptions accompanying existing datasets and used GPT-4 to harmonize the noisy, unstructured text information with established biomedical object ontologies. We created a large dataset comprising over six million triples of image, segmentation mask, and textual description. On image segmentation, we showed that BiomedParse is broadly applicable, outperforming state-of-the-art methods on 102,855 test image-mask-label triples across 9 imaging modalities (everything). On object detection, which aims to locate a specific object of interest, BiomedParse again attained state-of-the-art performance, especially on objects with irregular shapes (everywhere). On object recognition, which aims to identify all objects in a given image along with their semantic types, we showed that BiomedParse can simultaneously segment and label all biomedical objects in an image (all at once). In summary, BiomedParse is an all-in-one tool for biomedical image analysis that jointly solves segmentation, detection, and recognition for all major biomedical image modalities, paving the path for efficient and accurate image-based biomedical discovery.
Related papers
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- Integrating curation into scientific publishing to train AI models [1.6982459897303823]
We have embedded multimodal data curation into the academic publishing process to annotate segmented figure panels and captions.
The dataset, SourceData-NLP, contains more than 620,000 annotated biomedical entities.
We evaluate the utility of the dataset to train AI models using named-entity recognition, segmentation of figure captions into their constituent panels, and a novel context-dependent semantic task.
arXiv Detail & Related papers (2023-10-31T13:22:38Z)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z)
- BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs [48.376109878173956]
We present PMC-15M, a novel dataset that is two orders of magnitude larger than existing biomedical multimodal datasets.
PMC-15M contains 15 million biomedical image-text pairs collected from 4.4 million scientific articles.
Based on PMC-15M, we have pretrained BiomedCLIP, a multimodal foundation model, with domain-specific adaptations tailored to biomedical vision-language processing.
arXiv Detail & Related papers (2023-03-02T02:20:04Z)
- BioFors: A Large Biomedical Image Forensics Dataset [22.32517325828983]
We present BioFors -- the first dataset for benchmarking common biomedical image manipulations.
BioFors comprises 47,805 images extracted from 1,031 open-source research papers.
We benchmark BioFors on all tasks with suitable state-of-the-art algorithms.
arXiv Detail & Related papers (2021-08-30T02:39:13Z)
- Hierarchical Semantic Segmentation using Psychometric Learning [17.417302703539367]
We develop a novel approach to collect segmentation annotations from experts based on psychometric testing.
Our method consists of the psychometric testing procedure, active query selection, query enhancement, and a deep metric learning model.
We show the merits of our method with evaluations on synthetically generated, aerial, and histology images.
arXiv Detail & Related papers (2021-07-07T13:38:33Z)
- Common Limitations of Image Processing Metrics: A Picture Story [58.83274952067888]
This document focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks.
The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
arXiv Detail & Related papers (2021-04-12T17:03:42Z)
- Semantic Segmentation of highly class imbalanced fully labelled 3D volumetric biomedical images and unsupervised Domain Adaptation of the pre-trained Segmentation Network to segment another fully unlabelled Biomedical 3D Image stack [16.698880511349493]
We consider two cases where one dataset is fully labeled and the other dataset is assumed to be fully unlabelled.
We first perform semantic segmentation on the fully labeled isotropic biomedical source data (FIBSEM) and try to incorporate the trained model for segmenting the target unlabelled dataset (SNEMI3D).
arXiv Detail & Related papers (2020-03-13T06:01:18Z)
- Panoptic Feature Fusion Net: A Novel Instance Segmentation Paradigm for Biomedical and Biological Images [91.41909587856104]
We present a Panoptic Feature Fusion Net (PFFNet) that unifies the semantic and instance features in this work.
Our proposed PFFNet contains a residual attention feature fusion mechanism to incorporate the instance prediction with the semantic features.
It outperforms several state-of-the-art methods on various biomedical and biological datasets.
arXiv Detail & Related papers (2020-02-15T09:19:41Z)
- Deep Semantic Segmentation of Natural and Medical Images: A Review [17.620924936500725]
The semantic image segmentation task consists of classifying each pixel of an image into an instance, where each instance corresponds to a class.
In the medical image analysis domain, image segmentation can be used for image-guided interventions, radiotherapy, or improved radiological diagnostics.
In this review, we categorize the leading deep learning-based medical and non-medical image segmentation solutions into six main groups of deep architectural, data synthesis-based, loss function-based, sequenced models, weakly supervised, and multi-task methods.
arXiv Detail & Related papers (2019-10-16T06:35:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.