Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation
- URL: http://arxiv.org/abs/2306.02416v3
- Date: Sun, 7 Apr 2024 03:53:33 GMT
- Title: Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation
- Authors: Yunhe Gao, Zhuowei Li, Di Liu, Mu Zhou, Shaoting Zhang, Dimitris N. Metaxas,
- Abstract summary: We propose a shift towards universal medical image segmentation, a paradigm aiming to build medical image understanding foundation models.
We develop Hermes, a novel context-prior learning approach to address the challenges of data heterogeneity and annotation differences in medical image segmentation.
- Score: 38.61227663176952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major focus of clinical imaging workflow is disease diagnosis and management, leading to medical imaging datasets strongly tied to specific clinical objectives. This scenario has led to the prevailing practice of developing task-specific segmentation models, without gaining insights from widespread imaging cohorts. Inspired by the training program of medical radiology residents, we propose a shift towards universal medical image segmentation, a paradigm aiming to build medical image understanding foundation models by leveraging the diversity and commonality across clinical targets, body regions, and imaging modalities. Towards this goal, we develop Hermes, a novel context-prior learning approach to address the challenges of data heterogeneity and annotation differences in medical image segmentation. In a large collection of eleven diverse datasets (2,438 3D images) across five modalities (CT, PET, T1, T2 and cine MRI) and multiple body regions, we demonstrate the merit of the universal paradigm over the traditional paradigm on addressing multiple tasks within a single model. By exploiting the synergy across tasks, Hermes achieves state-of-the-art performance on all testing datasets and shows superior model scalability. Results on two additional datasets reveals Hermes' strong performance for transfer learning, incremental learning, and generalization to downstream tasks. Hermes's learned priors demonstrate an appealing trait to reflect the intricate relations among tasks and modalities, which aligns with the established anatomical and imaging principles in radiology. The code is available: https://github.com/yhygao/universal-medical-image-segmentation.
Related papers
- Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework.
Our approach various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
arXiv Detail & Related papers (2024-09-13T10:19:10Z) - Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? [10.20366295974822]
We introduce a novel decode head architecture, HQHSAM, which simply integrates elements from two state-of-the-art decoder heads, HSAM and HQSAM, to enhance segmentation performance.
Our experiments on multiple datasets, encompassing various anatomies and modalities, reveal that FMs, particularly with the HQHSAM decode head, improve domain generalization for medical image segmentation.
arXiv Detail & Related papers (2024-09-12T11:41:35Z) - MOSMOS: Multi-organ segmentation facilitated by medical report supervision [10.396987980136602]
We propose a novel pre-training & fine-tuning framework for Multi-Organ Supervision (MOS)
Specifically, we first introduce global contrastive learning to align medical image-report pairs in the pre-training stage.
To remedy the discrepancy, we further leverage multi-label recognition to implicitly learn the semantic correspondence between image pixels and organ tags.
arXiv Detail & Related papers (2024-09-04T03:46:17Z) - Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR)
arXiv Detail & Related papers (2024-05-30T03:15:09Z) - Unified Medical Image Pre-training in Language-Guided Common Semantic Space [39.61770813855078]
We propose an Unified Medical Image Pre-training framework, namely UniMedI.
UniMedI uses diagnostic reports as common semantic space to create unified representations for diverse modalities of medical images.
We evaluate its performance on both 2D and 3D images across 10 different datasets.
arXiv Detail & Related papers (2023-11-24T22:01:12Z) - From CNN to Transformer: A Review of Medical Image Segmentation Models [7.3150850275578145]
Deep learning for medical image segmentation has become a prevalent trend.
In this paper, we conduct a survey of the most representative four medical image segmentation models in recent years.
We theoretically analyze the characteristics of these models and quantitatively evaluate their performance on two benchmark datasets.
arXiv Detail & Related papers (2023-08-10T02:48:57Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels [54.58539616385138]
We introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA)
First, prior work argues that every pixel equally matters to the model training; we observe empirically that this alone is unlikely to define meaningful anatomical features.
Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features.
arXiv Detail & Related papers (2022-09-27T15:50:31Z) - Generalizable multi-task, multi-domain deep segmentation of sparse
pediatric imaging datasets via multi-scale contrastive regularization and
multi-joint anatomical priors [0.41998444721319217]
We propose to design a novel multi-task, multi-domain learning framework in which a single segmentation network is optimized over multiple datasets.
We evaluate our contributions for performing bone segmentation using three scarce and pediatric imaging datasets of the ankle, knee, and shoulder joints.
arXiv Detail & Related papers (2022-07-27T12:59:16Z) - Few-shot Medical Image Segmentation using a Global Correlation Network
with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.