Related papers: Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation

URL: http://arxiv.org/abs/2306.02416v3
Date: Sun, 7 Apr 2024 03:53:33 GMT
Title: Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation
Authors: Yunhe Gao, Zhuowei Li, Di Liu, Mu Zhou, Shaoting Zhang, Dimitris N. Metaxas
Abstract summary: We propose a shift towards universal medical image segmentation, a paradigm aiming to build medical image understanding foundation models. We develop Hermes, a novel context-prior learning approach to address the challenges of data heterogeneity and annotation differences in medical image segmentation.
Score: 38.61227663176952
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A major focus of clinical imaging workflow is disease diagnosis and management, leading to medical imaging datasets strongly tied to specific clinical objectives. This scenario has led to the prevailing practice of developing task-specific segmentation models, without gaining insights from widespread imaging cohorts. Inspired by the training program of medical radiology residents, we propose a shift towards universal medical image segmentation, a paradigm aiming to build medical image understanding foundation models by leveraging the diversity and commonality across clinical targets, body regions, and imaging modalities. Towards this goal, we develop Hermes, a novel context-prior learning approach to address the challenges of data heterogeneity and annotation differences in medical image segmentation. In a large collection of eleven diverse datasets (2,438 3D images) across five modalities (CT, PET, T1, T2 and cine MRI) and multiple body regions, we demonstrate the merit of the universal paradigm over the traditional paradigm on addressing multiple tasks within a single model. By exploiting the synergy across tasks, Hermes achieves state-of-the-art performance on all testing datasets and shows superior model scalability. Results on two additional datasets reveal Hermes' strong performance for transfer learning, incremental learning, and generalization to downstream tasks. Hermes' learned priors demonstrate an appealing trait to reflect the intricate relations among tasks and modalities, which aligns with the established anatomical and imaging principles in radiology. The code is available at https://github.com/yhygao/universal-medical-image-segmentation.

Related papers

Medverse: A Universal Model for Full-Resolution 3D Medical Image Segmentation, Transformation and Enhancement [15.28003304776022]
In-context learning offers a promising paradigm for universal medical image analysis. We present Medverse, a universal ICL model for 3D medical imaging trained on 22 datasets. Medverse employs a next-scale autoregressive in-context learning framework that progressively refines predictions from coarse to fine.
arXiv Detail & Related papers (2025-09-11T08:10:49Z)
Multimodal Causal-Driven Representation Learning for Generalizable Medical Image Segmentation [56.52520416420957]
We propose Multimodal Causal-Driven Representation Learning (MCDRL) to tackle domain generalization in medical image segmentation. MCDRL consistently outperforms competing methods, yielding superior segmentation accuracy and exhibiting robust generalizability.
arXiv Detail & Related papers (2025-08-07T03:41:41Z)
MedFormer: Hierarchical Medical Vision Transformer with Content-Aware Dual Sparse Selection Attention [1.474723404975345]
We present MedFormer, an efficient medical vision transformer with two key ideas. First, it employs a pyramid scaling structure as a versatile backbone for various medical image recognition tasks. Second, it introduces a novel Dual Sparse Selection Attention (DSSA) with content awareness to improve computational efficiency.
arXiv Detail & Related papers (2025-07-03T09:51:45Z)
iMedImage Technical Report [5.0953390013898705]
Chromosome karyotype analysis is crucial for diagnosing hereditary diseases, yet detecting structural abnormalities remains challenging. We developed iMedImage, an end-to-end model for general medical image recognition, demonstrating strong performance across multiple imaging tasks.
arXiv Detail & Related papers (2025-03-27T03:25:28Z)
Dynamically evolving segment anything model with continuous learning for medical image segmentation [50.92344083895528]
We introduce EvoSAM, a dynamically evolving medical image segmentation model. EvoSAM continuously accumulates new knowledge from an ever-expanding array of scenarios and tasks. Experiments conducted by surgical clinicians on blood vessel segmentation confirm that EvoSAM enhances segmentation efficiency based on user prompts.
arXiv Detail & Related papers (2025-03-08T14:37:52Z)
Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework. Our approach organizes various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
arXiv Detail & Related papers (2024-09-13T10:19:10Z)
Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? [10.20366295974822]
We introduce a novel decode head architecture, HQHSAM, which simply integrates elements from two state-of-the-art decoder heads, HSAM and HQSAM, to enhance segmentation performance. Our experiments on multiple datasets, encompassing various anatomies and modalities, reveal that FMs, particularly with the HQHSAM decode head, improve domain generalization for medical image segmentation.
arXiv Detail & Related papers (2024-09-12T11:41:35Z)
MOSMOS: Multi-organ segmentation facilitated by medical report supervision [10.396987980136602]
We propose a novel pre-training & fine-tuning framework for Multi-Organ Supervision (MOS). Specifically, we first introduce global contrastive learning to align medical image-report pairs in the pre-training stage. To remedy the discrepancy, we further leverage multi-label recognition to implicitly learn the semantic correspondence between image pixels and organ tags.
arXiv Detail & Related papers (2024-09-04T03:46:17Z)
Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling. For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views. For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
Unified Medical Image Pre-training in Language-Guided Common Semantic Space [39.61770813855078]
We propose a Unified Medical Image Pre-training framework, namely UniMedI. UniMedI uses diagnostic reports as a common semantic space to create unified representations for diverse modalities of medical images. We evaluate its performance on both 2D and 3D images across 10 different datasets.
arXiv Detail & Related papers (2023-11-24T22:01:12Z)
From CNN to Transformer: A Review of Medical Image Segmentation Models [7.3150850275578145]
Deep learning for medical image segmentation has become a prevalent trend. In this paper, we conduct a survey of the four most representative medical image segmentation models in recent years. We theoretically analyze the characteristics of these models and quantitatively evaluate their performance on two benchmark datasets.
arXiv Detail & Related papers (2023-08-10T02:48:57Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels [54.58539616385138]
We introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA). First, prior work argues that every pixel equally matters to the model training; we observe empirically that this alone is unlikely to define meaningful anatomical features. Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features.
arXiv Detail & Related papers (2022-09-27T15:50:31Z)
Generalizable multi-task, multi-domain deep segmentation of sparse pediatric imaging datasets via multi-scale contrastive regularization and multi-joint anatomical priors [0.41998444721319217]
We propose to design a novel multi-task, multi-domain learning framework in which a single segmentation network is optimized over multiple datasets. We evaluate our contributions for performing bone segmentation using three scarce and pediatric imaging datasets of the ankle, knee, and shoulder joints.
arXiv Detail & Related papers (2022-07-27T12:59:16Z)
Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation. We construct our few-shot image segmentor using a deep convolutional network trained episodically. We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.