Leveraging Textual Anatomical Knowledge for Class-Imbalanced Semi-Supervised Multi-Organ Segmentation
- URL: http://arxiv.org/abs/2501.13470v1
- Date: Thu, 23 Jan 2025 08:40:54 GMT
- Title: Leveraging Textual Anatomical Knowledge for Class-Imbalanced Semi-Supervised Multi-Organ Segmentation
- Authors: Yuliang Gu, Weilun Tsao, Bo Du, Thierry Géraud, Yongchao Xu
- Abstract summary: Annotating 3D medical images demands substantial time and expertise.
The complex anatomical structures of organs often lead to significant class imbalances.
We propose a novel approach that integrates textual anatomical knowledge (TAK) into the segmentation model.
- Score: 29.70206595766246
- Abstract: Annotating 3D medical images demands substantial time and expertise, driving the adoption of semi-supervised learning (SSL) for segmentation tasks. However, the complex anatomical structures of organs often lead to significant class imbalances, posing major challenges for deploying SSL in real-world scenarios. Despite the availability of valuable prior information, such as inter-organ relative positions and organ shape priors, existing SSL methods have yet to fully leverage these insights. To address this gap, we propose a novel approach that integrates textual anatomical knowledge (TAK) into the segmentation model. Specifically, we use GPT-4o to generate textual descriptions of anatomical priors, which are then encoded using a CLIP-based model. These encoded priors are injected into the segmentation model as parameters of the segmentation head. Additionally, contrastive learning is employed to enhance the alignment between textual priors and visual features. Extensive experiments demonstrate the superior performance of our method, significantly surpassing state-of-the-art approaches. The source code will be available at: https://github.com/Lunn88/TAK-Semi.
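The pipeline described above (GPT-4o-generated anatomical descriptions, encoded with a CLIP-based text model, injected as the parameters of the segmentation head, and aligned with visual features via contrastive learning) can be illustrated with a minimal sketch. This is not the released implementation (see the repository linked above): the prompts, the 512-dimensional feature width, the cosine-similarity head, and the InfoNCE-style loss are all assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import CLIPTokenizer, CLIPModel

# Hypothetical organ descriptions; in the paper these priors are written by GPT-4o.
organ_prompts = [
    "The liver is a large organ in the right upper abdomen, below the diaphragm.",
    "The pancreas is an elongated gland lying transversely behind the stomach.",
]

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

with torch.no_grad():
    tokens = tokenizer(organ_prompts, padding=True, return_tensors="pt")
    text_emb = F.normalize(clip.get_text_features(**tokens), dim=-1)  # (K, 512)

# Text-parameterized segmentation head: the encoded priors act as the weights
# of a 1x1x1 convolution over voxel features (the channel width must match
# CLIP's 512-d text space; a projection layer would handle other widths).
feat = torch.randn(1, 512, 8, 16, 16)  # (B, D, Z, H, W) stand-in visual features
logits = torch.einsum("bdzhw,kd->bkzhw", F.normalize(feat, dim=1), text_emb)

# Contrastive alignment (InfoNCE-style, assumed form): pull per-class pooled
# visual features toward the matching text prior, push away from the others.
pooled = F.normalize(torch.randn(2, 512), dim=-1)  # stand-in per-class features
sim = pooled @ text_emb.t() / 0.07                 # temperature is assumed
align_loss = F.cross_entropy(sim, torch.arange(2))
print(logits.shape, align_loss.item())
```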
Related papers
- Fake It Till You Make It: Using Synthetic Data and Domain Knowledge for Improved Text-Based Learning for LGE Detection [11.532639713283226]
We use strategies rooted in domain knowledge to train a model for LGE detection using text from clinical reports.
We standardize the orientation of the images in an anatomy-informed way to enable better alignment of spatial and text features.
Ablation studies are carried out to elucidate the contributions of each design component to the overall performance of the model.
arXiv Detail & Related papers (2025-02-18T15:30:48Z) - Discriminative Image Generation with Diffusion Models for Zero-Shot Learning [53.44301001173801]
We present DIG-ZSL, a novel Discriminative Image Generation framework for Zero-Shot Learning.
We learn a discriminative class token (DCT) for each unseen class under the guidance of a pre-trained category discrimination model (CDM).
Extensive experiments and visualizations on four datasets show that DIG-ZSL: (1) generates diverse and high-quality images, (2) outperforms previous state-of-the-art nonhuman-annotated semantic prototype-based methods by a large margin, and (3) achieves comparable or better performance than baselines that leverage human-annotated semantic prototypes.
arXiv Detail & Related papers (2024-12-23T02:18:54Z) - Knowledge-Guided Prompt Learning for Lifespan Brain MR Image Segmentation [53.70131202548981]
We present a two-step segmentation framework employing Knowledge-Guided Prompt Learning (KGPL) for brain MRI.
Specifically, we first pre-train segmentation models on large-scale datasets with sub-optimal labels.
The introduction of knowledge-wise prompts captures semantic relationships between anatomical variability and biological processes.
arXiv Detail & Related papers (2024-07-31T04:32:43Z) - MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation [25.74088298769155]
We propose a universal training framework called MedContext for 3D medical segmentation.
Our approach effectively learns self-supervised contextual cues jointly with the supervised voxel segmentation task.
The effectiveness of MedContext is validated across multiple 3D medical datasets and four state-of-the-art model architectures.
arXiv Detail & Related papers (2024-02-27T17:58:05Z) - Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-initialization approach for hybrid medical image segmentation models.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z) - Continual Learning for Abdominal Multi-Organ and Tumor Segmentation [15.983529525062938]
We propose an innovative architecture designed specifically for continual organ and tumor segmentation.
Our design replaces the conventional output layer with a suite of lightweight, class-specific heads.
These heads enable independent predictions for newly introduced and previously learned classes, minimizing the impact of new classes on old ones (see the sketch below).
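A minimal sketch of the class-specific-heads idea, assuming each head is a lightweight 1x1x1 convolution producing an independent binary prediction and that old heads are frozen when new classes arrive; the head design and training policy here are illustrative, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class ClassSpecificHeads(nn.Module):
    """Output layer as a suite of lightweight per-class heads, so new
    classes can be added without disturbing previously learned ones."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.in_channels = in_channels
        self.heads = nn.ModuleList()

    def add_class(self) -> None:
        # One new binary head per newly introduced organ/tumor class.
        self.heads.append(nn.Conv3d(self.in_channels, 1, kernel_size=1))

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # Each head predicts its class independently from shared features.
        return torch.cat([head(feat) for head in self.heads], dim=1)

heads = ClassSpecificHeads(in_channels=64)
for _ in range(3):               # e.g. three organ classes learned so far
    heads.add_class()
for p in heads.parameters():
    p.requires_grad_(False)      # freeze old heads to limit interference
heads.add_class()                # a newly introduced class stays trainable
out = heads(torch.randn(1, 64, 8, 16, 16))
print(out.shape)                 # torch.Size([1, 4, 8, 16, 16])
```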
arXiv Detail & Related papers (2023-06-01T17:59:57Z) - Prompting Language-Informed Distribution for Compositional Zero-Shot Learning [73.49852821602057]
The compositional zero-shot learning (CZSL) task aims to recognize unseen compositional visual concepts.
We propose a model, PLID, that prompts the language-informed distribution for the task.
Experimental results on MIT-States, UT-Zappos, and C-GQA datasets show the superior performance of the PLID to the prior arts.
arXiv Detail & Related papers (2023-05-23T18:00:22Z) - Learning with Explicit Shape Priors for Medical Image Segmentation [17.110893665132423]
We propose a novel shape prior module (SPM) to improve the segmentation performance of UNet-based models.
Explicit shape priors consist of global and local shape priors.
Our proposed model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-31T11:12:35Z) - ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in
Pathology Images [47.43840961882509]
Self-supervised learning is appealing for such annotation-heavy tasks.
We first benchmark representative SSL methods for dense prediction tasks in pathology images.
We propose concept contrastive learning (ConCL), an SSL framework for dense pre-training.
arXiv Detail & Related papers (2022-07-14T08:38:17Z) - PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation [87.50205728818601]
We propose a Prior-Guided Local (PGL) self-supervised model that learns region-wise local consistency in the latent feature space.
Our PGL model learns distinctive representations of local regions and hence retains structural information (see the sketch below).
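A minimal sketch of a region-wise local consistency objective in the spirit of this entry, assuming two augmented views with aligned spatial layouts and average-pooled feature patches standing in for regions; PGL's actual prior-guided region construction is more involved.

```python
import torch
import torch.nn.functional as F

def local_consistency_loss(feat_a: torch.Tensor,
                           feat_b: torch.Tensor,
                           region: int = 4) -> torch.Tensor:
    """Cosine consistency between corresponding local regions of two views.

    feat_a, feat_b: (B, C, Z, H, W) latent features of two augmented views.
    region: edge length of the cubic patches treated as local regions.
    """
    pa = F.avg_pool3d(feat_a, region)           # pool each local region
    pb = F.avg_pool3d(feat_b, region)
    pa = F.normalize(pa.flatten(2), dim=1)      # (B, C, num_regions)
    pb = F.normalize(pb.flatten(2), dim=1)
    # Maximize cosine similarity between corresponding regions.
    return (1.0 - (pa * pb).sum(dim=1)).mean()

f1 = torch.randn(2, 32, 8, 16, 16)
f2 = f1 + 0.1 * torch.randn_like(f1)            # stand-in for an augmented view
print(local_consistency_loss(f1, f2).item())
```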
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.