Zero-shot System for Automatic Body Region Detection for Volumetric CT and MR Images
- URL: http://arxiv.org/abs/2602.08717v1
- Date: Mon, 09 Feb 2026 14:26:24 GMT
- Title: Zero-shot System for Automatic Body Region Detection for Volumetric CT and MR Images
- Authors: Farnaz Khun Jush, Grit Werner, Mark Klemens, Matthias Lenga
- Abstract summary: We investigate whether body region detection in CT and MR images can be achieved in a fully zero-shot manner by using knowledge embedded in large pre-trained foundation models. We propose and systematically evaluate three training-free pipelines: (1) a segmentation-driven rule-based system, (2) a Multimodal Large Language Model (MLLM) guided by radiologist-defined rules, and (3) a segmentation-aware MLLM that combines visual input with explicit anatomical evidence. All methods are evaluated on 887 heterogeneous CT and MR scans with manually verified anatomical region labels.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliable identification of anatomical body regions is a prerequisite for many automated medical imaging workflows, yet existing solutions remain heavily dependent on unreliable DICOM metadata. Current solutions mainly use supervised learning, which limits their applicability in many real-world scenarios. In this work, we investigate whether body region detection in volumetric CT and MR images can be achieved in a fully zero-shot manner by using knowledge embedded in large pre-trained foundation models. We propose and systematically evaluate three training-free pipelines: (1) a segmentation-driven rule-based system leveraging pre-trained multi-organ segmentation models, (2) a Multimodal Large Language Model (MLLM) guided by radiologist-defined rules, and (3) a segmentation-aware MLLM that combines visual input with explicit anatomical evidence. All methods are evaluated on 887 heterogeneous CT and MR scans with manually verified anatomical region labels. The segmentation-driven rule-based approach achieves the strongest and most consistent performance, with weighted F1-scores of 0.947 (CT) and 0.914 (MR), demonstrating robustness across modalities and atypical scan coverage. The MLLM performs competitively in visually distinctive regions, while the segmentation-aware MLLM reveals fundamental limitations.
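The segmentation-driven rule-based pipeline described in the abstract can be illustrated with a minimal sketch: given the set of organ labels produced by a pre-trained multi-organ segmentation model, a hand-written mapping assigns each detected organ to a body region. The organ-to-region table and the region names below are illustrative assumptions, not the radiologist-defined rule set used in the paper.

```python
# Minimal sketch of a segmentation-driven rule-based body-region detector.
# Assumes an upstream multi-organ segmentation model has already reported
# which organ labels are present in the scan; the mapping is illustrative.

ORGAN_TO_REGION = {
    "brain": "head",
    "thyroid": "neck",
    "lung": "chest",
    "heart": "chest",
    "liver": "abdomen",
    "kidney": "abdomen",
    "bladder": "pelvis",
    "femur": "legs",
}

def detect_regions(organs_present):
    """Map the organs found by the segmentation model to body regions."""
    regions = {ORGAN_TO_REGION[o] for o in organs_present if o in ORGAN_TO_REGION}
    return sorted(regions)

print(detect_regions({"lung", "heart", "liver"}))  # -> ['abdomen', 'chest']
```

Because the rules operate on explicit anatomical evidence rather than raw pixels, this kind of system degrades gracefully on atypical scan coverage, which is consistent with the robustness the paper reports.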
Related papers
- OmniCT: Towards a Unified Slice-Volume LVLM for Comprehensive CT Analysis [53.01523944168442]
Clinical interpretation relies on both slice-driven local features and volume-driven spatial representations. Existing Large Vision-Language Models (LVLMs) remain fragmented in CT slice versus volumetric understanding. We present OmniCT, a powerful unified slice-volume LVLM for CT scenarios.
arXiv Detail & Related papers (2026-02-18T00:42:41Z) - MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation [1.6515663221123749]
We propose MedRegion-CT, a region-focused Multi-Modal Large Language Model (MLLM) framework, featuring three key innovations. First, we introduce Region Representative ($R2$) Token Pooling, which utilizes a 2D-wise pretrained vision model to efficiently extract 3D CT features. Second, a universal segmentation model generates pseudo-masks, which are then processed by a mask encoder to extract region-centric features. Third, we leverage segmentation results to extract patient-specific attributes, including organ size, diameter, and location.
arXiv Detail & Related papers (2025-06-29T06:08:55Z) - An Arbitrary-Modal Fusion Network for Volumetric Cranial Nerves Tract Segmentation [21.228897192093573]
We propose a novel arbitrary-modal fusion network for volumetric cranial nerves (CNs) tract segmentation, called CNTSeg-v2. Our model encompasses an Arbitrary-Modal Collaboration Module (ACM) designed to effectively extract informative features from other auxiliary modalities. Our CNTSeg-v2 achieves state-of-the-art segmentation performance, outperforming all competing methods.
arXiv Detail & Related papers (2025-05-05T06:00:41Z) - PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - Foundation Model for Whole-Heart Segmentation: Leveraging Student-Teacher Learning in Multi-Modal Medical Imaging [0.510750648708198]
Whole-heart segmentation from CT and MRI scans is crucial for cardiovascular disease analysis. Existing methods struggle with modality-specific biases and the need for extensive labeled datasets. We propose a foundation model for whole-heart segmentation using a self-supervised learning framework based on a student-teacher architecture.
arXiv Detail & Related papers (2025-03-24T14:47:54Z) - Train-Free Segmentation in MRI with Cubical Persistent Homology [0.0]
We present a new framework for segmentation of MRI scans based on Topological Data Analysis (TDA). The pipeline proceeds in three steps: first identifying the whole object to segment via automatic thresholding, then detecting a distinctive subset whose topology is known in advance, and finally deducing the various components of the segmentation. We validate the framework on three applications: glioblastoma segmentation in brain MRI, the myocardium in cardiac MRI, which forms a cylinder, and cortical plate detection in fetal brain MRI, whose 2D slices are circles.
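The first step of the pipeline above, automatic thresholding, can be sketched with a histogram-based Otsu criterion; the subsequent TDA steps require a persistent-homology library and are omitted here. The bin count and the synthetic bimodal data are assumptions for illustration.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Pick the intensity threshold maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weight = hist.cumsum().astype(float)        # voxels at or below each bin
    total = weight[-1]
    mean = (hist * centers).cumsum()            # cumulative intensity mass
    w0, w1 = weight, total - weight
    valid = (w0 > 0) & (w1 > 0)                 # skip degenerate splits
    mu0 = np.where(valid, mean / np.maximum(w0, 1), 0.0)
    mu1 = np.where(valid, (mean[-1] - mean) / np.maximum(w1, 1), 0.0)
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, -1.0)
    return centers[int(np.argmax(between))]

# Synthetic bimodal intensities: background near 0, object near 10.
rng = np.random.default_rng(1)
bimodal = np.concatenate([rng.normal(0.0, 0.5, 1000), rng.normal(10.0, 0.5, 1000)])
t = otsu_threshold(bimodal)                     # lands between the two modes
```

A binary mask `bimodal > t` would then feed the topology-aware steps of the pipeline.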
arXiv Detail & Related papers (2024-01-02T11:43:49Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - A unified 3D framework for Organs at Risk Localization and Segmentation
for Radiation Therapy Planning [56.52933974838905]
Current medical workflows require manual delineation of organs-at-risk (OAR).
In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation.
Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z) - Few-shot Medical Image Segmentation using a Global Correlation Network
with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z) - Unsupervised Region-based Anomaly Detection in Brain MRI with
Adversarial Image Inpainting [4.019851137611981]
This paper proposes a fully automatic, unsupervised inpainting-based brain tumour segmentation system for T1-weighted MRI.
First, a deep convolutional neural network (DCNN) is trained to reconstruct missing healthy brain regions. Then, anomalous regions are determined by identifying areas of highest reconstruction loss.
We show the proposed system is able to segment tumours of various sizes and abstract shapes, achieving a Dice score of 0.771 ± 0.176 (mean ± standard deviation).
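The reconstruction-loss step of this system can be sketched as follows. In the paper the healthy reconstruction comes from a trained DCNN inpainter; here it is stubbed with a plain array, and the 97th-percentile cut-off is an assumption rather than the paper's criterion.

```python
import numpy as np

def anomaly_mask(original, reconstruction, percentile=97.0):
    """Flag voxels whose squared reconstruction error is in the top tail."""
    error = (original - reconstruction) ** 2
    return error > np.percentile(error, percentile)

healthy = np.zeros((64, 64))      # stand-in for the DCNN's inpainted output
scan = healthy.copy()
scan[20:30, 20:30] = 5.0          # synthetic lesion the inpainter cannot explain
mask = anomaly_mask(scan, healthy)
print(mask.sum())                 # -> 100, the lesion voxels
```

Regions of highest reconstruction loss, as in the abstract above, are exactly the voxels the percentile cut selects.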
arXiv Detail & Related papers (2020-10-05T12:13:44Z) - Deep Reinforcement Learning for Organ Localization in CT [59.23083161858951]
We propose a deep reinforcement learning approach for organ localization in CT.
In this work, an artificial agent is actively self-taught to localize organs in CT by learning from its successes and mistakes.
Our method can be used as a plug-and-play module for localizing any organ of interest.
arXiv Detail & Related papers (2020-05-11T10:06:13Z) - A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced
Cardiac Magnetic Resonance Imaging [90.29017019187282]
The "2018 Left Atrium Challenge" used 154 3D LGE-MRIs, currently the world's largest cardiac LGE-MRI dataset.
Analysis of the submitted algorithms using technical and biological metrics was performed.
Results show the top method achieved a Dice score of 93.2% and a mean surface-to-surface distance of 0.7 mm.
arXiv Detail & Related papers (2020-04-26T08:49:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.