Related papers: MOSAIC: A Multi-View 2.5D Organ Slice Selector with Cross-Attentional Reasoning for Anatomically-Aware CT Localization in Medical Organ Segmentation

URL: http://arxiv.org/abs/2505.10672v1
Date: Thu, 15 May 2025 19:32:28 GMT
Title: MOSAIC: A Multi-View 2.5D Organ Slice Selector with Cross-Attentional Reasoning for Anatomically-Aware CT Localization in Medical Organ Segmentation
Authors: Hania Ghouse, Muzammil Behzad
Abstract summary: Existing 3D segmentation approaches are computationally and memory intensive, often processing entire volumes that contain many anatomically irrelevant slices. We propose a novel, anatomically-aware slice selector pipeline that reduces input volume prior to segmentation. Our proposed model acts as an "expert" in anatomical localization, reasoning over multi-view representations to selectively retain slices with high structural relevance.
Score: 0.8747606955991707
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Efficient and accurate multi-organ segmentation from abdominal CT volumes is a fundamental challenge in medical image analysis. Existing 3D segmentation approaches are computationally and memory intensive, often processing entire volumes that contain many anatomically irrelevant slices. Meanwhile, 2D methods suffer from class imbalance and lack cross-view contextual awareness. To address these limitations, we propose a novel, anatomically-aware slice selector pipeline that reduces input volume prior to segmentation. Our unified framework introduces a vision-language model (VLM) for cross-view organ presence detection using fused tri-slice (2.5D) representations from axial, sagittal, and coronal planes. Our proposed model acts as an "expert" in anatomical localization, reasoning over multi-view representations to selectively retain slices with high structural relevance. This enables spatially consistent filtering across orientations while preserving contextual cues. More importantly, since standard segmentation metrics such as Dice or IoU fail to measure the spatial precision of such slice selection, we introduce a novel metric, Slice Localization Concordance (SLC), which jointly captures anatomical coverage and spatial alignment with organ-centric reference slices. Unlike segmentation-specific metrics, SLC provides a model-agnostic evaluation of localization fidelity. Our model offers substantial improvement gains against several baselines across all organs, demonstrating both accurate and reliable organ-focused slice filtering. These results show that our method enables efficient and spatially consistent organ filtering, thereby significantly reducing downstream segmentation cost while maintaining high anatomical fidelity.
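
The abstract mentions fused tri-slice (2.5D) inputs taken from the axial, sagittal, and coronal planes, followed by retention of slices judged relevant. The sketch below is only one plausible reading of that idea, not the authors' implementation: it assumes a tri-slice means a slice stacked with its two neighbours as channels, assumes a (axial, coronal, sagittal) axis order for the volume, and uses a hypothetical score threshold in place of the paper's vision-language model. The names `tri_slice` and `select_slices` are illustrative.

```python
import numpy as np


def tri_slice(volume, index, axis):
    """Stack slices index-1, index, index+1 along `axis` as 3 channels.

    Assumption: "tri-slice" means a slice plus its two neighbours.
    """
    n = volume.shape[axis]
    # Clamp neighbours at the boundary so edge slices still yield 3 channels.
    picks = [min(max(index + offset, 0), n - 1) for offset in (-1, 0, 1)]
    return np.stack([np.take(volume, p, axis=axis) for p in picks], axis=0)


def select_slices(scores, threshold=0.5):
    """Keep indices whose (hypothetical) organ-presence score meets the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Synthetic volume; the axis order (axial, coronal, sagittal) is an assumption.
volume = np.random.default_rng(0).normal(size=(40, 64, 48))

axial = tri_slice(volume, 0, axis=0)       # shape (3, 64, 48)
coronal = tri_slice(volume, 10, axis=1)    # shape (3, 40, 48)
sagittal = tri_slice(volume, 47, axis=2)   # shape (3, 40, 64)

kept = select_slices([0.1, 0.7, 0.5])
```

In a real pipeline the scores passed to `select_slices` would come from the organ presence detector, and only the retained slices would be forwarded to the segmentation model.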

Related papers

A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT [67.34586036959793]
There is no fully annotated CT dataset with all anatomies delineated for training. We propose a novel continual learning-driven CT model that can segment complete anatomies. Our single unified CT segmentation model, CL-Net, can highly accurately segment a clinically comprehensive set of 235 fine-grained whole-body anatomies.
arXiv Detail & Related papers (2025-03-16T23:55:02Z)
GASA-UNet: Global Axial Self-Attention U-Net for 3D Medical Image Segmentation [8.939740171704388]
We introduce a refined U-Net-like model featuring a novel Global Axial Self-Attention (GASA) block. This block processes image data as a 3D entity, with each 2D plane representing a different anatomical cross-section. Our model has demonstrated promising improvements in segmentation performance, particularly for smaller anatomical structures.
arXiv Detail & Related papers (2024-09-20T01:23:53Z)
CT-based brain ventricle segmentation via diffusion Schrödinger Bridge without target domain ground truths [0.9720086191214947]
Efficient and accurate brain ventricle segmentation from clinical CT scans is critical for emergency surgeries like ventriculostomy. We introduce a novel uncertainty-aware ventricle segmentation technique without the need of CT segmentation ground truths. Our method employs the diffusion Schrödinger Bridge and an attention recurrent residual U-Net to capitalize on unpaired CT and MRI scans.
arXiv Detail & Related papers (2024-05-28T15:17:58Z)
Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior [34.54360931760496]
Key anatomical features, such as the number of organs, their shapes and relative positions, are crucial for building a robust multi-organ segmentation model. We introduce a novel architecture called the Anatomy-Informed Network (AIC-Net). AIC-Net incorporates a learnable input termed "Anatomical Prior", which can be adapted to patient-specific anatomy.
arXiv Detail & Related papers (2024-03-27T10:46:24Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling [66.75096111651062]
We created a large-scale dataset of 10,021 thoracic CTs with 157 labels. We applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels. Our resulting segmentation models demonstrated remarkable performance on CXR.
arXiv Detail & Related papers (2023-06-06T18:01:08Z)
Med-Query: Steerable Parsing of 9-DoF Medical Anatomies with Query Embedding [14.901279446640393]
We propose a steerable, robust, and efficient computing framework for detection, identification, and segmentation of anatomies in CT scans. Considering the complicated shapes, sizes, and orientations of anatomies, we present a nine degrees of freedom (9-DoF) pose estimation solution in full 3D space. We have validated our method on three medical imaging parsing tasks: ribs, spine, and abdominal organs.
arXiv Detail & Related papers (2022-12-05T04:04:21Z)
Large-Kernel Attention for 3D Medical Image Segmentation [14.76728117630242]
In this paper, a novel large-kernel (LK) attention module is proposed to achieve accurate multi-organ segmentation and tumor segmentation. The advantages of convolution and self-attention are combined in the proposed LK attention module, including local contextual information, long-range dependence, and channel adaptation. The module also decomposes the LK convolution to optimize the computational cost and can be easily incorporated into FCNs such as U-Net.
arXiv Detail & Related papers (2022-07-19T16:32:55Z)
Multi-organ Segmentation Network with Adversarial Performance Validator [10.775440368500416]
This paper introduces an adversarial performance validation network into a 2D-to-3D segmentation framework. The proposed network converts the 2D-coarse result to 3D high-quality segmentation masks in a coarse-to-fine manner, allowing joint optimization to improve segmentation accuracy. Experiments on the NIH pancreas segmentation dataset demonstrate the proposed network achieves state-of-the-art accuracy on small organ segmentation and outperforms the previous best.
arXiv Detail & Related papers (2022-04-16T18:00:29Z)
A unified 3D framework for Organs at Risk Localization and Segmentation for Radiation Therapy Planning [56.52933974838905]
Current medical workflow requires manual delineation of organs-at-risk (OAR). In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation. Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z)
Rethinking the Extraction and Interaction of Multi-Scale Features for Vessel Segmentation [53.187152856583396]
We propose a novel deep learning model called PC-Net to segment retinal vessels and major arteries in 2D fundus image and 3D computed tomography angiography (CTA) scans. In PC-Net, the pyramid squeeze-and-excitation (PSE) module introduces spatial information to each convolutional block, boosting its ability to extract more effective multi-scale features.
arXiv Detail & Related papers (2020-10-09T08:22:54Z)
Deep Reinforcement Learning for Organ Localization in CT [59.23083161858951]
We propose a deep reinforcement learning approach for organ localization in CT. In this work, an artificial agent is actively self-taught to localize organs in CT by learning from its asserts and mistakes. Our method can be used as a plug-and-play module for localizing any organ of interest.
arXiv Detail & Related papers (2020-05-11T10:06:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.