RAPS-3D: Efficient interactive segmentation for 3D radiological imaging
- URL: http://arxiv.org/abs/2507.07730v1
- Date: Thu, 10 Jul 2025 13:08:57 GMT
- Title: RAPS-3D: Efficient interactive segmentation for 3D radiological imaging
- Authors: Théo Danielou, Daniel Tordjman, Pierre Manceron, Corentin Dancette,
- Abstract summary: Adapting 2D models to 3D typically involves autoregressive strategies, where predictions are propagated slice by slice.<n>We present a simplified 3D promptable segmentation method, inspired by SegVol, designed to reduce inference time and eliminate prompt management complexities.
- Score: 5.8497833718980345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Promptable segmentation, introduced by the Segment Anything Model (SAM), is a promising approach for medical imaging, as it enables clinicians to guide and refine model predictions interactively. However, SAM's architecture is designed for 2D images and does not extend naturally to 3D volumetric data such as CT or MRI scans. Adapting 2D models to 3D typically involves autoregressive strategies, where predictions are propagated slice by slice, resulting in increased inference complexity. Processing large 3D volumes also requires significant computational resources, often leading existing 3D methods to also adopt complex strategies like sliding-window inference to manage memory usage, at the cost of longer inference times and greater implementation complexity. In this paper, we present a simplified 3D promptable segmentation method, inspired by SegVol, designed to reduce inference time and eliminate prompt management complexities associated with sliding windows while achieving state-of-the-art performance.
Related papers
- Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding [59.51535163599723]
FreeGS is an unsupervised semantic-embedded 3DGS framework that achieves view-consistent 3D scene understanding without the need for 2D labels.<n>FreeGS performs comparably to state-of-the-art methods while avoiding the complex data preprocessing workload.
arXiv Detail & Related papers (2024-11-29T08:52:32Z) - Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z) - SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation [0.13654846342364302]
We present SegFormer3D, a hierarchical Transformer that calculates attention across multiscale volumetric features.
SegFormer3D avoids complex decoders and uses an all-MLP decoder to aggregate local and global attention features.
We benchmark SegFormer3D against the current SOTA models on three widely used datasets.
arXiv Detail & Related papers (2024-04-15T22:12:05Z) - Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans [23.573958232965104]
Segment anything model (SAM) demonstrates strong generalization ability on natural image segmentation.<n>We propose a comprehensive and scalable 3D SAM model for whole-body CT segmentation, named CT-SAM3D.<n>CT-SAM3D is trained using a curated dataset of 1204 CT scans containing 107 whole-body anatomies.
arXiv Detail & Related papers (2024-03-22T09:40:52Z) - Spatiotemporal Modeling Encounters 3D Medical Image Analysis:
Slice-Shift UNet with Multi-View Fusion [0.0]
We propose a new 2D-based model dubbed Slice SHift UNet which encodes three-dimensional features at 2D CNN's complexity.
More precisely multi-view features are collaboratively learned by performing 2D convolutions along the three planes of a volume.
The effectiveness of our approach is validated in Multi-Modality Abdominal Multi-Organ axis (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets.
arXiv Detail & Related papers (2023-07-24T14:53:23Z) - Interpretable 2D Vision Models for 3D Medical Images [47.75089895500738]
This study proposes a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D images.
We show on all 3D MedMNIST datasets as benchmark and two real-world datasets consisting of several hundred high-resolution CT or MRI scans that our approach performs on par with existing methods.
arXiv Detail & Related papers (2023-07-13T08:27:09Z) - SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and
Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z) - GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction.
These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization.
We propose Greedy LEarning for Accelerated MRI reconstruction, an efficient training strategy for high-dimensional imaging settings.
arXiv Detail & Related papers (2022-07-18T06:01:29Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z) - Spatial Context-Aware Self-Attention Model For Multi-Organ Segmentation [18.76436457395804]
Multi-organ segmentation is one of most successful applications of deep learning in medical image analysis.
Deep convolutional neural nets (CNNs) have shown great promise in achieving clinically applicable image segmentation performance on CT or MRI images.
We propose a new framework for combining 3D and 2D models, in which the segmentation is realized through high-resolution 2D convolutions.
arXiv Detail & Related papers (2020-12-16T21:39:53Z) - Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic
Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z) - Recalibrating 3D ConvNets with Project & Excite [6.11737116137921]
Convolutional Neural Networks (F-CNNs) achieve state-of-the-art performance for segmentation tasks in computer vision and medical imaging.
We extend existing 2D recalibration methods to 3D and propose a generic compress-process-recalibrate pipeline for easy comparison.
We demonstrate that PE modules can be easily integrated into 3D F-CNNs, boosting performance up to 0.3 in Dice Score and outperforming 3D extensions of other recalibration blocks.
arXiv Detail & Related papers (2020-02-25T16:07:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.