RadSAM: Segmenting 3D radiological images with a 2D promptable model
- URL: http://arxiv.org/abs/2504.20837v1
- Date: Tue, 29 Apr 2025 15:00:25 GMT
- Title: RadSAM: Segmenting 3D radiological images with a 2D promptable model
- Authors: Julien Khlaut, Elodie Ferreres, Daniel Tordjman, Hélène Philippe, Tom Boeken, Pierre Manceron, Corentin Dancette,
- Abstract summary: We propose RadSAM, a novel method for segmenting 3D objects with a 2D model from a single prompt. We introduce a benchmark to evaluate the model's ability to segment 3D objects in CT images from a single prompt.
- Score: 4.9000940389224885
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical image segmentation is a crucial and time-consuming task in clinical care, where mask precision is extremely important. The Segment Anything Model (SAM) offers a promising approach, as it provides an interactive interface based on visual prompting and editing to refine an initial segmentation. This model has strong generalization capabilities, does not rely on predefined classes, and adapts to diverse objects; however, it is pre-trained on natural images and lacks the ability to process medical data effectively. In addition, it is built for 2D images, whereas much of the medical domain relies on 3D images, such as CT and MRI. Recent adaptations of SAM for medical imaging are based on 2D models and thus require one prompt per slice to segment 3D objects, making the segmentation process tedious. They also lack important features such as editing. To bridge this gap, we propose RadSAM, a novel method for segmenting 3D objects with a 2D model from a single prompt. In practice, we train a 2D model using noisy masks as initial prompts, in addition to bounding boxes and points. We then use this novel prompt type with an iterative inference pipeline to reconstruct the 3D mask slice by slice. We introduce a benchmark to evaluate the model's ability to segment 3D objects in CT images from a single prompt, and we assess the models' out-of-domain transfer and editing capabilities. We demonstrate the effectiveness of our approach against state-of-the-art models on this benchmark using the AMOS abdominal organ segmentation dataset.
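To make the iterative pipeline concrete, here is a minimal Python sketch of single-prompt 3D inference. It is an illustration only, not RadSAM's implementation: segment_slice is a hypothetical stand-in for the 2D promptable model, and the propagation simply reuses each slice's output mask as the prompt for its neighbour.

```python
import numpy as np

def segment_slice(image_slice, prompt_mask=None, box=None):
    """Stand-in for a 2D promptable segmenter (hypothetical interface).

    A RadSAM-style model would take an image slice plus a prompt
    (mask, box, or points) and return a binary mask; here we simply
    threshold intensities inside the prompt region so the sketch runs.
    """
    if box is not None:
        x0, y0, x1, y1 = box
        mask = np.zeros_like(image_slice, dtype=bool)
        mask[y0:y1, x0:x1] = image_slice[y0:y1, x0:x1] > 0.5
        return mask
    return prompt_mask & (image_slice > 0.5)

def segment_volume(volume, seed_index, seed_box):
    """Reconstruct a 3D mask slice by slice from a single 2D prompt.

    The seed slice is segmented from a bounding-box prompt; each
    neighbouring slice then reuses the previous slice's mask as its
    prompt, sweeping towards both ends until the mask vanishes.
    """
    masks = np.zeros(volume.shape, dtype=bool)
    masks[seed_index] = segment_slice(volume[seed_index], box=seed_box)
    for step in (1, -1):
        prev, z = masks[seed_index], seed_index + step
        while 0 <= z < volume.shape[0] and prev.any():
            prev = segment_slice(volume[z], prompt_mask=prev)
            masks[z] = prev
            z += step
    return masks

# Toy usage: a bright sphere inside a 64^3 volume.
zz, yy, xx = np.mgrid[:64, :64, :64]
vol = ((zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2).astype(float)
print(segment_volume(vol, seed_index=32, seed_box=(10, 10, 54, 54)).sum())
```

The editing capability follows from the same interface: a corrected mask on any slice can be fed back as the prompt and re-propagated.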
Related papers
- Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space [58.623106094568776]
3D morphable models (3DMMs) are a powerful tool to represent the possible shapes and appearances of an object category.
We introduce a new method, Common3D, that learns 3DMMs of common objects in a fully self-supervised manner from a collection of object-centric videos.
Common3D is the first completely self-supervised method that can solve various vision tasks in a zero-shot manner.
arXiv Detail & Related papers (2025-04-30T15:42:23Z) - Medical SAM 2: Segment medical images as video via Segment Anything Model 2 [17.469217682817586]
We introduce Medical SAM 2 (MedSAM-2), a generalized auto-tracking model for universal 2D and 3D medical image segmentation.
We evaluate MedSAM-2 on five 2D tasks and nine 3D tasks, including white blood cells, optic cups, retinal vessels, mandibles, coronary arteries, kidney tumors, liver tumors, breast cancer, nasopharynx cancer, vestibular schwannomas, mediastinal lymph nodules, cerebral artery, inferior alveolar nerve, and abdominal organs.
arXiv Detail & Related papers (2024-08-01T18:49:45Z) - VISTA3D: A Unified Segmentation Foundation Model For 3D Medical Imaging [18.111368889931885]
We present VISTA3D, a Versatile Imaging SegmenTation and Annotation model.
It is built on top of the well-established 3D segmentation pipeline.
It is the first model to achieve state-of-the-art performance in both 3D automatic (supporting 127 classes) and 3D interactive segmentation.
arXiv Detail & Related papers (2024-06-07T22:41:39Z) - SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model [3.2554912675000818]
We introduce SAM3D, a new approach to semi-automatic zero-shot segmentation of 3D images building on the existing Segment Anything Model.
We achieve fast and accurate segmentations in 3D images with a four-step strategy involving: user prompting with 3D polylines, volume slicing along multiple axes, slice-wise inference with a pretrained model, and recomposition and refinement in 3D.
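The four-step strategy lends itself to a short sketch (assumptions: predict_slice stands in for prompted inference with the pretrained 2D model, and a simple majority vote handles recomposition; the paper's polyline prompting and 3D refinement are omitted):

```python
import numpy as np

def predict_slice(image_slice):
    # Stand-in for prompted 2D inference with a pretrained model
    # (hypothetical); here, a plain intensity threshold.
    return image_slice > 0.5

def recompose_3d(volume):
    """Slice along each of the three axes, run 2D inference per slice,
    and recompose by majority vote across the three sweep directions."""
    votes = np.zeros(volume.shape, dtype=np.int32)
    for axis in range(3):
        stack = np.moveaxis(volume, axis, 0)
        pred = np.stack([predict_slice(s) for s in stack]).astype(np.int32)
        votes += np.moveaxis(pred, 0, axis)
    return votes >= 2  # keep voxels predicted along at least two axes

# Toy usage: a bright cube is recovered from all three sweep directions.
vol = np.zeros((32, 32, 32))
vol[8:24, 8:24, 8:24] = 1.0
print(recompose_3d(vol).sum())  # 16**3 voxels
```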
arXiv Detail & Related papers (2024-05-10T19:26:17Z) - ToNNO: Tomographic Reconstruction of a Neural Network's Output for Weakly Supervised Segmentation of 3D Medical Images [6.035125735474387]
ToNNO is based on the Tomographic reconstruction of a Neural Network's Output.
It extracts stacks of slices with different angles from the input 3D volume, feeds these slices to a 2D encoder, and applies the inverse Radon transform in order to reconstruct a 3D heatmap of the encoder's predictions.
We apply it to weakly supervised medical image segmentation by training the 2D encoder to output high values for slices containing the regions of interest.
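As a rough 2D analogue of this idea (a sketch under stated assumptions, not the authors' code), a stand-in encoder can assign one score per 1D "slice" at many angles; the scores form a sinogram, and skimage's iradon maps them back to a spatial heatmap:

```python
import numpy as np
from skimage.transform import iradon, rotate

def encoder_score(line):
    # Stand-in for the trained 2D encoder: one scalar per slice.
    # ToNNO trains the encoder to output high values for slices that
    # contain the region of interest; here we just use the mean.
    return line.mean()

def tonno_style_heatmap(image, n_angles=90):
    """Score 1D slices of the image at many angles, assemble the
    scores into a sinogram, and invert it with the Radon transform."""
    assert image.shape[0] == image.shape[1], "sketch assumes a square image"
    angles = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sinogram = np.empty((image.shape[0], n_angles))
    for i, angle in enumerate(angles):
        rotated = rotate(image, angle, resize=False)
        sinogram[:, i] = [encoder_score(col) for col in rotated.T]
    return iradon(sinogram, theta=angles, circle=True)

# Toy usage: the heatmap peaks near the bright rectangle.
img = np.zeros((64, 64))
img[20:30, 35:50] = 1.0
heat = tonno_style_heatmap(img)
```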
arXiv Detail & Related papers (2024-04-19T11:27:56Z) - 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z) - Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
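A minimal sketch of one such voting-based fusion is below (the function name and the strict-majority rule are illustrative assumptions, not necessarily the paper's exact strategy):

```python
import numpy as np

def fuse_by_voting(label_maps, ignore_index=255):
    """Fuse per-model semantic label maps into pseudo labels by
    per-point majority vote; points without a strict majority are
    marked with ignore_index and can be excluded from training."""
    stack = np.stack(label_maps)                    # (models, points)
    n_classes = int(stack.max()) + 1
    counts = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    fused = counts.argmax(axis=0)
    fused[2 * counts.max(axis=0) <= len(label_maps)] = ignore_index
    return fused

# Three models disagree on the second point, so it is ignored.
print(fuse_by_voting([np.array([1, 2, 0]),
                      np.array([1, 0, 0]),
                      np.array([1, 3, 0])]))        # -> [1 255 0]
```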
arXiv Detail & Related papers (2023-11-03T15:41:15Z) - ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models [13.08275555017179]
We propose ProMISe, a prompt-driven 3D medical image segmentation model using only a single point prompt.
We evaluate our model on two public datasets for colon and pancreas tumor segmentations.
arXiv Detail & Related papers (2023-10-30T16:49:03Z) - MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework named MA-SAM.
Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small fraction of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
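One way such an adapter could look in PyTorch is sketched below; the bottleneck width, depth-wise design, and token layout are assumptions for illustration, not MA-SAM's exact architecture. The mechanism is the point: slice tokens from the frozen 2D backbone are regrouped into a (depth, height, width) grid so a small 3D convolution can exchange information across neighbouring slices.

```python
import torch
import torch.nn as nn

class Adapter3D(nn.Module):
    """Hypothetical 3D adapter inserted into a frozen 2D ViT block."""

    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)      # project tokens down
        self.conv3d = nn.Conv3d(bottleneck, bottleneck, kernel_size=3,
                                padding=1, groups=bottleneck)  # depth-wise
        self.up = nn.Linear(bottleneck, dim)        # project back up
        self.act = nn.GELU()

    def forward(self, x, depth, height, width):
        # x: (batch * depth, height * width, dim) slice tokens.
        h = self.act(self.down(x))
        b = x.shape[0] // depth
        h = h.reshape(b, depth, height, width, -1).permute(0, 4, 1, 2, 3)
        h = self.act(self.conv3d(h))                # mix across slices
        h = h.permute(0, 2, 3, 4, 1).reshape(x.shape[0], height * width, -1)
        return x + self.up(h)                       # residual; only the
                                                    # adapter is trained

# Toy usage: 2 volumes x 8 slices of 14x14 patch tokens, width 768.
adapter = Adapter3D(dim=768)
tokens = torch.randn(2 * 8, 14 * 14, 768)
out = adapter(tokens, depth=8, height=14, width=14)  # same shape as input
```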
arXiv Detail & Related papers (2023-09-16T02:41:53Z) - Interpretable 2D Vision Models for 3D Medical Images [47.75089895500738]
This study proposes a simple approach for adapting 2D networks to 3D images via an intermediate feature representation.
We show that our approach performs on par with existing methods on all 3D MedMNIST benchmark datasets and on two real-world datasets comprising several hundred high-resolution CT or MRI scans.
arXiv Detail & Related papers (2023-07-13T08:27:09Z) - Segment Anything in 3D with Radiance Fields [83.14130158502493]
This paper generalizes the Segment Anything Model (SAM) to segment 3D objects.
We refer to the proposed solution as SA3D, short for Segment Anything in 3D.
We show in experiments that SA3D adapts to various scenes and achieves 3D segmentation within seconds.
arXiv Detail & Related papers (2023-04-24T17:57:15Z) - Weakly Supervised Volumetric Image Segmentation with Deformed Templates [80.04326168716493]
We propose an approach that is truly weakly supervised in the sense that we only need to provide a sparse set of 3D points on the surface of the target objects.
We show that it outperforms a more traditional approach to weak supervision in 3D at a reduced supervision cost.
arXiv Detail & Related papers (2021-06-07T22:09:34Z) - Spatial Context-Aware Self-Attention Model For Multi-Organ Segmentation [18.76436457395804]
Multi-organ segmentation is one of the most successful applications of deep learning in medical image analysis.
Deep convolutional neural nets (CNNs) have shown great promise in achieving clinically applicable image segmentation performance on CT or MRI images.
We propose a new framework for combining 3D and 2D models, in which the segmentation is realized through high-resolution 2D convolutions.
arXiv Detail & Related papers (2020-12-16T21:39:53Z) - PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)