RadioActive: 3D Radiological Interactive Segmentation Benchmark
- URL: http://arxiv.org/abs/2411.07885v3
- Date: Fri, 21 Mar 2025 15:47:12 GMT
- Title: RadioActive: 3D Radiological Interactive Segmentation Benchmark
- Authors: Constantin Ulrich, Tassilo Wald, Emily Tempus, Maximilian Rokuss, Paul F. Jaeger, Klaus Maier-Hein
- Abstract summary: Recent interactive segmentation models, inspired by Meta's Segment Anything, have made significant progress but face critical limitations in 3D. The RadioActive benchmark addresses these challenges by providing a rigorous and reproducible evaluation framework. Surprisingly, SAM2 outperforms all specialized medical 2D and 3D models in a setting requiring only a few interactions to generate prompts for a 3D volume.
- Score: 1.1095764130645482
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Effortless and precise segmentation with minimal clinician effort could greatly streamline clinical workflows. Recent interactive segmentation models, inspired by Meta's Segment Anything, have made significant progress but face critical limitations in 3D radiology. These include impractical human interaction requirements such as slice-by-slice operations for 2D models on 3D data and a lack of iterative refinement. Prior studies have been hindered by inadequate evaluation protocols, resulting in unreliable performance assessments and inconsistent findings across studies. The RadioActive benchmark addresses these challenges by providing a rigorous and reproducible evaluation framework for interactive segmentation methods in clinically relevant scenarios. It features diverse datasets, a wide range of target structures, and the most impactful 2D and 3D interactive segmentation methods, all within a flexible and extensible codebase. We also introduce advanced prompting techniques that reduce interaction steps, enabling fair comparisons between 2D and 3D models. Surprisingly, SAM2 outperforms all specialized medical 2D and 3D models in a setting requiring only a few interactions to generate prompts for a 3D volume. This challenges prevailing assumptions and demonstrates that general-purpose models surpass specialized medical approaches. By open-sourcing RadioActive, we invite researchers to integrate their models and prompting techniques, ensuring continuous and transparent evaluation of 3D medical interactive models.
Related papers
- FALCON: Few-Shot Adversarial Learning for Cross-Domain Medical Image Segmentation [6.934814982783991]
We propose FALCON, a cross-domain few-shot segmentation framework that achieves high-precision 3D volume segmentation by processing data as 2D slices.
FALCON consistently achieves the lowest Hausdorff Distance scores, indicating superior boundary accuracy.
Results are achieved with significantly less labeled data, no data augmentation, and substantially lower computational overhead.
arXiv Detail & Related papers (2026-01-04T22:57:49Z) - REACT3D: Recovering Articulations for Interactive Physical 3D Scenes [96.27769519526426]
REACT3D is a framework that converts static 3D scenes into simulation-ready interactive replicas with consistent geometry.
We achieve state-of-the-art performance on detection/segmentation and articulation metrics across diverse indoor scenes.
arXiv Detail & Related papers (2025-10-13T12:37:59Z) - ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation [11.248082139905865]
We propose a hybrid architecture that models MRI sequences as annotated data.
Our method uses a deep, pretrained DeepLabV3 backbone to extract high-level semantic features from each MRI slice and a recurrent convolutional head, built with ConvLSTM layers, to integrate information across slices.
Compared to state-of-the-art 2D and 3D segmentation models, our approach demonstrates superior performance in terms of precision, recall, Intersection over Union (IoU), Dice Similarity Coefficient (DSC) and robustness.
arXiv Detail & Related papers (2025-06-24T14:56:55Z) - nnInteractive: Redefining 3D Promptable Segmentation [0.461929066711062]
We introduce nnInteractive, the first comprehensive 3D interactive open-set segmentation method.
It supports diverse prompts, including points, scribbles, boxes, and a novel lasso prompt, while leveraging intuitive 2D interactions to generate full 3D segmentations.
nnInteractive sets a new state-of-the-art in accuracy, adaptability, and usability.
arXiv Detail & Related papers (2025-03-11T12:30:34Z) - MG-3D: Multi-Grained Knowledge-Enhanced 3D Medical Vision-Language Pre-training [7.968487067774351]
3D medical image analysis is pivotal in numerous clinical applications.
Large-scale vision-language pre-training remains underexplored in 3D medical image analysis.
We propose MG-3D, pre-trained on large-scale data (47.1K).
arXiv Detail & Related papers (2024-12-08T09:45:59Z) - Cross-D Conv: Cross-Dimensional Transferable Knowledge Base via Fourier Shifting Operation [3.69758875412828]
Cross-D Conv operation bridges the dimensional gap by learning the phase shifting in the Fourier domain.
Our method enables seamless weight transfer between 2D and 3D convolution operations, effectively facilitating cross-dimensional learning.
arXiv Detail & Related papers (2024-11-02T13:03:44Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - Enhanced segmentation of femoral bone metastasis in CT scans of patients using synthetic data generation with 3D diffusion models [0.06700983301090582]
We propose an automated data pipeline using 3D Denoising Diffusion Probabilistic Models (DDPM) to generalize on new images.
We created 5675 new volumes, then trained 3D U-Net segmentation models on real and synthetic data to compare segmentation performance.
arXiv Detail & Related papers (2024-09-17T09:21:19Z) - Towards Synergistic Deep Learning Models for Volumetric Cirrhotic Liver Segmentation in MRIs [1.5228650878164722]
Liver cirrhosis, a leading cause of global mortality, requires precise segmentation of ROIs for effective disease monitoring and treatment planning.
Existing segmentation models often fail to capture complex feature interactions and generalize across diverse datasets.
We propose a novel synergistic theory that leverages complementary latent spaces for enhanced feature interaction modeling.
arXiv Detail & Related papers (2024-08-08T14:41:32Z) - Monocular pose estimation of articulated surgical instruments in open surgery [0.873811641236639]
This work presents a novel approach to monocular 6D pose estimation of surgical instruments in open surgery, addressing challenges such as object articulations, symmetries, and lack of annotated real-world data.
The proposed approach consists of three main components: (1) synthetic data generation using 3D modeling of surgical tools with articulation rigging; (2) a tailored pose estimation framework combining object detection with pose estimation and a hybrid geometric fusion strategy; and (3) a training strategy that utilizes both synthetic and real unannotated data, employing domain adaptation on real video data using automatically generated pseudo-labels.
arXiv Detail & Related papers (2024-07-16T19:47:35Z) - Composable Interventions for Language Models [60.32695044723103]
Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining.
But despite a flood of new methods, different types of interventions are largely developing independently.
We introduce composable interventions, a framework to study the effects of using multiple interventions on the same language models.
arXiv Detail & Related papers (2024-07-09T01:17:44Z) - Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first work to demonstrate the utility of Diffusion Model-based novel-view-synthesizers in enhancing RGB 6D pose estimation at category-level.
The outlined method reduces data requirements, removes the need for depth information in the zero-shot category-level 6D pose estimation task, and increases performance, as demonstrated quantitatively through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z) - Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion [32.71972792352939]
3D patient body modeling is critical to the success of automated patient positioning for smart medical scanning and operating rooms.
Existing CNN-based end-to-end patient modeling solutions typically require customized network designs demanding a large amount of relevant training data.
We propose a generic modularized 3D patient modeling method that consists of (a) a multi-modal keypoint detection module with attentive fusion for 2D patient joint localization.
We demonstrate the efficacy of the proposed method by extensive patient positioning experiments on both public and clinical data.
arXiv Detail & Related papers (2024-03-05T18:58:55Z) - Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning [52.249748801637196]
3D medical image segmentation is a challenging task with crucial implications for disease diagnosis and treatment planning.
Recent advances in deep learning have significantly enhanced fully supervised medical image segmentation.
We propose a novel probabilistic-aware weakly supervised learning pipeline, specifically designed for 3D medical imaging.
arXiv Detail & Related papers (2024-03-05T00:46:53Z) - S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - SynergyNet: Bridging the Gap between Discrete and Continuous Representations for Precise Medical Image Segmentation [4.562266115935329]
We propose SynergyNet, a novel bottleneck architecture designed to enhance existing encoder-decoder segmentation frameworks.
Our experiments on multi-organ segmentation and cardiac datasets demonstrate that SynergyNet outperforms other state-of-the-art methods.
Our innovative approach paves the way for enhancing the overall performance and capabilities of deep learning models in the critical domain of medical image analysis.
arXiv Detail & Related papers (2023-10-26T20:13:44Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Extending Process Discovery with Model Complexity Optimization and Cyclic States Identification: Application to Healthcare Processes [62.997667081978825]
The paper presents an approach to process mining providing semi-automatic support to model optimization.
A model simplification approach is proposed, which essentially abstracts the raw model at the desired granularity.
We aim to demonstrate the capabilities of the technological solution using three datasets from different applications in the healthcare domain.
arXiv Detail & Related papers (2022-06-10T16:20:59Z) - LocATe: End-to-end Localization of Actions in 3D with Transformers [91.28982770522329]
LocATe is an end-to-end approach that jointly localizes and recognizes actions in a 3D sequence.
Unlike transformer-based object-detection and classification models which consider image or patch features as input, LocATe's transformer model is capable of capturing long-term correlations between actions in a sequence.
We introduce a new, challenging, and more realistic benchmark dataset, BABEL-TAL-20 (BT20), where the performance of state-of-the-art methods is significantly worse.
arXiv Detail & Related papers (2022-03-21T03:35:32Z) - Bidirectional RNN-based Few Shot Learning for 3D Medical Image Segmentation [11.873435088539459]
We propose a 3D few shot segmentation framework for accurate organ segmentation using limited training samples of the target organ annotation.
A U-Net like network is designed to predict segmentation by learning the relationship between 2D slices of support data and a query image.
We evaluate our proposed model using three 3D CT datasets with annotations of different organs.
arXiv Detail & Related papers (2020-11-19T01:44:55Z) - Volumetric Medical Image Segmentation: A 3D Deep Coarse-to-fine Framework and Its Adversarial Examples [74.92488215859991]
We propose a novel 3D-based coarse-to-fine framework to efficiently tackle these challenges.
The proposed 3D-based framework outperforms its 2D counterparts by a large margin since it can leverage the rich spatial information along all three axes.
We conduct experiments on three datasets, the NIH pancreas dataset, the JHMI pancreas dataset and the JHMI pathological cyst dataset.
arXiv Detail & Related papers (2020-10-29T15:39:19Z) - Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z) - Estimating the Effects of Continuous-valued Interventions using Generative Adversarial Networks [103.14809802212535]
We build on the generative adversarial networks (GANs) framework to address the problem of estimating the effect of continuous-valued interventions.
Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions.
To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator.
arXiv Detail & Related papers (2020-02-27T18:46:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.