Revisiting 3D Medical Scribble Supervision: Benchmarking Beyond Cardiac Segmentation
- URL: http://arxiv.org/abs/2403.12834v2
- Date: Wed, 13 Aug 2025 11:28:51 GMT
- Title: Revisiting 3D Medical Scribble Supervision: Benchmarking Beyond Cardiac Segmentation
- Authors: Karol Gotkowski, Klaus H. Maier-Hein, Fabian Isensee
- Abstract summary: Scribble supervision has emerged as a promising approach for reducing annotation costs in medical 3D segmentation. This work aims to steer scribble supervision toward more practical, robust, and general methodologies for medical image segmentation.
- Score: 1.2238508261277228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scribble supervision has emerged as a promising approach for reducing annotation costs in medical 3D segmentation by leveraging sparse annotations instead of voxel-wise labels. While existing methods report strong performance, a closer analysis reveals that the majority of research is confined to the cardiac domain, predominantly using ACDC and MSCMR datasets. This over-specialization has resulted in severe overfitting, misleading claims of performance improvements, and a lack of generalization across broader segmentation tasks. In this work, we formulate a set of key requirements for practical scribble supervision and introduce ScribbleBench, a comprehensive benchmark spanning over seven diverse medical imaging datasets, to systematically evaluate the fulfillment of these requirements. Consequently, we uncover a general failure of methods to generalize across tasks and that many widely used novelties degrade performance outside of the cardiac domain, whereas simpler overlooked approaches achieve superior generalization. Finally, we raise awareness for a strong yet overlooked baseline, nnU-Net coupled with a partial loss, which consistently outperforms specialized methods across a diverse range of tasks. By identifying fundamental limitations in existing research and establishing a new benchmark-driven evaluation standard, this work aims to steer scribble supervision toward more practical, robust, and generalizable methodologies for medical image segmentation.
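The "partial loss" mentioned in the abstract can be illustrated with a minimal sketch, assuming it means a cross-entropy that is evaluated only on voxels covered by a scribble. The function name `partial_cross_entropy`, the `UNLABELED` marker value, and the array layout are assumptions made for this illustration; the loss actually used with nnU-Net in the paper may differ (for example, it may include an additional Dice term).

```python
import numpy as np

UNLABELED = -1  # hypothetical marker for voxels that carry no scribble


def partial_cross_entropy(logits, labels, ignore_label=UNLABELED):
    """Mean cross-entropy over annotated voxels only (illustrative sketch).

    logits: float array of shape (num_classes, *spatial)
    labels: int array of shape spatial; ignore_label marks unannotated voxels
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    num_classes = logits.shape[0]

    # Flatten the spatial dimensions so every voxel becomes one column.
    flat_logits = logits.reshape(num_classes, -1)
    flat_labels = labels.reshape(-1)

    # Keep only the voxels that were actually scribbled.
    annotated = flat_labels != ignore_label
    if not annotated.any():
        return 0.0  # nothing to supervise in this patch

    sel_logits = flat_logits[:, annotated]
    sel_labels = flat_labels[annotated]

    # Numerically stable log-softmax over the class axis.
    shifted = sel_logits - sel_logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    # Pick the log-probability of the annotated class for each voxel.
    picked = log_probs[sel_labels, np.arange(sel_labels.size)]
    return float(-picked.mean())
```

Because unannotated voxels are masked out before the loss is averaged, predictions at those voxels contribute nothing to the loss value, which is what lets a standard segmentation network train on sparse scribbles without architectural changes.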
Related papers
- Resource-efficient Automatic Refinement of Segmentations via Weak Supervision from Light Feedback [1.8082075562656847]
We present SCORE, a weakly supervised framework that learns to refine mask predictions only using light feedback during training. We demonstrate SCORE on humerus CT scans, where it considerably improves initial predictions and achieves performance on par with existing refinement methods.
arXiv Detail & Related papers (2025-11-04T13:53:10Z) - Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposes a systematic error analysis method, and emphasizes the importance of cross-dataset evaluation. An open-source toolkit has been released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z) - RefineSeg: Dual Coarse-to-Fine Learning for Medical Image Segmentation [2.608565452856053]
High-quality pixel-level annotations of medical images are essential for supervised segmentation tasks. We propose a novel coarse-to-fine segmentation framework that relies entirely on coarse-level annotations.
arXiv Detail & Related papers (2025-08-04T19:14:30Z) - BoundarySeg: An Embarrassingly Simple Method To Boost Medical Image Segmentation Performance for Low Data Regimes [2.1387689734506043]
We propose a simple, yet effective and computationally efficient approach for medical image segmentation that leverages only existing annotations. We propose BoundarySeg, a multi-task framework that incorporates organ boundary prediction as an auxiliary task to full organ segmentation.
arXiv Detail & Related papers (2025-05-14T22:15:41Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch token they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric [99.19559537966538]
DML aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval.
To maintain the structure of embedding space and avoid feature collapse, we propose a novel loss function called Anti-Collapse Loss.
Comprehensive experiments on benchmark datasets demonstrate that our proposed method outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-07-03T13:44:20Z) - Comparative Benchmarking of Failure Detection Methods in Medical Image Segmentation: Unveiling the Role of Confidence Aggregation [0.723226140060364]
This paper introduces a comprehensive benchmarking framework aimed at evaluating failure detection methodologies within medical image segmentation.
We identify the strengths and limitations of current failure detection metrics, advocating for the risk-coverage analysis as a holistic evaluation approach.
arXiv Detail & Related papers (2024-06-05T14:36:33Z) - 3D Medical Image Segmentation with Sparse Annotation via Cross-Teaching between 3D and 2D Networks [26.29122638813974]
We propose a framework that can robustly learn from sparse annotation using the cross-teaching of both 3D and 2D networks.
Our experimental results on the MMWHS dataset demonstrate that our method outperforms the state-of-the-art (SOTA) semi-supervised segmentation methods.
arXiv Detail & Related papers (2023-07-30T15:26:17Z) - SwIPE: Efficient and Robust Medical Image Segmentation with Implicit Patch Embeddings [12.79344668998054]
We propose SwIPE (Segmentation with Implicit Patch Embeddings) to enable accurate local boundary delineation and global shape coherence.
We show that SwIPE significantly improves over recent implicit approaches and outperforms state-of-the-art discrete methods with over 10x fewer parameters.
arXiv Detail & Related papers (2023-07-23T20:55:11Z) - Extraction of volumetric indices from echocardiography: which deep learning solution for clinical use? [6.144041824426555]
We show that the proposed 3D nnU-Net outperforms alternative 2D and recurrent segmentation methods.
Overall, the experimental results suggest that with sufficient training data, 3D nnU-Net could become the first automated tool to meet the standards of an everyday clinical device.
arXiv Detail & Related papers (2023-05-03T09:38:52Z) - Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z) - ZScribbleSeg: Zen and the Art of Scribble Supervised Medical Image Segmentation [16.188681108101196]
We propose to utilize solely scribble annotations for weakly supervised segmentation.
Existing solutions mainly leverage selective losses computed solely on annotated areas.
We introduce regularization terms to encode the spatial relationship and shape prior.
We integrate the efficient scribble supervision with the prior into a unified framework, denoted as ZScribbleSeg.
arXiv Detail & Related papers (2023-01-12T09:00:40Z) - A Survey on Label-efficient Deep Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction [115.9169213834476]
This paper offers a comprehensive review on label-efficient segmentation methods.
We first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels.
Next, we summarize the existing label-efficient segmentation methods from a unified perspective.
arXiv Detail & Related papers (2022-07-04T06:21:01Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Towards to Robust and Generalized Medical Image Segmentation Framework [17.24628770042803]
We propose a novel two-stage framework for robust generalized segmentation.
In particular, an unsupervised Tile-wise AutoEncoder (T-AE) pretraining architecture is coined to learn meaningful representation.
Experiments on lung segmentation across multiple chest X-ray datasets are conducted.
arXiv Detail & Related papers (2021-08-09T05:58:49Z) - Flip Learning: Erase to Segment [65.84901344260277]
Weakly-supervised segmentation (WSS) can help reduce time-consuming and cumbersome manual annotation.
We propose a novel and general WSS framework called Flip Learning, which only needs the box annotation.
Our proposed approach achieves competitive performance and shows great potential to narrow the gap between fully-supervised and weakly-supervised learning.
arXiv Detail & Related papers (2021-08-02T09:56:10Z) - A Positive/Unlabeled Approach for the Segmentation of Medical Sequences using Point-Wise Supervision [3.883460584034766]
We propose a new method to efficiently segment medical imaging volumes or videos using point-wise annotations only.
Our approach trains a deep learning model using an appropriate Positive/Unlabeled objective function using point-wise annotations.
We show experimentally that our approach outperforms state-of-the-art methods tailored to the same problem.
arXiv Detail & Related papers (2021-07-18T09:13:33Z) - Weakly-Supervised Universal Lesion Segmentation with Regional Level Set Loss [16.80758525711538]
We present a novel weakly-supervised universal lesion segmentation method based on the High-Resolution Network (HRNet).
AHRNet provides advanced high-resolution deep image features by involving a decoder, dual-attention and scale attention mechanisms.
Our method achieves the best performance on the publicly large-scale DeepLesion dataset and a hold-out test set.
arXiv Detail & Related papers (2021-05-03T23:33:37Z) - Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation [85.0078917060652]
We propose a semi-weakly supervised segmentation algorithm to overcome this barrier.
Our approach is based on a new formulation of deep supervision and student-teacher model.
With our novel training regime for segmentation that flexibly makes use of images that are either fully labeled, marked with bounding boxes, just global labels, or not at all, we are able to cut the requirement for expensive labels by 94.22%.
arXiv Detail & Related papers (2021-04-27T14:51:19Z) - Towards Robust Partially Supervised Multi-Structure Medical Image Segmentation on Small-Scale Data [123.03252888189546]
We propose Vicinal Labels Under Uncertainty (VLUU) to bridge the methodological gaps in partially supervised learning (PSL) under data scarcity.
Motivated by multi-task learning and vicinal risk minimization, VLUU transforms the partially supervised problem into a fully supervised problem by generating vicinal labels.
Our research suggests a new research direction in label-efficient deep learning with partial supervision.
arXiv Detail & Related papers (2020-11-28T16:31:00Z) - A Weakly-Supervised Semantic Segmentation Approach based on the Centroid Loss: Application to Quality Control and Inspection [6.101839518775968]
We propose and assess a new weakly-supervised semantic segmentation approach making use of a novel loss function.
The performance of the approach is evaluated against datasets from two different industry-related case studies.
arXiv Detail & Related papers (2020-10-26T09:08:21Z) - Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z) - Towards Using Count-level Weak Supervision for Crowd Counting [55.58468947486247]
This paper studies the problem of weakly-supervised crowd counting, which learns a model from only a small amount of location-level annotations (fully-supervised) but a large amount of count-level annotations (weakly-supervised).
We devise a simple-yet-effective training strategy, namely Multiple Auxiliary Tasks Training (MATT), to construct regularizers for restricting the freedom of the generated density maps.
arXiv Detail & Related papers (2020-02-29T02:58:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.