Large-Scale Label Quality Assessment for Medical Segmentation via a Vision-Language Judge and Synthetic Data
- URL: http://arxiv.org/abs/2601.14406v1
- Date: Tue, 20 Jan 2026 19:09:12 GMT
- Title: Large-Scale Label Quality Assessment for Medical Segmentation via a Vision-Language Judge and Synthetic Data
- Authors: Yixiong Chen, Zongwei Zhou, Wenxuan Li, Alan Yuille
- Abstract summary: We propose SegAE, a lightweight vision-language model (VLM) that automatically predicts label quality across 142 anatomical structures. Trained on over four million image-label pairs with quality scores, SegAE achieves a high correlation coefficient of 0.902 with ground-truth Dice similarity. SegAE improves data efficiency and training performance in active and semi-supervised learning, reducing dataset annotation cost by one-third and quality-checking time by 70% per label.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale medical segmentation datasets often combine manual and pseudo-labels of uneven quality, which can compromise training and evaluation. Low-quality labels may hamper performance and make the model training less robust. To address this issue, we propose SegAE (Segmentation Assessment Engine), a lightweight vision-language model (VLM) that automatically predicts label quality across 142 anatomical structures. Trained on over four million image-label pairs with quality scores, SegAE achieves a high correlation coefficient of 0.902 with ground-truth Dice similarity and evaluates a 3D mask in 0.06s. SegAE shows several practical benefits: (I) Our analysis reveals widespread low-quality labeling across public datasets; (II) SegAE improves data efficiency and training performance in active and semi-supervised learning, reducing dataset annotation cost by one-third and quality-checking time by 70% per label. This tool provides a simple and effective solution for quality control in large-scale medical segmentation datasets. The dataset, model weights, and codes are released at https://github.com/Schuture/SegAE.
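The abstract evaluates SegAE by correlating its predicted quality scores with the ground-truth Dice similarity of each mask (r = 0.902). A minimal sketch of that evaluation protocol, with illustrative function names not taken from the SegAE release:

```python
import numpy as np

def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary (e.g. 3D) masks."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    inter = np.logical_and(a, b).sum()
    total = a.sum() + b.sum()
    return 1.0 if total == 0 else 2.0 * inter / total

def pearson_r(pred_quality, gt_dice) -> float:
    """Pearson correlation between predicted quality scores and Dice."""
    x = np.asarray(pred_quality, dtype=float)
    y = np.asarray(gt_dice, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Toy 3D example: a reference mask and a slightly corrupted candidate.
gt = np.zeros((4, 4, 4), dtype=bool)
gt[1:3, 1:3, 1:3] = True
noisy = gt.copy()
noisy[1, 1, 1] = False  # one dropped voxel
print(round(dice(gt, noisy), 3))  # → 0.933
```

A label-quality model is then judged by how close `pearson_r` over many (predicted score, Dice) pairs comes to 1.0.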
Related papers
- Good Enough: Is it Worth Improving your Label Quality? [66.74591380455261]
Higher-quality labels improve in-domain performance, but gains remain unclear below a small threshold.
For pre-training, label quality has minimal impact, suggesting that models transfer general concepts rather than detailed annotations.
arXiv Detail & Related papers (2025-05-27T09:18:24Z) - Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets [51.74296438621836]
We introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels.
The main limitation of scribbles as source for weak supervision is the lack of challenging datasets for scribble segmentation.
Scribbles for All provides scribble labels for several popular segmentation datasets and provides an algorithm to automatically generate scribble labels for any dataset with dense annotations.
arXiv Detail & Related papers (2024-08-22T15:29:08Z) - GuidedNet: Semi-Supervised Multi-Organ Segmentation via Labeled Data Guide Unlabeled Data [4.775846640214768]
Semi-supervised multi-organ medical image segmentation aids physicians in improving disease diagnosis and treatment planning.
A key concept is that voxel features from labeled and unlabeled data that are close to each other in the feature space are more likely to belong to the same class.
We introduce a Knowledge Transfer Cross Pseudo-label Supervision (KT-CPS) strategy, which leverages the prior knowledge obtained from the labeled data to guide the training of the unlabeled data.
arXiv Detail & Related papers (2024-08-09T07:46:01Z) - Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance can contain multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
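Since a multi-label instance cannot use a single argmax pseudo-label, a common workaround (a generic sketch, not the paper's exact metric-adaptive scheme) is per-class dual thresholds: mark a class positive above one threshold, negative below another, and abstain in between:

```python
import numpy as np

def pseudo_labels(probs: np.ndarray, pos_thr: np.ndarray, neg_thr: np.ndarray) -> np.ndarray:
    """Per-class pseudo-labeling for multi-label data.

    probs:   (n_samples, n_classes) predicted probabilities
    pos_thr: per-class positive thresholds (e.g. tuned on labeled data)
    neg_thr: per-class negative thresholds
    Returns an int array: 1 = positive, 0 = negative, -1 = abstain.
    """
    labels = np.full(probs.shape, -1, dtype=int)
    labels[probs >= pos_thr] = 1   # confident positives
    labels[probs <= neg_thr] = 0   # confident negatives
    return labels                  # everything else stays unlabeled

probs = np.array([[0.95, 0.40, 0.02]])
print(pseudo_labels(probs,
                    pos_thr=np.array([0.9, 0.9, 0.9]),
                    neg_thr=np.array([0.1, 0.1, 0.1])))  # → [[ 1 -1  0]]
```

The per-class thresholds are the tunable part; metric-adaptive variants choose them to optimize a target evaluation metric on the labeled split.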
arXiv Detail & Related papers (2024-07-26T09:33:53Z) - Quality Sentinel: Estimating Label Quality and Errors in Medical Segmentation Datasets [11.134987228105162]
We introduce a regression model, Quality Sentinel, to estimate label quality compared with manual annotations in medical segmentation datasets.
This regression model was trained on over 4 million image-label pairs created by us.
Our Quality Sentinel can predict the label quality of 142 body structures.
arXiv Detail & Related papers (2024-06-01T07:03:15Z) - Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement [0.0]
Large-scale datasets for single-label multi-class classification, such as ImageNet-1k, have been instrumental in advancing deep learning and computer vision.
We introduce a lightweight, user-friendly, and scalable framework that synergizes human and machine intelligence for efficient dataset validation and quality enhancement.
By using Multilabelfy on the ImageNetV2 dataset, we found that approximately 47.88% of the images contained at least two labels.
arXiv Detail & Related papers (2024-01-31T10:57:07Z) - Pseudo Label-Guided Data Fusion and Output Consistency for Semi-Supervised Medical Image Segmentation [9.93871075239635]
We propose the PLGDF framework, which builds upon the mean teacher network for segmenting medical images with less annotation.
We propose a novel pseudo-label utilization scheme, which combines labeled and unlabeled data to augment the dataset effectively.
Our framework yields superior performance compared to six state-of-the-art semi-supervised learning methods.
arXiv Detail & Related papers (2023-11-17T06:36:43Z) - COSST: Multi-organ Segmentation with Partially Labeled Datasets Using Comprehensive Supervisions and Self-training [15.639976408273784]
Deep learning models have demonstrated remarkable success in multi-organ segmentation but typically require large-scale datasets with all organs of interest annotated.
It is crucial to investigate how to learn a unified model on the available partially labeled datasets to leverage their synergistic potential.
We propose a novel two-stage framework termed COSST, which effectively and efficiently integrates comprehensive supervision signals with self-training.
arXiv Detail & Related papers (2023-04-27T08:55:34Z) - Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 Million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
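Replacing each class label with a text embedding means a pixel can be classified by nearest-neighbor search over class-description embeddings, which is what enables the zero-shot setting. A minimal sketch with toy vectors (any real system would use embeddings from an actual text encoder):

```python
import numpy as np

def nearest_class(pixel_embs: np.ndarray, class_embs: np.ndarray) -> np.ndarray:
    """Assign each pixel embedding to the class whose description embedding
    has the highest cosine similarity (zero-shot labeling sketch)."""
    p = pixel_embs / np.linalg.norm(pixel_embs, axis=-1, keepdims=True)
    c = class_embs / np.linalg.norm(class_embs, axis=-1, keepdims=True)
    return (p @ c.T).argmax(axis=-1)  # index of the best-matching class

pixels = np.array([[0.9, 0.1], [0.2, 0.8]])   # toy 2-D pixel embeddings
classes = np.array([[1.0, 0.0], [0.0, 1.0]])  # toy class-description embeddings
print(nearest_class(pixels, classes))  # → [0 1]
```

Because classes are points in a shared embedding space rather than fixed indices, datasets with different label sets can be merged without remapping labels.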
arXiv Detail & Related papers (2022-02-04T07:19:09Z) - Label-Assemble: Leveraging Multiple Datasets with Partial Labels [68.46767639240564]
"Label-Assemble" aims to unleash the full potential of partial labels from an assembly of public datasets.
We discovered that learning from negative examples facilitates both computer-aided disease diagnosis and detection.
arXiv Detail & Related papers (2021-09-25T02:48:17Z) - ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Medical Image Segmentation [99.90263375737362]
We propose ATSO, an asynchronous version of teacher-student optimization.
ATSO partitions the unlabeled data into two subsets and alternately uses one subset to fine-tune the model and updates the label on the other subset.
We evaluate ATSO on two popular medical image segmentation datasets and show its superior performance in various semi-supervised settings.
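The alternating two-subset loop described above can be sketched as plain control flow; `finetune` and `relabel` are assumed user-supplied callables standing in for real training and pseudo-label refresh steps:

```python
def atso(model, sub_a, sub_b, finetune, relabel, rounds=4):
    """Sketch of ATSO's asynchronous loop: each round fine-tunes the model on
    one subset, refreshes pseudo-labels on the other subset with the updated
    model, then swaps the subsets' roles for the next round."""
    for _ in range(rounds):
        model = finetune(model, sub_a)   # update the model on subset A
        sub_b = relabel(model, sub_b)    # refresh labels on subset B
        sub_a, sub_b = sub_b, sub_a      # alternate roles next round
    return model

# Toy stand-ins just to exercise the control flow: "training" increments
# a counter and "relabeling" is a no-op.
print(atso(0, [1, 2], [3, 4],
           finetune=lambda m, s: m + 1,
           relabel=lambda m, s: s))  # → 4
```

The key point versus a synchronous teacher-student scheme is that labels on a subset are never refreshed by the same model state that was just trained on them.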
arXiv Detail & Related papers (2020-06-24T04:05:12Z) - 3D medical image segmentation with labeled and unlabeled data using autoencoders at the example of liver segmentation in CT images [58.720142291102135]
This work investigates the potential of autoencoder-extracted features to improve segmentation with a convolutional neural network.
A convolutional autoencoder was used to extract features from unlabeled data and a multi-scale, fully convolutional CNN was used to perform the target task of 3D liver segmentation in CT images.
arXiv Detail & Related papers (2020-03-17T20:20:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.