A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal
Multi-Organ Segmentation
- URL: http://arxiv.org/abs/2309.03906v1
- Date: Thu, 7 Sep 2023 17:59:50 GMT
- Title: A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal
Multi-Organ Segmentation
- Authors: Ziyan Huang and Zhongying Deng and Jin Ye and Haoyu Wang and Yanzhou
Su and Tianbin Li and Hui Sun and Junlong Cheng and Jianpin Chen and Junjun
He and Yun Gu and Shaoting Zhang and Lixu Gu and Yu Qiao
- Abstract summary: We introduce A-Eval, a benchmark for the cross-dataset Evaluation ('Eval') of Abdominal ('A') multi-organ segmentation.
We employ training sets from four large-scale public datasets: FLARE22, AMOS, WORD, and TotalSegmentator.
We evaluate the generalizability of various models using the A-Eval benchmark, with a focus on diverse data usage scenarios.
- Score: 38.644744669074775
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although deep learning has revolutionized abdominal multi-organ
segmentation, models often struggle with generalization due to training on
small, specific datasets. With the recent emergence of large-scale datasets,
some important questions arise: Can models trained on these datasets
generalize well on different ones? If yes/no, how to further improve their
generalizability? To address these questions, we introduce A-Eval, a benchmark
for the cross-dataset Evaluation ('Eval') of Abdominal ('A') multi-organ
segmentation. We employ training sets from four large-scale public datasets:
FLARE22, AMOS, WORD, and TotalSegmentator, each providing extensive labels for
abdominal multi-organ segmentation. For evaluation, we incorporate the
validation sets from these datasets along with the training set from the BTCV
dataset, forming a robust benchmark comprising five distinct datasets. We
evaluate the generalizability of various models using the A-Eval benchmark,
with a focus on diverse data usage scenarios: training on individual datasets
independently, utilizing unlabeled data via pseudo-labeling, mixing different
modalities, and joint training across all available datasets. Additionally, we
explore the impact of model sizes on cross-dataset generalizability. Through
these analyses, we underline the importance of effective data usage in
enhancing models' generalization capabilities, offering valuable insights for
assembling large-scale datasets and improving training strategies. The code and
pre-trained models are available at
https://github.com/uni-medical/A-Eval.
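The cross-dataset protocol described in the abstract can be pictured as a grid of (training set, evaluation set) pairs. The following is a minimal sketch of that idea, not the authors' implementation: the Dice overlap metric, the `predict` callback, the toy label sequences, and all function names are assumptions made for illustration; only the dataset names come from the abstract.

```python
from itertools import product

# Dataset names from the abstract; everything else below is hypothetical.
TRAIN_SETS = ["FLARE22", "AMOS", "WORD", "TotalSegmentator"]
EVAL_SETS = TRAIN_SETS + ["BTCV"]  # four validation sets plus the BTCV training set


def dice(pred, target, label):
    """Dice overlap for one organ label over two flat label sequences."""
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_count + target_count == 0:
        return 1.0  # assumption: an organ absent from both counts as agreement
    return 2.0 * overlap / (pred_count + target_count)


def mean_dice(pred, target, labels):
    """Average the per-organ Dice over the given organ labels."""
    return sum(dice(pred, target, label) for label in labels) / len(labels)


def evaluation_grid(predict, cases, labels):
    """Score every (training set, evaluation set) pair.

    predict(train_name, image) returns a flat label sequence (hypothetical model hook).
    cases maps each evaluation set name to a non-empty list of (image, target) pairs.
    """
    grid = {}
    for train_name, eval_name in product(TRAIN_SETS, EVAL_SETS):
        scores = [
            mean_dice(predict(train_name, image), target, labels)
            for image, target in cases[eval_name]
        ]
        grid[(train_name, eval_name)] = sum(scores) / len(scores)
    return grid


# Toy usage: the "image" is already its own label map, so an identity
# stand-in model reproduces the target exactly.
toy_cases = {name: [([0, 1, 1, 2], [0, 1, 1, 2])] for name in EVAL_SETS}
grid = evaluation_grid(lambda train_name, image: image, toy_cases, labels=[1, 2])
```

Off-diagonal cells of such a grid (for example, trained on WORD and evaluated on BTCV) are the ones that reflect cross-dataset generalization; the other data usage scenarios in the abstract would add further rows for pseudo-labeled, mixed-modality, and jointly trained models.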
Related papers
- Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs [48.406728896785296]
We propose a novel approach to automatically construct a unified label space across multiple datasets using graph neural networks.
Unlike existing methods, our approach facilitates seamless training without the need for additional manual reannotation or taxonomy reconciliation.
arXiv Detail & Related papers (2024-07-15T08:42:10Z)
- UniCL: A Universal Contrastive Learning Framework for Large Time Series Models [18.005358506435847]
Time-series analysis plays a pivotal role across a range of critical applications, from finance to healthcare.
Traditional supervised learning methods require extensive label annotation of time-series data for each task.
This paper introduces UniCL, a universal and scalable contrastive learning framework designed for pretraining time-series foundation models.
arXiv Detail & Related papers (2024-05-17T07:47:11Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z)
- MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
- Cross-Dataset Collaborative Learning for Semantic Segmentation [17.55660581677053]
We present a simple, flexible, and general method for semantic segmentation, termed Cross-Dataset Collaborative Learning (CDCL).
Given multiple labeled datasets, we aim to improve the generalization and discrimination of feature representations on each dataset.
We conduct extensive evaluations on four diverse datasets, i.e., Cityscapes, BDD100K, CamVid, and COCO Stuff, with single-dataset and cross-dataset settings.
arXiv Detail & Related papers (2021-03-21T09:59:47Z)
- Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)