A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal
Multi-Organ Segmentation
- URL: http://arxiv.org/abs/2309.03906v1
- Date: Thu, 7 Sep 2023 17:59:50 GMT
- Title: A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal
Multi-Organ Segmentation
- Authors: Ziyan Huang and Zhongying Deng and Jin Ye and Haoyu Wang and Yanzhou
Su and Tianbin Li and Hui Sun and Junlong Cheng and Jianpin Chen and Junjun
He and Yun Gu and Shaoting Zhang and Lixu Gu and Yu Qiao
- Abstract summary: We introduce A-Eval, a benchmark for the cross-dataset Evaluation ('Eval') of Abdominal ('A') multi-organ segmentation.
We employ training sets from four large-scale public datasets: FLARE22, AMOS, WORD, and TotalSegmentator.
We evaluate the generalizability of various models using the A-Eval benchmark, with a focus on diverse data usage scenarios.
- Score: 38.644744669074775
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although deep learning has revolutionized abdominal multi-organ
segmentation, models often struggle with generalization due to training on
small, specific datasets. With the recent emergence of large-scale datasets,
some important questions arise: Can models trained on these datasets
generalize well to different ones? Whether or not they can, how can their
generalizability be further improved? To address these questions, we introduce A-Eval, a benchmark
for the cross-dataset Evaluation ('Eval') of Abdominal ('A') multi-organ
segmentation. We employ training sets from four large-scale public datasets:
FLARE22, AMOS, WORD, and TotalSegmentator, each providing extensive labels for
abdominal multi-organ segmentation. For evaluation, we incorporate the
validation sets from these datasets along with the training set from the BTCV
dataset, forming a robust benchmark comprising five distinct datasets. We
evaluate the generalizability of various models using the A-Eval benchmark,
with a focus on diverse data usage scenarios: training on individual datasets
independently, utilizing unlabeled data via pseudo-labeling, mixing different
modalities, and joint training across all available datasets. Additionally, we
explore the impact of model sizes on cross-dataset generalizability. Through
these analyses, we underline the importance of effective data usage in
enhancing models' generalization capabilities, offering valuable insights for
assembling large-scale datasets and improving training strategies. The code and
pre-trained models are available at
https://github.com/uni-medical/A-Eval.
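The cross-dataset protocol described in the abstract (one model per training set, each scored on all five evaluation sets) can be sketched as a score matrix. This is a minimal illustration and not the authors' code: the dataset names come from the abstract, while the Dice metric, the function names, and the `predict_fns` and `eval_cases` inputs are assumptions made for the example.

```python
import numpy as np

# Dataset names are taken from the abstract; everything else is illustrative.
TRAIN_SETS = ["FLARE22", "AMOS", "WORD", "TotalSegmentator"]
EVAL_SETS = TRAIN_SETS + ["BTCV"]


def dice_score(pred, target, label):
    """Dice overlap for one organ label between two integer label maps."""
    p = pred == label
    t = target == label
    denom = p.sum() + t.sum()
    if denom == 0:
        return float("nan")  # organ absent from both maps: leave it out
    return 2.0 * np.logical_and(p, t).sum() / denom


def mean_dice(pred, target, labels):
    """Average Dice over the given organ labels, ignoring absent organs."""
    scores = [dice_score(pred, target, label) for label in labels]
    return float(np.nanmean(scores))


def cross_dataset_matrix(predict_fns, eval_cases, labels):
    """Rows: training set of the model; columns: evaluation set.

    predict_fns: {train_name: callable(image) -> label map}   (hypothetical)
    eval_cases:  {eval_name: list of (image, ground_truth)}   (hypothetical)
    """
    matrix = np.full((len(TRAIN_SETS), len(EVAL_SETS)), np.nan)
    for i, train_name in enumerate(TRAIN_SETS):
        predict = predict_fns[train_name]
        for j, eval_name in enumerate(EVAL_SETS):
            scores = [mean_dice(predict(image), truth, labels)
                      for image, truth in eval_cases[eval_name]]
            matrix[i, j] = np.mean(scores)
    return matrix
```

In such a matrix, the entries where the training and evaluation sets share a name measure in-dataset performance, and the remaining entries (including the whole BTCV column) measure cross-dataset generalization.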
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, imposing a negative impact on training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z)
- Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$\infty$ is a new multi-way and adaptive data selection approach for Lifelong Instruction Tuning.
We construct pseudo-skill clusters by grouping gradient-based sample vectors.
We select the best-performing data selector for each skill cluster from a pool of selector experts.
arXiv Detail & Related papers (2024-10-14T15:48:09Z)
- BEVal: A Cross-dataset Evaluation Study of BEV Segmentation Models for Autonomous Driving [3.4113606473878386]
We conduct a comprehensive cross-dataset evaluation of state-of-the-art BEV segmentation models.
We investigate the influence of different sensors, such as cameras and LiDAR, on the models' ability to generalize.
arXiv Detail & Related papers (2024-08-29T07:49:31Z)
- Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs [48.406728896785296]
We propose a novel approach to automatically construct a unified label space across multiple datasets using graph neural networks.
Unlike existing methods, our approach facilitates seamless training without the need for additional manual reannotation or taxonomy reconciliation.
arXiv Detail & Related papers (2024-07-15T08:42:10Z)
- Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z)
- MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
- Cross-Dataset Collaborative Learning for Semantic Segmentation [17.55660581677053]
We present a simple, flexible, and general method for semantic segmentation, termed Cross-Dataset Collaborative Learning (CDCL).
Given multiple labeled datasets, we aim to improve the generalization and discrimination of feature representations on each dataset.
We conduct extensive evaluations on four diverse datasets, i.e., Cityscapes, BDD100K, CamVid, and COCO Stuff, with single-dataset and cross-dataset settings.
arXiv Detail & Related papers (2021-03-21T09:59:47Z)
- Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)