Fugu-MT Paper Translation (Summary): Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Paper summary: Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

arxiv url: http://arxiv.org/abs/2603.27460v1
Date: Sun, 29 Mar 2026 00:46:53 GMT
Status: Translation complete
System update date: 2026-03-31 23:18:44.970921
Title: Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development
Title (reference translation): Project Imaging-X: A survey of more than 1,000 medical imaging datasets for foundation model development
Authors: Zhongying Deng, Cheng Tang, Ziyan Huang, Jiashi Lin, Ying Chen, Junzhi Ning, Chenglong Ma, Jiyao Liu, Wei Li, Yinghao Zhu, Shujian Gao, Yanyan Huang, Sibo Ju, Yanzhou Su, Pengcheng Chen, Wenhao Tang, Tianbin Li, Haoyu Wang, Yuanfeng Ji, Hui Sun, Shaobo Min, Liang Peng, Feilong Tang, Haochen Xue, Rulin Zhou, Chaoyang Zhang, Wenjie Li, Shaohao Rui, Weijie Ma, Xingyue Zhao, Yibin Wang, Kun Yuan, Zhaohui Lu, Shujun Wang, Jinjie Wei, Lihao Liu, Dingkang Yang, Lin Wang, Yulong Li, Haolin Yang, Yiqing Shen, Lequan Yu, Xiaowei Hu, Yun Gu, Yicheng Wu, Benyou Wang, Minghui Zhang, Angelica I. Aviles-Rivero, Qi Gao, Hongming Shan, Xiaoyu Ren, Fang Yan, Hongyu Zhou, Haodong Duan, Maosong Cao, Shanshan Wang, Bin Fu, Xiaomeng Li, Zhi Hou, Chunfeng Song, Lei Bai, Yuan Cheng, Yuandong Pu, Xiang Li, Wenhai Wang, Hao Chen, Jiaxin Zhuang, Songyang Zhang, Huiguang He, Mengzhang Li, Bohan Zhuang, Zhian Bai, Rongshan Yu, Liansheng Wang, Yukun Zhou, Xiaosong Wang, Xin Guo, Guanbin Li, Xiangru Lin, Dakai Jin, Mianxin Liu, Wenlong Zhang, Qi Qin, Conghui He, Yuqiang Li, Ye Luo, Nanqing Dong, Jie Xu, Wenqi Shao, Bo Zhang, Qiujuan Yan, Yihao Liu, Jun Ma, Zhi Lu, Yuewen Cao, Zongwei Zhou, Jianming Liang, Shixiang Tang, Qi Duan, Dongzhan Zhou, Chen Jiang, Yuyin Zhou, Yanwu Xu, Jiancheng Yang, Shaoting Zhang, Xiaohong Liu, Siqi Luo, Yi Xin, Chaoyu Liu, Haochen Wen, Xin Chen, Alejandro Lozano, Min Woo Sun, Yuhui Zhang, Yue Yao, Xiaoxiao Sun, Serena Yeung-Levy, Xia Li, Jing Ke, Chunhui Zhang, Zongyuan Ge, Ming Hu, Jin Ye, Zhifeng Li, Yirong Chen, Yu Qiao, Junjun He
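Abstract summary: We present the largest survey to date of medical image datasets, covering more than 1,000 open-access datasets. Our analysis reveals a landscape that is modest in scale, fragmented across narrowly scoped tasks, and unevenly distributed across organs and modalities. We propose a metadata-driven fusion paradigm (MDFP).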
Reference score (independently computed attention level): 314.80153557710616
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Foundation models have demonstrated remarkable success across diverse domains and tasks, primarily due to the growth of large-scale, diverse, and high-quality datasets. However, in the field of medical imaging, the curation and assembly of such medical datasets are highly challenging due to the reliance on clinical expertise and strict ethical and privacy constraints, resulting in a scarcity of large-scale unified medical datasets and hindering the development of powerful medical foundation models. In this work, we present the largest survey to date of medical image datasets, covering over 1,000 open-access datasets with a systematic catalog of their modalities, tasks, anatomies, annotations, limitations, and potential for integration. Our analysis exposes a landscape that is modest in scale, fragmented across narrowly scoped tasks, and unevenly distributed across organs and modalities, which in turn limits the utility of existing medical image datasets for developing versatile and robust medical foundation models. To turn fragmentation into scale, we propose a metadata-driven fusion paradigm (MDFP) that integrates public datasets with shared modalities or tasks, thereby transforming multiple small data silos into larger, more coherent resources. Building on MDFP, we release an interactive discovery portal that enables end-to-end, automated medical image dataset integration, and compile all surveyed datasets into a unified, structured table that clearly summarizes their key characteristics and provides reference links, offering the community an accessible and comprehensive repository. By charting the current terrain and offering a principled path to dataset consolidation, our survey provides a practical roadmap for scaling medical imaging corpora, supporting faster data discovery, more principled dataset creation, and more capable medical foundation models.
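Abstract (reference translation): Foundation models have achieved remarkable success across a variety of domains and tasks, thanks to the growth of large-scale, diverse, high-quality datasets. In medical imaging, however, curating and assembling such datasets is extremely difficult because it depends on clinical expertise and is subject to strict ethical and privacy constraints. This leads to a shortage of large-scale unified medical datasets and hinders the development of powerful medical foundation models. In this work, we cover more than 1,000 open-access datasets and build a systematic catalog of their modalities, tasks, anatomies, annotations, limitations, and potential for integration. Our analysis exposes a landscape that is modest in scale, fragmented across narrowly scoped tasks, and unevenly distributed across organs and modalities, which limits the usefulness of existing medical image datasets for developing versatile and robust medical foundation models. To turn fragmentation into scale, we propose a metadata-driven fusion paradigm (MDFP) that integrates public datasets sharing a modality or task, transforming multiple small data silos into larger, more coherent resources. Building on MDFP, we release an interactive discovery portal that enables end-to-end, automated medical image dataset integration, and we compile all surveyed datasets into a unified, structured table that clearly summarizes their key characteristics and provides reference links, giving the community an accessible and comprehensive repository. By charting the current terrain and offering a principled path to dataset consolidation, our survey provides a practical roadmap for scaling medical imaging corpora, supporting faster data discovery, more principled dataset creation, and more capable medical foundation models.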
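
Illustrative sketch (not from the paper): the abstract describes MDFP only at a high level, as integrating public datasets that share a modality or task. The Python below shows one simple way such metadata-based grouping could look. The `DatasetRecord` fields, the helper names, and the example rows are all hypothetical; they are not the survey's schema, data, or implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata row; field names are illustrative only."""
    name: str
    modality: str
    task: str
    num_images: int


def group_by_shared_key(records, key):
    """Group records that share the same value of `key` ('modality' or 'task')."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, key)].append(record)
    return dict(groups)


def fused_sizes(groups):
    """Total image count per group, i.e. the size of each merged pool."""
    return {k: sum(r.num_images for r in members) for k, members in groups.items()}


# Made-up example records (not taken from the survey).
records = [
    DatasetRecord("toy_ct_a", "CT", "segmentation", 120),
    DatasetRecord("toy_ct_b", "CT", "classification", 300),
    DatasetRecord("toy_mri_a", "MRI", "segmentation", 80),
]

by_modality = group_by_shared_key(records, "modality")
print(fused_sizes(by_modality))  # {'CT': 420, 'MRI': 80}
```

Grouping the same records by "task" instead yields a different partition, which is the sense in which small silos can be pooled along whichever metadata field they share. A real pipeline would also need to reconcile label sets, file formats, and licenses, none of which this sketch attempts.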
