A Systematic Collection of Medical Image Datasets for Deep Learning
- URL: http://arxiv.org/abs/2106.12864v1
- Date: Thu, 24 Jun 2021 10:00:30 GMT
- Title: A Systematic Collection of Medical Image Datasets for Deep Learning
- Authors: Johann Li, Guangming Zhu, Cong Hua, Mingtao Feng, Basheer Bennamoun,
Ping Li, Xiaoyuan Lu, Juan Song, Peiyi Shen, Xu Xu, Lin Mei, Liang Zhang,
Syed Afaq Ali Shah, Mohammed Bennamoun
- Abstract summary: Deep learning algorithms are data-dependent and require large datasets for training.
The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis.
This paper provides a collection of medical image datasets with their associated challenges for deep learning research.
- Score: 37.476768951211206
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The astounding success made by artificial intelligence (AI) in healthcare and
other fields proves that AI can achieve human-like performance. However,
success always comes with challenges. Deep learning algorithms are
data-dependent and require large datasets for training. The lack of data in the
medical imaging field creates a bottleneck for the application of deep learning
to medical image analysis. Medical image acquisition, annotation, and analysis
are costly, and their usage is constrained by ethical restrictions. They also
require many resources, such as human expertise and funding. That makes it
difficult for non-medical researchers to have access to useful and large
medical data. Thus, as comprehensive as possible, this paper provides a
collection of medical image datasets with their associated challenges for deep
learning research. We have collected information of around three hundred
datasets and challenges mainly reported between 2013 and 2020 and categorized
them into four categories: head & neck, chest & abdomen, pathology & blood, and
"others". Our paper has three purposes: 1) to provide a most up to date and
complete list that can be used as a universal reference to easily find the
datasets for clinical image analysis, 2) to guide researchers on the
methodology to test and evaluate their methods' performance and robustness on
relevant datasets, 3) to provide a "route" to relevant algorithms for the
relevant medical topics, and challenge leaderboards.
Related papers
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- All-in-one platform for AI R&D in medical imaging, encompassing data collection, selection, annotation, and pre-processing [0.6291643559814802]
Deep Learning is advancing medical imaging Research and Development (R&D), leading to the frequent clinical use of Artificial Intelligence/Machine Learning (AI/ML)-based medical devices.
However, to advance AI R&D, two challenges arise: 1) significant data imbalance, with most data from Europe/America and under 10% from Asia, despite its 60% global population share; and 2) hefty time and investment needed to curate datasets for commercial use.
In response, we established the first commercial medical imaging platform, encompassing steps like: 1) data collection, 2) data selection, 3) annotation, and 4) pre-processing.
arXiv Detail & Related papers (2024-03-10T09:24:53Z)
- Medical Image Retrieval Using Pretrained Embeddings [0.6827423171182154]
We show that medical image retrieval is feasible using pretrained networks without any additional training or fine-tuning steps.
Using pretrained embeddings, we achieved a recall of 1 for various tasks at modality, body region, and organ level.
arXiv Detail & Related papers (2023-11-22T17:42:33Z)
- Balancing Privacy and Progress in Artificial Intelligence: Anonymization in Histopathology for Biomedical Research and Education [1.8078387709049526]
Transferring medical data "as open as possible" poses a risk to patient privacy.
Existing regulations push towards keeping medical data "as closed as necessary" to avoid re-identification risks.
This paper explores the legal regulations and terminologies for medical data-sharing.
arXiv Detail & Related papers (2023-07-18T16:53:07Z)
- DeepMediX: A Deep Learning-Driven Resource-Efficient Medical Diagnosis Across the Spectrum [15.382184404673389]
This work presents DeepMediX, a groundbreaking, resource-efficient model that significantly addresses this challenge.
Built on top of the MobileNetV2 architecture, DeepMediX excels in classifying brain MRI scans and skin cancer images.
DeepMediX's design also includes the concept of Federated Learning, enabling a collaborative learning approach without compromising data privacy.
arXiv Detail & Related papers (2023-07-01T12:30:58Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions [66.40971096248946]
In this paper, we collect a series of MedISeg tricks for different model implementation phases.
We experimentally explore the effectiveness of these tricks on consistent baselines.
We also open-sourced a strong MedISeg repository, where each component has the advantage of plug-and-play.
arXiv Detail & Related papers (2022-09-21T12:30:05Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a shortage of global medical resources.
CNNs have been widely utilized and verified in analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- Pathological Visual Question Answering [14.816825480418588]
We need to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer.
Due to privacy concerns, pathology images are usually not publicly available.
It is difficult to hire highly experienced pathologists to create pathology visual questions and answers.
The medical concepts and knowledge covered in pathology question-answer (QA) pairs are very diverse.
arXiv Detail & Related papers (2020-10-06T00:36:55Z)
- Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review [62.490310870300746]
The electrocardiogram (ECG) is one of the most commonly used diagnostic tools in medicine and healthcare.
Deep learning methods have achieved promising results on predictive healthcare tasks using ECG signals.
This paper presents a systematic review of deep learning methods for ECG data from both modeling and application perspectives.
arXiv Detail & Related papers (2019-12-28T02:44:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.