All-in-one platform for AI R&D in medical imaging, encompassing data
collection, selection, annotation, and pre-processing
- URL: http://arxiv.org/abs/2403.06145v1
- Date: Sun, 10 Mar 2024 09:24:53 GMT
- Title: All-in-one platform for AI R&D in medical imaging, encompassing data
collection, selection, annotation, and pre-processing
- Authors: Changhee Han, Kyohei Shibano, Wataru Ozaki, Keishiro Osaki, Takafumi
Haraguchi, Daisuke Hirahara, Shumon Kimura, Yasuyuki Kobayashi, Gento Mogi
- Abstract summary: Deep Learning is advancing medical imaging Research and Development (R&D), leading to the frequent clinical use of Artificial Intelligence/Machine Learning (AI/ML)-based medical devices.
However, to advance AI R&D, two challenges arise: 1) significant data imbalance, with most data from Europe/America and under 10% from Asia, despite its 60% global population share; and 2) hefty time and investment needed to curate datasets for commercial use.
In response, we established the first commercial medical imaging platform, encompassing steps like: 1) data collection, 2) data selection, 3) annotation, and 4) pre-processing.
- Score: 0.6291643559814802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning is advancing medical imaging Research and Development (R&D),
leading to the frequent clinical use of Artificial Intelligence/Machine
Learning (AI/ML)-based medical devices. However, to advance AI R&D, two
challenges arise: 1) significant data imbalance, with most data from
Europe/America and under 10% from Asia, despite its 60% global population
share; and 2) hefty time and investment needed to curate proprietary datasets
for commercial use. In response, we established the first commercial medical
imaging platform, encompassing steps like: 1) data collection, 2) data
selection, 3) annotation, and 4) pre-processing. Moreover, we focus on
harnessing under-represented data from Japan and broader Asia, including
Computed Tomography, Magnetic Resonance Imaging, and Whole Slide Imaging scans.
Using the collected data, we are preparing/providing ready-to-use datasets for
medical AI R&D by 1) offering these datasets to AI firms, biopharma, and
medical device makers and 2) using them as training/test data to develop
tailored AI solutions for such entities. We also aim to merge Blockchain for
data security and plan to synthesize rare disease data via generative AI.
DataHub Website: https://medical-datahub.ai/
Related papers
- Embracing Massive Medical Data [8.458637345001758]
We propose an online learning method that enables training AI from massive medical data.
Our method identifies the most significant samples for the current AI model based on their data uniqueness and prediction uncertainty.
Compared with prevalent training paradigms, our method not only improves data efficiency by enabling training on continual data streams, but also mitigates catastrophic forgetting.
arXiv Detail & Related papers (2024-07-05T17:50:30Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- A Revolution of Personalized Healthcare: Enabling Human Digital Twin with Mobile AIGC [54.74071593520785]
Mobile AIGC can be a key enabling technology for an emerging application called the human digital twin (HDT).
HDT empowered by mobile AIGC is expected to revolutionize personalized healthcare by generating rare disease data, modeling high-fidelity digital twins, building versatile testbeds, and providing 24/7 customized medical services.
arXiv Detail & Related papers (2023-07-22T15:59:03Z)
- Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI.
In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals.
We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z)
- Data-centric AI: Perspectives and Challenges [51.70828802140165]
Data-centric AI (DCAI) advocates a fundamental shift from model advancements to ensuring data quality and reliability.
We bring together three general missions: training data development, inference data development, and data maintenance.
arXiv Detail & Related papers (2023-01-12T05:28:59Z)
- Non-Imaging Medical Data Synthesis for Trustworthy AI: A Comprehensive Survey [6.277848092408045]
Data quality is the key factor for the development of trustworthy AI in healthcare.
Access to good quality datasets is limited by the technical difficulty of data acquisition.
Large-scale sharing of healthcare data is hindered by strict ethical restrictions.
arXiv Detail & Related papers (2022-09-17T13:34:17Z)
- Robust and Efficient Medical Imaging with Self-Supervision [80.62711706785834]
We present REMEDIS, a unified representation learning strategy to improve robustness and data-efficiency of medical imaging AI.
We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data.
arXiv Detail & Related papers (2022-05-19T17:34:18Z)
- 2021 BEETL Competition: Advancing Transfer Learning for Subject Independence & Heterogenous EEG Data Sets [89.84774119537087]
We design two transfer learning challenges around diagnostics and Brain-Computer Interfacing (BCI).
Task 1 is centred on medical diagnostics, addressing automatic sleep stage annotation across subjects.
Task 2 is centred on Brain-Computer Interfacing (BCI), addressing motor imagery decoding across both subjects and data sets.
arXiv Detail & Related papers (2022-02-14T12:12:20Z)
- A Methodology for a Scalable, Collaborative, and Resource-Efficient Platform to Facilitate Healthcare AI Research [0.0]
We present a system to accelerate data acquisition, dataset development and analysis, and AI model development.
This system can ingest 15,000 patient records per hour, where each record represents thousands of measurements, text notes, and high resolution data.
arXiv Detail & Related papers (2021-12-13T18:39:10Z)
- A Systematic Collection of Medical Image Datasets for Deep Learning [37.476768951211206]
Deep learning algorithms are data-dependent and require large datasets for training.
The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis.
This paper provides a collection of medical image datasets with their associated challenges for deep learning research.
arXiv Detail & Related papers (2021-06-24T10:00:30Z)