Dataset Distillation for Medical Dataset Sharing
- URL: http://arxiv.org/abs/2209.14603v2
- Date: Fri, 30 Sep 2022 03:50:35 GMT
- Title: Dataset Distillation for Medical Dataset Sharing
- Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
- Abstract summary: Dataset distillation can synthesize a small dataset such that models trained on it achieve comparable performance with the original large dataset.
Experimental results on a COVID-19 chest X-ray image dataset show that our method can achieve high detection performance even using scarce anonymized chest X-ray images.
- Score: 38.65823547986758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sharing medical datasets between hospitals is challenging because of the
privacy-protection problem and the massive cost of transmitting and storing
many high-resolution medical images. However, dataset distillation can
synthesize a small dataset such that models trained on it achieve comparable
performance with the original large dataset, which shows potential for solving
the existing medical sharing problems. Hence, this paper proposes a novel
dataset distillation-based method for medical dataset sharing. Experimental
results on a COVID-19 chest X-ray image dataset show that our method can
achieve high detection performance even using scarce anonymized chest X-ray
images.
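To make the idea in the abstract concrete, here is a minimal toy sketch of dataset distillation in pure Python. It is not the paper's method: it assumes a one-parameter linear model `y = w * x`, a fixed initialization, and a single training step, and it learns one synthetic point `(xs, ys)` so that a model trained only on that point fits the real data. All names and values (`real_x`, `real_y`, `W0`, `INNER_LR`, `distill`) are hypothetical and chosen for illustration.

```python
# Toy real dataset: 20 points on the line y = 3x (hypothetical data).
real_x = [i / 10 for i in range(1, 21)]
real_y = [3.0 * x for x in real_x]

W0 = 0.0        # fixed model initialization (assumption of this sketch)
INNER_LR = 0.5  # learning rate of the single training step on the synthetic point


def train_one_step(xs, ys):
    """Train y = w * x for one gradient step on the single synthetic point."""
    grad_w = 2.0 * (W0 * xs - ys) * xs  # d/dw of (w * xs - ys)^2 at w = W0
    return W0 - INNER_LR * grad_w


def real_loss(w):
    """Mean squared error of the model y = w * x on the real dataset."""
    return sum((w * x - y) ** 2 for x, y in zip(real_x, real_y)) / len(real_x)


def distill(steps=1000, outer_lr=0.02):
    """Learn one synthetic point by gradient descent through the training step."""
    xs, ys = 1.0, 0.0  # initial synthetic point
    n = len(real_x)
    for _ in range(steps):
        w1 = train_one_step(xs, ys)
        # Gradient of the real-data loss with respect to the trained weight.
        dloss_dw1 = sum(2.0 * (w1 * x - y) * x for x, y in zip(real_x, real_y)) / n
        # Chain rule through w1 = W0 - INNER_LR * 2 * (W0 * xs - ys) * xs.
        dw1_dxs = -INNER_LR * 2.0 * (2.0 * W0 * xs - ys)
        dw1_dys = INNER_LR * 2.0 * xs
        # Simultaneous update of both coordinates of the synthetic point.
        xs, ys = (xs - outer_lr * dloss_dw1 * dw1_dxs,
                  ys - outer_lr * dloss_dw1 * dw1_dys)
    return xs, ys


if __name__ == "__main__":
    xs, ys = distill()
    w = train_one_step(xs, ys)
    print("synthetic point:", xs, ys)
    print("weight after training on it:", w)
    print("loss on real data:", real_loss(w))
```

In this sketch, 20 real points are replaced by one learned point, and the synthetic point need not resemble any real sample. That is the intuition behind the compression and anonymization claims, although this toy example demonstrates neither for real medical images.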
Related papers
- Dataset Distillation in Medical Imaging: A Feasibility Study [16.44272552893816]
Data sharing in the medical image analysis field has potential yet remains underappreciated.
One possible solution is to avoid transferring the entire dataset while still achieving similar model performance.
Recent progress in data distillation within computer science offers promising prospects for sharing medical data efficiently.
arXiv Detail & Related papers (2024-07-19T15:59:04Z)
- EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model [4.057796755073023]
We develop controllable diffusion models for medical image synthesis, called EMIT-Diff.
We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data.
In our approach, we ensure that the synthesized samples adhere to medically relevant constraints.
arXiv Detail & Related papers (2023-10-19T16:18:02Z)
- Detecting Shortcuts in Medical Images -- A Case Study in Chest X-rays [0.22940141855172028]
We present a case study on chest X-rays using two publicly available datasets.
We share annotations for a subset of pneumothorax images with drains.
arXiv Detail & Related papers (2022-11-08T14:36:33Z)
- Compressed Gastric Image Generation Based on Soft-Label Dataset Distillation for Medical Data Sharing [38.65823547986758]
Large sizes of medical datasets, the massive amount of memory of saved deep convolutional neural network (DCNN) models, and patients' privacy protection are problems that can lead to inefficient medical data sharing.
This study proposes a novel soft-label dataset distillation method for medical data sharing.
arXiv Detail & Related papers (2022-09-29T08:52:04Z)
- Soft-Label Anonymous Gastric X-ray Image Distillation [49.24576562557866]
This paper presents a soft-label anonymous gastric X-ray image distillation method based on a gradient descent approach.
Experimental results show that the proposed method can not only effectively compress the medical dataset but also anonymize medical images to protect the patient's private information.
arXiv Detail & Related papers (2021-04-07T02:04:12Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19, the disease caused by the novel coronavirus, has led to a shortage of medical resources.
Different data-driven deep learning models have been developed to support the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
- Learning Invariant Feature Representation to Improve Generalization across Chest X-ray Datasets [55.06983249986729]
We show that a deep learning model performing well when tested on the same dataset as training data starts to perform poorly when it is tested on a dataset from a different source.
By employing an adversarial training strategy, we show that a network can be forced to learn a source-invariant representation.
arXiv Detail & Related papers (2020-08-04T07:41:15Z)
- Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises new challenges.
We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories.
Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.