Iterative Online Image Synthesis via Diffusion Model for Imbalanced
Classification
- URL: http://arxiv.org/abs/2403.08407v1
- Date: Wed, 13 Mar 2024 10:51:18 GMT
- Title: Iterative Online Image Synthesis via Diffusion Model for Imbalanced
Classification
- Authors: Shuhan Li, Yi Lin, Hao Chen, Kwang-Ting Cheng
- Abstract summary: We introduce an Iterative Online Image Synthesis framework to address the class imbalance problem in medical image classification.
Our framework incorporates two key modules, namely Online Image Synthesis (OIS) and Accuracy Adaptive Sampling (AAS).
To evaluate the effectiveness of our proposed method in addressing imbalanced classification, we conduct experiments on the HAM10000 and APTOS datasets.
- Score: 29.730360798234294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and robust classification of diseases is important for proper
diagnosis and treatment. However, medical datasets often face challenges
related to limited sample sizes and inherent imbalanced distributions, due to
difficulties in data collection and variations in disease prevalence across
different types. In this paper, we introduce an Iterative Online Image
Synthesis (IOIS) framework to address the class imbalance problem in medical
image classification. Our framework incorporates two key modules, namely Online
Image Synthesis (OIS) and Accuracy Adaptive Sampling (AAS), which collectively
target the imbalanced classification issue at both the instance level and the
class level. The OIS module alleviates the data insufficiency problem by
generating representative samples tailored for online training of the
classifier. On the other hand, the AAS module dynamically balances the
synthesized samples among various classes, targeting those with low training
accuracy. To evaluate the effectiveness of our proposed method in addressing
imbalanced classification, we conduct experiments on the HAM10000 and APTOS
datasets. The results obtained demonstrate the superiority of our approach over
state-of-the-art methods as well as the effectiveness of each component. The
source code will be released upon acceptance.
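The abstract describes AAS only at a high level: synthesized samples are rebalanced across classes, with more going to classes whose training accuracy is low. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation. The function name `allocate_synthetic_samples`, the error-rate weighting (1 - accuracy), and the class names are hypothetical choices made for illustration; the actual sampling rule is not given in this summary.

```python
from typing import Dict


def allocate_synthetic_samples(class_accuracy: Dict[str, float],
                               budget: int) -> Dict[str, int]:
    """Split `budget` synthetic images across classes, favoring classes
    with low training accuracy.

    Assumption: the weight of a class is its error rate (1 - accuracy).
    This weighting is illustrative, not taken from the paper.
    """
    if not class_accuracy:
        return {}

    # Error rate per class, clipped at zero in case accuracy exceeds 1.
    errors = {c: max(0.0, 1.0 - acc) for c, acc in class_accuracy.items()}
    total_error = sum(errors.values())

    if total_error == 0:
        # Every class is already perfectly classified: split uniformly.
        base = budget // len(errors)
        counts = {c: base for c in errors}
    else:
        # Proportional share, rounded down.
        counts = {c: int(budget * e / total_error) for c, e in errors.items()}

    # Hand out what rounding left over, highest-error classes first.
    leftover = budget - sum(counts.values())
    for c in sorted(errors, key=errors.get, reverse=True)[:leftover]:
        counts[c] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical per-class training accuracies after one epoch.
    accuracies = {"class_a": 0.9, "class_b": 0.6, "class_c": 0.5}
    print(allocate_synthetic_samples(accuracies, budget=100))
```

In an iterative setup, such an allocation would be recomputed whenever the per-class training accuracy is refreshed, so the synthesis budget follows the classifier's current weaknesses rather than a fixed class prior.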
Related papers
- Debiasing Cardiac Imaging with Controlled Latent Diffusion Models [1.802269171647208]
We propose a method to alleviate imbalances inherent in datasets through the generation of synthetic data.
We adopt ControlNet based on a denoising diffusion probabilistic model to condition on text assembled from patient metadata and cardiac geometry.
Our experiments demonstrate the effectiveness of the proposed approach in mitigating dataset imbalances.
arXiv Detail & Related papers (2024-03-28T15:41:43Z)
- Few-shot learning for COVID-19 Chest X-Ray Classification with
Imbalanced Data: An Inter vs. Intra Domain Study [49.5374512525016]
Medical image datasets are essential for training models used in computer-aided diagnosis, treatment planning, and medical research.
Some challenges are associated with these datasets, including variability in data distribution, data scarcity, and transfer learning issues when using models pre-trained from generic images.
We propose a methodology based on Siamese neural networks in which a series of techniques are integrated to mitigate the effects of data scarcity and distribution imbalance.
arXiv Detail & Related papers (2024-01-18T16:59:27Z)
- MCRAGE: Synthetic Healthcare Data for Fairness [3.0089659534785853]
We propose Minority Class Rebalancing through Augmentation by Generative modeling (MCRAGE) to augment imbalanced datasets.
MCRAGE involves training a Conditional Denoising Diffusion Probabilistic Model (CDDPM) capable of generating high-quality synthetic EHR samples from underrepresented classes.
We use this synthetic data to augment the existing imbalanced dataset, resulting in a more balanced distribution across all classes.
arXiv Detail & Related papers (2023-10-27T19:02:22Z)
- Class-Specific Distribution Alignment for Semi-Supervised Medical Image
Classification [14.343079589464994]
Class-Specific Distribution Alignment (CSDA) is a semi-supervised learning framework based on self-training.
We show that our method provides competitive performance on semi-supervised skin disease, thoracic disease, and endoscopic image classification tasks.
arXiv Detail & Related papers (2023-07-29T13:38:19Z)
- SPLAL: Similarity-based pseudo-labeling with alignment loss for
semi-supervised medical image classification [11.435826510575879]
Semi-supervised learning (SSL) methods can mitigate challenges by leveraging both labeled and unlabeled data.
SSL methods for medical image classification need to address two key challenges: (1) estimating reliable pseudo-labels for the images in the unlabeled dataset and (2) reducing biases caused by class imbalance.
In this paper, we propose a novel SSL approach, SPLAL, that effectively addresses these challenges.
arXiv Detail & Related papers (2023-07-10T14:53:24Z)
- Class-Balancing Diffusion Models [57.38599989220613]
Class-Balancing Diffusion Models (CBDM) are trained with a distribution adjustment regularizer as a solution.
We benchmark the generation results on the CIFAR100/CIFAR100LT datasets, and our method shows outstanding performance on the downstream recognition task.
arXiv Detail & Related papers (2023-04-30T20:00:14Z)
- Dynamic Bank Learning for Semi-supervised Federated Image Diagnosis with
Class Imbalance [65.61909544178603]
We study a practical yet challenging problem of class imbalanced semi-supervised FL (imFed-Semi).
This imFed-Semi problem is addressed by a novel dynamic bank learning scheme, which improves client training by exploiting class proportion information.
We evaluate our approach on two public real-world medical datasets, including the intracranial hemorrhage diagnosis with 25,000 CT slices and skin lesion diagnosis with 10,015 dermoscopy images.
arXiv Detail & Related papers (2022-06-27T06:51:48Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features
from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- Cross-Site Severity Assessment of COVID-19 from CT Images via Domain
Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images greatly helps the estimation of intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z)
- SeismoFlow -- Data augmentation for the class imbalance problem [0.0]
SeismoFlow is a flow-based generative model to create synthetic samples.
Inspired by the Glow model, it uses the learned latent space to produce synthetic samples for one rare class.
We achieve an improvement of 13.9% on the rare class F1-score.
arXiv Detail & Related papers (2020-07-23T19:48:23Z)
- Semi-supervised Medical Image Classification with Relation-driven
Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.