SeismoFlow -- Data augmentation for the class imbalance problem
- URL: http://arxiv.org/abs/2007.12229v2
- Date: Wed, 2 Sep 2020 14:42:37 GMT
- Title: SeismoFlow -- Data augmentation for the class imbalance problem
- Authors: Ruy Luiz Milidiú and Luis Felipe Müller
- Abstract summary: SeismoFlow is a flow-based generative model to create synthetic samples.
Inspired by the Glow model, it uses interpolation on the learned latent space to produce synthetic samples for one rare class.
We achieve an improvement of 13.9% on the rare class F1-score.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In several application areas, such as medical diagnosis, spam filtering,
fraud detection, and seismic data analysis, it is common to find relevant
classification tasks where some class occurrences are rare. This is the
so-called class imbalance problem, which is a challenge in machine learning. In
this work, we propose SeismoFlow, a flow-based generative model to create
synthetic samples, aiming to address the class imbalance. Inspired by the Glow
model, it uses interpolation on the learned latent space to produce synthetic
samples for one rare class. We apply our approach to the development of a
seismogram signal quality classifier. We introduce a dataset composed
of 5,223 seismograms that are distributed between the good, medium, and bad
classes, with respective frequencies of 66.68%, 31.54%, and 1.76%. Our
methodology is evaluated on a stratified 10-fold cross-validation setting,
using the Miniception model as a baseline, and assessing the effects of adding
the generated samples on the training set of each iteration. In our
experiments, we achieve an improvement of 13.9% on the rare class F1-score,
while not hurting the metric value for the other classes, and thus observe an
overall accuracy improvement. Our empirical findings indicate that our method
can generate high-quality synthetic seismograms with a realistic appearance and
sufficient diversity to help the Miniception model overcome the class
imbalance problem. We believe that our results are a step forward in solving
both the task of seismogram signal quality classification and class imbalance.
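The two mechanisms named in the abstract, interpolation in a learned latent space and adding generated samples only to the training portion of each cross-validation fold, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy affine map stands in for a trained Glow-style flow (SCALE, SHIFT, and all function names are hypothetical), and because that map is linear the interpolation reduces to plain linear blending, which a real nonlinear flow would not do. It is not the authors' implementation.

```python
import random

# Toy stand-in for a trained flow: an invertible elementwise affine map.
# SCALE and SHIFT are hypothetical; SeismoFlow itself is a Glow-style network.
SCALE, SHIFT = 2.0, -1.0

def encode(x):
    """Map a sample to latent space (forward pass of the toy flow)."""
    return [(v - SHIFT) / SCALE for v in x]

def decode(z):
    """Map a latent vector back to sample space (exact inverse of encode)."""
    return [v * SCALE + SHIFT for v in z]

def interpolate(x_a, x_b, alpha):
    """Decode the latent point a fraction alpha of the way from x_a to x_b."""
    z_a, z_b = encode(x_a), encode(x_b)
    z = [(1 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]
    return decode(z)

def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal the shuffled indices of this class round-robin across folds.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds

def augmented_training_set(samples, labels, test_fold, rare, n_new, seed=0):
    """Return real training pairs plus n_new synthetic rare-class pairs.

    Synthetic samples are built only from rare-class samples outside the
    test fold, so the held-out fold contains real data only.
    """
    rng = random.Random(seed)
    held_out = set(test_fold)
    train = [(samples[i], labels[i])
             for i in range(len(samples)) if i not in held_out]
    rare_train = [x for x, y in train if y == rare]
    synthetic = []
    if len(rare_train) >= 2:
        for _ in range(n_new):
            x_a, x_b = rng.sample(rare_train, 2)
            synthetic.append((interpolate(x_a, x_b, rng.random()), rare))
    return train + synthetic
```

In this sketch the synthetic samples are drawn only from rare-class samples in the training split, so the evaluation fold is never used to generate training data.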
Related papers
- Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification [29.730360798234294]
We introduce an Iterative Online Image Synthesis framework to address the class imbalance problem in medical image classification.
Our framework incorporates two key modules, namely Online Image Synthesis (OIS) and Accuracy Adaptive Sampling (AAS).
To evaluate the effectiveness of our proposed method in addressing imbalanced classification, we conduct experiments on the HAM10000 and APTOS datasets.
arXiv Detail & Related papers (2024-03-13T10:51:18Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- Twice Class Bias Correction for Imbalanced Semi-Supervised Learning [59.90429949214134]
We introduce a novel approach called Twice Class Bias Correction (TCBC).
We estimate the class bias of the model parameters during the training process.
We apply a secondary correction to the model's pseudo-labels for unlabeled samples.
arXiv Detail & Related papers (2023-12-27T15:06:36Z)
- Class-Balancing Diffusion Models [57.38599989220613]
Class-Balancing Diffusion Models (CBDM) are trained with a distribution adjustment regularizer as a solution.
We benchmark the generation results on the CIFAR100/CIFAR100LT datasets, and our method shows outstanding performance on the downstream recognition task.
arXiv Detail & Related papers (2023-04-30T20:00:14Z)
- Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard sample mining.
Our method significantly improves and outperforms the state-of-the-art methods on retrieval performances by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z)
- Prototype-Anchored Learning for Learning with Imperfect Annotations [83.7763875464011]
It is challenging to learn unbiased classification models from imperfectly annotated datasets.
We propose a prototype-anchored learning (PAL) method, which can be easily incorporated into various learning-based classification schemes.
We verify the effectiveness of PAL on class-imbalanced learning and noise-tolerant learning by extensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-06-23T10:25:37Z)
- An Empirical Study on the Joint Impact of Feature Selection and Data Resampling on Imbalance Classification [4.506770920842088]
This study focuses on the synergy between feature selection and data resampling for imbalance classification.
We conduct a large number of experiments on 52 publicly available datasets, using 9 feature selection methods, 6 resampling approaches for class imbalance learning, and 3 well-known classification algorithms.
arXiv Detail & Related papers (2021-09-01T06:01:51Z)
- eGAN: Unsupervised approach to class imbalance using transfer learning [8.100450025624443]
Class imbalance is an inherent problem in many machine learning classification tasks.
We explore an unsupervised approach to address these imbalances by leveraging transfer learning from pre-trained image classification models to an encoder-based Generative Adversarial Network (eGAN).
The best result, an F1-score of 0.69, was obtained on the CIFAR-10 classification task with an imbalance ratio of 1:2500.
arXiv Detail & Related papers (2021-04-09T02:37:55Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Oversampling Adversarial Network for Class-Imbalanced Fault Diagnosis [12.526197448825968]
The class-imbalance problem requires a robust learning system that can predict and classify the data in a timely manner.
We propose a new adversarial network for simultaneous classification and fault detection.
arXiv Detail & Related papers (2020-08-07T10:12:07Z)
- Imbalanced Data Learning by Minority Class Augmentation using Capsule Adversarial Networks [31.073558420480964]
We propose a method to restore the balance in imbalanced images, by coalescing two concurrent methods.
In our model, generative and discriminative networks play a novel competitive game.
The coalescing of capsule-GAN is effective at recognizing highly overlapping classes with much fewer parameters compared with the convolutional-GAN.
arXiv Detail & Related papers (2020-04-05T12:36:06Z)