Missingness Augmentation: A General Approach for Improving Generative
Imputation Models
- URL: http://arxiv.org/abs/2108.02566v2
- Date: Thu, 6 Apr 2023 06:05:14 GMT
- Authors: Yufeng Wang, Dan Li, Cong Xu, Min Yang
- Abstract summary: We propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models.
As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks.
Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing data imputation is a fundamental problem in data analysis, and many
studies have been conducted to improve its performance by exploring model
structures and learning procedures. However, data augmentation, as a simple yet
effective method, has not received enough attention in this area. In this
paper, we propose a novel data augmentation method called Missingness
Augmentation (MisA) for generative imputation models. Our approach dynamically
produces incomplete samples at each epoch by utilizing the generator's output,
constraining the augmented samples using a simple reconstruction loss, and
combining this loss with the original loss to form the final optimization
objective. As a general augmentation technique, MisA can be easily integrated
into generative imputation frameworks, providing a simple yet effective way to
enhance their performance. Experimental results demonstrate that MisA
significantly improves the performance of many recently proposed generative
imputation models on a variety of tabular and image datasets. The code is
available at https://github.com/WYu-Feng/Missingness-Augmentation.
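The abstract describes a three-step training loop: impute with the generator, drop entries of the generator's output with a fresh random mask to form an augmented incomplete sample, and add a reconstruction loss on that sample to the original objective. A minimal numpy sketch of one such step follows; the `toy_generator` (a column-mean imputer) and the loss weighting `lam` are hypothetical stand-ins, not the paper's actual model or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(x, mask):
    # Hypothetical stand-in for a trained generative imputer:
    # fills missing entries (mask == 0) with column means of observed values.
    col_sums = np.where(mask == 1, x, 0.0).sum(axis=0)
    col_counts = np.maximum(mask.sum(axis=0), 1)
    col_means = col_sums / col_counts
    return np.where(mask == 1, x, col_means)

def missingness_augmentation_step(x, mask, generator, miss_rate=0.3, lam=0.5):
    """One MisA-style step (sketch, following the abstract's description).

    1) Impute the batch with the generator.
    2) Drop entries of the generator's output with a fresh random mask,
       producing a new incomplete sample for this epoch.
    3) Re-impute the augmented sample and constrain it with a simple
       reconstruction loss against the first imputation.
    4) Combine that loss with the original loss to form the objective.
    """
    x_hat = generator(x, mask)                        # first imputation
    aug_mask = (rng.random(x.shape) > miss_rate).astype(float)
    x_aug_hat = generator(x_hat, aug_mask)            # re-impute augmented sample
    orig_loss = np.mean((mask * (x_hat - x)) ** 2)    # fit observed entries
    aug_loss = np.mean((x_aug_hat - x_hat) ** 2)      # reconstruction constraint
    return orig_loss + lam * aug_loss
```

In an actual training loop this scalar would be backpropagated through the generator; here it only illustrates how the augmented-sample loss is assembled alongside the original one.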
Related papers
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- Regularizing Neural Networks with Meta-Learning Generative Models [40.45689466486025]
We present a novel strategy for generative data augmentation called meta generative regularization (MGR).
MGR utilizes synthetic samples in the regularization term for feature extractors instead of in the loss function, e.g., cross-entropy.
Experiments on six datasets showed that MGR is particularly effective when datasets are small and stably outperforms baselines.
arXiv Detail & Related papers (2023-07-26T01:47:49Z)
- Data Augmentation for Seizure Prediction with Generative Diffusion Model [26.967247641926814]
Seizure prediction is of great importance to improve the life of patients.
The severe imbalance problem between preictal and interictal data still poses a great challenge.
Data augmentation is an intuitive way to solve this problem.
We propose a novel data augmentation method with diffusion model called DiffEEG.
arXiv Detail & Related papers (2023-06-14T05:44:53Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- Reweighting Augmented Samples by Minimizing the Maximal Expected Loss [51.2791895511333]
We construct the maximal expected loss, which is the supremum over any reweighted loss on augmented samples.
Inspired by adversarial training, we minimize this maximal expected loss and obtain a simple and interpretable closed-form solution.
The proposed method can generally be applied on top of any data augmentation methods.
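The "maximal expected loss" above is the supremum of a weighted sum of per-sample losses over all probability weightings; since such a supremum places all weight on the largest loss, it reduces to the per-sample maximum. A minimal sketch of that identity (not the paper's full closed-form solution) follows:

```python
import numpy as np

def maximal_expected_loss(sample_losses):
    # sup over probability weights w of sum_i w_i * L_i is attained by
    # putting all weight on the largest loss, so it equals max_i L_i.
    return float(np.max(sample_losses))

def average_loss(sample_losses):
    # Ordinary uniform-weight objective, for comparison: mean_i L_i.
    return float(np.mean(sample_losses))
```

Minimizing the maximum rather than the mean pushes training toward the worst-performing augmented sample, which is the adversarial-training intuition the summary mentions.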
arXiv Detail & Related papers (2021-03-16T09:31:04Z)
- MAIN: Multihead-Attention Imputation Networks [4.427447378048202]
We propose a novel mechanism based on multi-head attention which can be applied effortlessly in any model.
Our method inductively models patterns of missingness in the input data in order to increase the performance of the downstream task.
arXiv Detail & Related papers (2021-02-10T13:50:02Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.