IFGAN: Missing Value Imputation using Feature-specific Generative
Adversarial Networks
- URL: http://arxiv.org/abs/2012.12581v1
- Date: Wed, 23 Dec 2020 10:14:35 GMT
- Title: IFGAN: Missing Value Imputation using Feature-specific Generative
Adversarial Networks
- Authors: Wei Qiu, Yangsibo Huang, Quanzheng Li
- Abstract summary: We propose IFGAN, a missing value imputation algorithm based on Feature-specific Generative Adversarial Networks (GAN)
A feature-specific generator is trained to impute missing values, while a discriminator is expected to distinguish the imputed values from observed ones.
We empirically show on several real-life datasets that IFGAN outperforms current state-of-the-art algorithm under various missing conditions.
- Score: 14.714106979097222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing value imputation is a challenging and well-researched topic in data
mining. In this paper, we propose IFGAN, a missing value imputation algorithm
based on Feature-specific Generative Adversarial Networks (GAN). Our idea is
intuitive yet effective: a feature-specific generator is trained to impute
missing values, while a discriminator is expected to distinguish the imputed
values from observed ones. The proposed architecture is capable of handling
different data types, data distributions, missing mechanisms, and missing
rates. It also improves post-imputation analysis by preserving inter-feature
correlations. We empirically show on several real-life datasets that IFGAN
outperforms current state-of-the-art algorithm under various missing
conditions.
Related papers
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z) - Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) is to find a small subset of the input graph's features.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z) - Multiple Imputation via Generative Adversarial Network for
High-dimensional Blockwise Missing Value Problems [6.123324869194195]
We propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (in specific, a GAN-based) multiple imputation method.
MI-GAN shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets.
In particular, MI-GAN significantly outperforms other imputation methods in the sense of statistical inference and computational speed.
arXiv Detail & Related papers (2021-12-21T20:19:37Z) - MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism.
We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z) - Convolutional generative adversarial imputation networks for
spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GANs) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method as Con Conval Generative Adversarial Imputation Nets (Conv-GAIN)
arXiv Detail & Related papers (2021-11-03T03:50:48Z) - Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As results show, the proposed strategies perform better than the classification based on observed data and allow to keep a high accuracy even when the missing data ratio increases.
arXiv Detail & Related papers (2021-10-19T14:24:50Z) - Categorical EHR Imputation with Generative Adversarial Nets [11.171712535005357]
We propose a simple and yet effective approach that is based on previous work on GANs for data imputation.
We show that our imputation approach largely improves the prediction accuracy, compared to more traditional data imputation approaches.
arXiv Detail & Related papers (2021-08-03T18:50:26Z) - TELESTO: A Graph Neural Network Model for Anomaly Classification in
Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied on IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z) - Imputation of Missing Data with Class Imbalance using Conditional
Generative Adversarial Networks [24.075691766743702]
We propose a new method for imputing missing data based on its class-specific characteristics.
Our Conditional Generative Adversarial Imputation Network (CGAIN) imputes the missing data using class-specific distributions.
We tested our approach on benchmark datasets and achieved superior performance compared with the state-of-the-art and popular imputation approaches.
arXiv Detail & Related papers (2020-12-01T02:26:54Z) - PC-GAIN: Pseudo-label Conditional Generative Adversarial Imputation
Networks for Incomplete Data [19.952411963344556]
PC-GAIN is a novel unsupervised missing data imputation method named PC-GAIN.
We first propose a pre-training procedure to learn potential category information contained in a subset of low-missing-rate data.
Then an auxiliary classifier is determined using the synthetic pseudo-labels.
arXiv Detail & Related papers (2020-11-16T08:08:26Z) - Missing Features Reconstruction Using a Wasserstein Generative
Adversarial Imputation Network [0.0]
We experimentally research the use of generative and non-generative models for feature reconstruction.
Generative Autoencoder with Arbitrary Conditioning (VAEAC) and Generative Adversarial Imputation Network (GAIN) were researched as representatives of generative models.
We introduce WGAIN as the Wasserstein modification of GAIN, which turns out to be the best imputation model when the degree of missingness is less than or equal to 30%.
arXiv Detail & Related papers (2020-06-21T11:53:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.