Imputation-Free Learning from Incomplete Observations
- URL: http://arxiv.org/abs/2107.01983v1
- Date: Mon, 5 Jul 2021 12:44:39 GMT
- Title: Imputation-Free Learning from Incomplete Observations
- Authors: Qitong Gao, Dong Wang, Joshua D. Amason, Siyang Yuan, Chenyang Tao,
Ricardo Henao, Majda Hadziahmetovic, Lawrence Carin, Miroslav Pajic
- Abstract summary: We introduce the importance guided stochastic gradient descent (IGSGD) method to train models to perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
- Score: 73.15386629370111
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although recent works have developed methods that can generate estimations
(or imputations) of the missing entries in a dataset to facilitate downstream
analysis, most depend on assumptions that may not align with real-world
applications and could suffer from poor performance in subsequent tasks. This
is particularly true if the data have large missingness rates or a small
population. More importantly, the imputation error could be propagated into the
prediction step that follows, causing the gradients used to train the
prediction models to be biased. Consequently, in this work, we introduce the
importance guided stochastic gradient descent (IGSGD) method to train
multilayer perceptrons (MLPs) and long short-term memories (LSTMs) to directly
perform inference from inputs containing missing values without imputation.
Specifically, we employ reinforcement learning (RL) to adjust the gradients
used to train the models via back-propagation. This not only reduces bias but
allows the model to exploit the underlying information behind missingness
patterns. We test the proposed approach on real-world time-series (i.e.,
MIMIC-III), tabular data obtained from an eye clinic, and a standard dataset
(i.e., MNIST), where our imputation-free predictions outperform the traditional
two-step imputation-based predictions using state-of-the-art imputation
methods.
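The key mechanism is to let an RL policy re-weight the gradients that back-propagation applies to the prediction model, so that samples whose missingness patterns yield unreliable gradients contribute less. Below is a minimal sketch of that idea in PyTorch; the Bernoulli per-sample actions, the held-out-loss reward, and the REINFORCE update are illustrative assumptions for this sketch, not the exact IGSGD procedure described in the paper.
```python
# Minimal sketch (not the authors' code): an RL policy re-weights the
# per-sample gradients of a predictor trained on data with missing values.
# Assumptions for illustration: Bernoulli keep/drop actions per sample,
# a held-out-loss reward, and REINFORCE updates of the policy.
import torch
import torch.nn as nn


class MaskedMLP(nn.Module):
    """Predictor fed zero-filled features concatenated with the missingness mask."""

    def __init__(self, d_in, d_hidden, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * d_in, d_hidden), nn.ReLU(), nn.Linear(d_hidden, n_classes)
        )

    def forward(self, x, mask):
        return self.net(torch.cat([x * mask, mask], dim=-1))


class ImportancePolicy(nn.Module):
    """Policy giving each sample a probability of contributing its gradient."""

    def __init__(self, d_in, d_hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * d_in, d_hidden), nn.ReLU(), nn.Linear(d_hidden, 1)
        )

    def forward(self, x, mask):
        return torch.sigmoid(self.net(torch.cat([x * mask, mask], dim=-1))).squeeze(-1)


def igsgd_like_step(model, policy, opt_model, opt_policy,
                    x, mask, y, x_val, mask_val, y_val):
    loss_fn = nn.CrossEntropyLoss(reduction="none")

    # 1) Sample per-sample importance weights (the RL "actions").
    probs = policy(x, mask)
    actions = torch.bernoulli(probs.detach())  # 1 = keep this sample's gradient

    # 2) Update the predictor with importance-weighted per-sample losses,
    #    i.e. re-weight the gradients used in back-propagation.
    per_sample_loss = loss_fn(model(x, mask), y)
    opt_model.zero_grad()
    (actions * per_sample_loss).mean().backward()
    opt_model.step()

    # 3) Reward: negative held-out loss of the freshly updated predictor.
    with torch.no_grad():
        reward = -loss_fn(model(x_val, mask_val), y_val).mean()

    # 4) REINFORCE update of the policy toward weightings that earn reward.
    log_prob = actions * torch.log(probs + 1e-8) + (1 - actions) * torch.log(1 - probs + 1e-8)
    opt_policy.zero_grad()
    (-(reward * log_prob.mean())).backward()
    opt_policy.step()
```
Here `mask` is 1 where a feature is observed and 0 where it is missing. The held-out-loss reward is one simple choice of training signal; any scalar measure of downstream prediction quality could play that role, and the same per-sample weighting applies unchanged to an LSTM over time series.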
Related papers
- Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
- Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning [28.059563581973432]
Large Language Models (LLMs) are often pre-trained on sensitive, private, or copyrighted data.
LLM unlearning aims to eliminate the influence of such undesirable data from the pre-trained model.
We propose Negative Preference Optimization (NPO) as a simple alignment-inspired method that could efficiently unlearn a target dataset.
arXiv Detail & Related papers (2024-04-08T21:05:42Z)
- Random features models: a way to study the success of naive imputation [0.0]
Constant (naive) imputation is still widely used in practice, as it is an easy-to-use first technique for dealing with missing data.
Recent works suggest that the bias it induces is low in the context of high-dimensional linear predictors.
This paper confirms the intuition that this bias is negligible and that, surprisingly, naive imputation also remains relevant in very low dimension.
arXiv Detail & Related papers (2024-02-06T09:37:06Z)
- PROMISSING: Pruning Missing Values in Neural Networks [0.0]
We propose PROMISSING, a simple and intuitive yet effective method for pruning missing values during learning and inference in neural networks.
Our experiments show that PROMISSING yields prediction performance similar to that of various imputation techniques.
arXiv Detail & Related papers (2022-06-03T15:37:27Z)
- MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models [78.72682320019737]
We develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations.
MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization framework.
We demonstrate the flexibility of MissDAG for incorporating various causal discovery algorithms and its efficacy through extensive simulations and real data experiments.
arXiv Detail & Related papers (2022-05-27T09:59:46Z)
- On the Implicit Bias of Gradient Descent for Temporal Extrapolation [32.93066466540839]
Common practice when using recurrent neural networks (RNNs) is to apply a model to sequences longer than those seen in training.
We show that even with infinite training data, there exist RNN models that interpolate perfectly yet fail to extrapolate to longer sequences.
We then show that if gradient descent is used for training, learning will converge to perfect extrapolation.
arXiv Detail & Related papers (2022-02-09T06:28:37Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Time-Series Imputation with Wasserstein Interpolation for Optimal Look-Ahead-Bias and Variance Tradeoff [66.59869239999459]
In finance, imputation of missing returns may be applied prior to training a portfolio optimization model.
There is an inherent trade-off between the look-ahead-bias of using the full data set for imputation and the larger variance in the imputation from using only the training data.
We propose a Bayesian posterior consensus distribution which optimally controls the variance and look-ahead-bias trade-off in the imputation.
arXiv Detail & Related papers (2021-02-25T09:05:35Z)