Minimax rate of consistency for linear models with missing values
- URL: http://arxiv.org/abs/2202.01463v1
- Date: Thu, 3 Feb 2022 08:45:34 GMT
- Title: Minimax rate of consistency for linear models with missing values
- Authors: Alexis Ayme (LPSM, UMR 8001), Claire Boyer (LPSM, UMR 8001;
MOKAPLAN), Aymeric Dieuleveut (CMAP), Erwan Scornet (CMAP)
- Abstract summary: Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys, ...).
In this paper, we focus on the extensively studied linear models, but in the presence of missing values, which turns out to be quite a challenging task.
This eventually requires solving a number of learning tasks, exponential in the number of input features, which makes predictions impossible for current real-world datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing values arise in most real-world data sets due to the aggregation of
multiple sources and intrinsically missing information (sensor failure,
unanswered questions in surveys, ...). In fact, the very nature of missing values
usually prevents us from running standard learning algorithms. In this paper,
we focus on the extensively studied linear models, but in the presence of missing
values, which turns out to be quite a challenging task. Indeed, the Bayes rule
can be decomposed as a sum of predictors corresponding to each missing pattern.
This eventually requires solving a number of learning tasks, exponential in
the number of input features, which makes predictions impossible for current
real-world datasets. First, we propose a rigorous setting to analyze a
least-squares-type estimator and establish a bound on the excess risk which
increases exponentially in the dimension. Consequently, we leverage the missing
data distribution to propose a new algorithm, and derive associated adaptive
risk bounds that turn out to be minimax optimal. Numerical experiments
highlight the benefits of our method compared to state-of-the-art algorithms
used for predictions with missing values.
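Spelled out, the pattern-wise decomposition mentioned in the abstract takes the following form (the notation below is assumed for illustration, not quoted from the paper):

```latex
% Bayes predictor decomposed over missing patterns m \in \{0,1\}^d;
% obs(m) indexes the coordinates observed under pattern m.
f^\star\big(X_{\mathrm{obs}(M)}, M\big)
  = \sum_{m \in \{0,1\}^d}
      \mathbb{E}\big[\, Y \mid X_{\mathrm{obs}(m)},\, M = m \,\big]\,
      \mathbb{1}\{M = m\}
```

Since the sum ranges over up to $2^d$ patterns, a naive pattern-by-pattern least-squares fit requires exponentially many submodels; this is the blow-up that the paper's adaptive algorithm, which exploits the missing data distribution, is designed to avoid.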
Related papers
- Probabilistic Imputation for Time-series Classification with Missing Data [17.956329906475084]
We propose a novel framework for classification of time series data with missing values.
Our deep generative model part is trained to impute the missing values in multiple plausible ways.
The classifier part takes the time series data along with the imputed missing values and classifies signals.
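As a rough illustration of predicting by averaging over multiple plausible imputations (a minimal sketch: the Gaussian sampler and toy classifier below are hypothetical stand-ins for the paper's trained deep generative imputer and classifier):

```python
# Minimal sketch of prediction with multiple plausible imputations.
import numpy as np

rng = np.random.default_rng(0)

def sample_imputation(x, mask, rng):
    """Fill missing entries (mask == True) with a plausible random draw.
    Stand-in for a learned generative imputer."""
    x = x.copy()
    x[mask] = rng.normal(loc=0.0, scale=1.0, size=mask.sum())
    return x

def predict_proba(x):
    """Hypothetical trained classifier returning class probabilities."""
    logit = x.sum()                      # toy decision function
    p1 = 1.0 / (1.0 + np.exp(-logit))
    return np.array([1.0 - p1, p1])

x = np.array([0.5, np.nan, -1.2, np.nan])
mask = np.isnan(x)

# Average class probabilities over K plausible imputations.
K = 10
probs = np.mean([predict_proba(sample_imputation(x, mask, rng))
                 for _ in range(K)], axis=0)
print(probs.argmax(), probs)
```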
arXiv Detail & Related papers (2023-08-13T10:04:13Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Greedy structure learning from data that contains systematic missing values [13.088541054366527]
Learning from data that contain missing values represents a common phenomenon in many domains.
Relatively few Bayesian Network structure learning algorithms account for missing data.
This paper describes three variants of greedy search structure learning that utilise pairwise deletion and inverse probability weighting.
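A minimal numerical sketch of inverse probability weighting (IPW) for a pairwise statistic under systematic missingness (illustrative only; the paper applies the idea inside greedy Bayesian-network structure search, which is not shown here):

```python
# IPW corrects pairwise-deletion bias when missingness depends on
# observed variables and the observation probabilities are known or estimated.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)

# Systematic missingness: y is more likely observed when x is large.
p_obs = 1.0 / (1.0 + np.exp(-x))      # assumed observation model
observed = rng.random(n) < p_obs

# Pairwise deletion alone is biased under this mechanism;
# weighting each complete pair by 1 / P(observed) corrects it.
w = 1.0 / p_obs[observed]
xw, yw = x[observed], y[observed]
mx = np.average(xw, weights=w)
my = np.average(yw, weights=w)
cov_ipw = np.average((xw - mx) * (yw - my), weights=w)
print(cov_ipw)   # close to the complete-data covariance (0.8)
```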
arXiv Detail & Related papers (2021-07-09T02:56:44Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance-Guided Stochastic Gradient Descent (IGSGD) method to train models that infer directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
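For reference, given class priors and exact class-conditional densities (which normalizing flows can provide), the Bayes error is the standard quantity below; the notation is assumed here rather than taken from the paper:

```latex
% Bayes error with exact class-conditional densities p_y(x)
% and class priors \pi_y:
\mathrm{err}_{\mathrm{Bayes}}
  = \mathbb{E}_{x}\Big[\, 1 - \max_{y} \, p(y \mid x) \Big],
\qquad
p(y \mid x) = \frac{\pi_y \, p_y(x)}{\sum_{y'} \pi_{y'} \, p_{y'}(x)}
```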
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- NeuMiss networks: differentiable programming for supervised learning with missing values [0.0]
We derive the analytical form of the optimal predictor under a linearity assumption.
We propose a new principled architecture, named NeuMiss networks.
They have good predictive accuracy with both a number of parameters and a computational complexity independent of the number of missing data patterns.
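NeuMiss takes its name from Neumann series; a minimal numerical illustration of truncated Neumann iterations approximating a matrix inverse (a toy under a spectral-radius assumption, not the authors' architecture) is:

```python
# Neumann-series approximation of A^{-1} b: if the spectral radius of
# (I - A) is below 1, then A^{-1} b = sum_k (I - A)^k b, truncated here.
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = np.eye(d) + 0.1 * rng.normal(size=(d, d))   # well-conditioned, near identity
b = rng.normal(size=d)

x = b.copy()                 # k = 0 term
term = b.copy()
for _ in range(20):          # truncation depth ~ network depth
    term = term - A @ term   # next term: (I - A) @ term
    x = x + term

print(np.allclose(x, np.linalg.solve(A, b), atol=1e-6))
```

The truncation depth plays the role of the number of layers, which is one way a parameter count can stay independent of the number of missing-data patterns.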
arXiv Detail & Related papers (2020-07-03T11:42:25Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula [30.84155327760468]
This paper proposes a framework for missing value imputation with quantified uncertainty.
The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.
Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types.
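In the latent Gaussian space of such copula models, imputation with quantified uncertainty comes down to the standard conditional-Gaussian formula (a known result; the notation is ours, not the paper's):

```latex
% Conditional law of missing coordinates given observed ones
% for z \sim \mathcal{N}(\mu, \Sigma), partitioned into (mis, obs):
z_{\mathrm{mis}} \mid z_{\mathrm{obs}}
 \sim \mathcal{N}\!\Big(
   \mu_{\mathrm{mis}}
   + \Sigma_{\mathrm{mis},\mathrm{obs}} \Sigma_{\mathrm{obs},\mathrm{obs}}^{-1}
     (z_{\mathrm{obs}} - \mu_{\mathrm{obs}}),\;
   \Sigma_{\mathrm{mis},\mathrm{mis}}
   - \Sigma_{\mathrm{mis},\mathrm{obs}} \Sigma_{\mathrm{obs},\mathrm{obs}}^{-1}
     \Sigma_{\mathrm{obs},\mathrm{mis}}
 \Big)
```

The conditional covariance term is what quantifies the imputation uncertainty.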
arXiv Detail & Related papers (2020-06-18T19:51:42Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- On the consistency of supervised learning with missing values [15.666860186278782]
In many application settings, the data have missing entries which make analysis challenging.
Here, we consider supervised-learning settings: predicting a target when missing values appear in both training and testing data.
We show that the widely-used method of imputing with a constant (such as the mean) prior to learning is consistent when missing values are not informative.
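A minimal sketch of that baseline, constant (mean) imputation before learning, under non-informative (MCAR) missingness; the data and names here are illustrative:

```python
# Mean imputation followed by linear regression.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n, d = 1000, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

# MCAR missingness: entries are dropped independently of everything.
X_miss = X.copy()
X_miss[rng.random((n, d)) < 0.2] = np.nan

# Impute with the training means, then fit; the same training means
# must be reused on test data.
model = make_pipeline(SimpleImputer(strategy="mean"), LinearRegression())
model.fit(X_miss, y)
print(model.predict(X_miss[:3]))
```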
arXiv Detail & Related papers (2019-02-19T07:27:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.