Exploiting Observation Bias to Improve Matrix Completion
- URL: http://arxiv.org/abs/2306.04775v2
- Date: Mon, 5 Feb 2024 00:25:10 GMT
- Title: Exploiting Observation Bias to Improve Matrix Completion
- Authors: Yassir Jedra, Sean Mann, Charlotte Park, Devavrat Shah
- Abstract summary: We consider a variant of matrix completion where entries are revealed in a biased manner.
The goal is to exploit the shared information between the bias and the outcome of interest to improve predictions.
We find that with this two-stage algorithm, the estimates have 30x smaller mean squared error compared to traditional matrix completion methods.
- Score: 16.57405742112833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a variant of matrix completion where entries are revealed in a
biased manner, adopting a model akin to that introduced by Ma and Chen. Instead
of treating this observation bias as a disadvantage, as is typically the case,
the goal is to exploit the shared information between the bias and the outcome
of interest to improve predictions. Towards this, we consider a natural model
where the observation pattern and outcome of interest are driven by the same
set of underlying latent or unobserved factors. This leads to a two-stage
matrix completion algorithm: first, recover (distances between) the latent
factors by utilizing matrix completion for the fully observed noisy binary
matrix corresponding to the observation pattern; second, utilize the recovered
latent factors as features and sparsely observed noisy outcomes as labels to
perform non-parametric supervised learning. The finite-sample error-rate analysis
suggests that, ignoring logarithmic factors, this approach is competitive with the
corresponding parametric supervised-learning rates. In other words, by exploiting
the shared information between the bias and the outcomes, the two-stage method
performs comparably to having direct access to the unobserved latent factors.
Through empirical evaluation using a
real-world dataset, we find that with this two-stage algorithm, the estimates
have 30x smaller mean squared error compared to traditional matrix completion
methods, suggesting the utility of the model and the method proposed in this
work.
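To make the two-stage procedure described above concrete, here is a minimal, self-contained sketch in Python (NumPy only). It assumes a truncated SVD as the spectral step for Stage 1 (recovering latent-factor estimates from the fully observed binary observation pattern) and k-nearest-neighbour averaging as the non-parametric learner for Stage 2. The function names, the rank `r`, the neighbourhood size `k`, and the synthetic data are illustrative assumptions, not the authors' implementation or their experimental setup.

```python
# Minimal sketch of the two-stage idea from the abstract (not the reference code).
# Stage 1: denoise the fully observed binary observation-pattern matrix A with a
#          truncated SVD to estimate row latent factors.
# Stage 2: treat those factor estimates as features and run a simple k-nearest-
#          neighbour regression on the sparsely observed outcomes.
import numpy as np


def stage1_latent_factors(A: np.ndarray, r: int) -> np.ndarray:
    """Spectral (rank-r SVD) estimate of row latent factors from the binary pattern A."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :r] * np.sqrt(s[:r])          # n x r latent-factor estimates


def stage2_knn_predict(Y: np.ndarray, mask: np.ndarray,
                       factors: np.ndarray, k: int) -> np.ndarray:
    """Predict every entry of Y column-by-column by averaging the k nearest observed rows."""
    n, m = Y.shape
    Y_hat = np.zeros_like(Y, dtype=float)
    # Pairwise distances between rows in the estimated latent space.
    d = np.linalg.norm(factors[:, None, :] - factors[None, :, :], axis=-1)
    for j in range(m):
        obs = np.flatnonzero(mask[:, j])      # rows with an observed outcome in column j
        if obs.size == 0:
            continue
        for i in range(n):
            nn = obs[np.argsort(d[i, obs])[:k]]
            Y_hat[i, j] = Y[nn, j].mean()     # average the k nearest observed outcomes
    return Y_hat


# Toy usage with synthetic data (purely illustrative; rank and noise level are arbitrary).
rng = np.random.default_rng(0)
n, m, r = 200, 50, 3
latent = rng.normal(size=(n, r))
cols = rng.normal(size=(m, r))
signal = latent @ cols.T
prob = 1.0 / (1.0 + np.exp(-signal))          # observation bias shares the latent factors
A = rng.binomial(1, prob)                     # fully observed binary observation pattern
mask = A.astype(bool)
Y = signal + 0.1 * rng.normal(size=(n, m))    # noisy outcomes, seen only where mask is True

factors_hat = stage1_latent_factors(A, r)
Y_hat = stage2_knn_predict(Y * mask, mask, factors_hat, k=10)
mse = np.mean((Y_hat[~mask] - signal[~mask]) ** 2)
print(f"held-out MSE on unobserved entries: {mse:.4f}")
```

The design point the sketch mirrors is that Stage 1 never touches the outcomes: the latent geometry is estimated purely from the fully observed binary pattern, and the sparsely observed outcomes enter only as labels in Stage 2.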
Related papers
- MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts [25.643876327918544]
Current logit-based methods are vulnerable to overconfidence issues, leading to prediction bias, especially under natural shift.
We propose MaNo, which applies a data-dependent normalization on the logits to reduce prediction bias, and takes the $L_p$ norm of the matrix of normalized logits as the estimation score.
MaNo achieves state-of-the-art performance across various architectures in the presence of synthetic, natural, or subpopulation shifts.
arXiv Detail & Related papers (2024-05-29T10:45:06Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- A Generalized Latent Factor Model Approach to Mixed-data Matrix Completion with Entrywise Consistency [3.299672391663527]
Matrix completion is a class of machine learning methods that concerns the prediction of missing entries in a partially observed matrix.
We formulate it as a low-rank matrix estimation problem under a general family of non-linear factor models.
We propose entrywise consistent estimators for estimating the low-rank matrix.
arXiv Detail & Related papers (2022-11-17T00:24:47Z)
- Instance-Dependent Label-Noise Learning with Manifold-Regularized Transition Matrix Estimation [172.81824511381984]
The transition matrix T(x) is unidentifiable under instance-dependent noise (IDN).
We propose an assumption on the geometry of T(x): the closer two instances are, the more similar their corresponding transition matrices should be.
Our method is superior to state-of-the-art approaches for label-noise learning under the challenging IDN.
arXiv Detail & Related papers (2022-06-06T04:12:01Z)
- Coordinated Double Machine Learning [8.808993671472349]
This paper argues that a carefully coordinated learning algorithm for deep neural networks may reduce the estimation bias.
The improved empirical performance of the proposed method is demonstrated through numerical experiments on both simulated and real data.
arXiv Detail & Related papers (2022-06-02T05:56:21Z)
- Generalization bounds and algorithms for estimating conditional average treatment effect of dosage [13.867315751451494]
We investigate the task of estimating the conditional average causal effect of treatment-dosage pairs from a combination of observational data and assumptions on the causal relationships in the underlying system.
This has been a longstanding challenge in fields such as epidemiology and economics, where decisions require choosing a treatment-dosage pair.
We show empirically new state-of-the-art performance results across several benchmark datasets for this problem.
arXiv Detail & Related papers (2022-05-29T15:26:59Z)
- Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z)
- Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As the results show, the proposed strategies outperform classification based on the observed data alone and maintain high accuracy even as the missing-data ratio increases.
arXiv Detail & Related papers (2021-10-19T14:24:50Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Estimating Treatment Effects with Observed Confounders and Mediators [25.338901482522648]
Given a causal graph, the do-calculus can express treatment effects as functionals of the observational joint distribution that can be estimated empirically.
Sometimes the do-calculus identifies multiple valid formulae, prompting us to compare the statistical properties of the corresponding estimators.
In this paper, we investigate the over-identified scenario where both confounders and mediators are observed, rendering both estimators valid.
arXiv Detail & Related papers (2020-03-26T15:50:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.