Robust Online Covariance and Sparse Precision Estimation Under Arbitrary
Data Corruption
- URL: http://arxiv.org/abs/2309.08884v1
- Date: Sat, 16 Sep 2023 05:37:28 GMT
- Title: Robust Online Covariance and Sparse Precision Estimation Under Arbitrary
Data Corruption
- Authors: Tong Yao, Shreyas Sundaram
- Abstract summary: We introduce a modified trimmed-inner-product algorithm to robustly estimate the covariance in an online scenario.
We provide the error-bound and convergence properties of the estimates to the true precision matrix under our algorithms.
- Score: 1.5850859526672516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian graphical models are widely used to represent correlations among
entities but remain vulnerable to data corruption. In this work, we introduce a
modified trimmed-inner-product algorithm to robustly estimate the covariance in
an online scenario even in the presence of arbitrary and adversarial data
attacks. At each time step, data points, drawn nominally independently and
identically from a multivariate Gaussian distribution, arrive. However, a
certain fraction of these points may have been arbitrarily corrupted. We
propose an online algorithm to estimate the sparse inverse covariance (i.e.,
precision) matrix despite this corruption. We provide the error-bound and
convergence properties of the estimates to the true precision matrix under our
algorithms.
Related papers
- Optimal Robust Estimation under Local and Global Corruptions: Stronger Adversary and Smaller Error [10.266928164137635]
Algorithmic robust statistics has traditionally focused on the contamination model where a small fraction of the samples are arbitrarily corrupted.
We consider a recent contamination model that combines two kinds of corruptions: (i) small fraction of arbitrary outliers, as in classical robust statistics, and (ii) local perturbations, where samples may undergo bounded shifts on average.
We show that information theoretically optimal error can indeed be achieved in time, under an even emphstronger local perturbation model.
arXiv Detail & Related papers (2024-10-22T17:51:23Z) - A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent [57.64826450787237]
We show how to analyze the behavior of distributed gradient descent algorithms in the presence of adversarial corruptions.
We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm.
Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings.
arXiv Detail & Related papers (2024-07-19T08:29:12Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic
Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserve relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - On the Error Resistance of Hinge Loss Minimization [30.808062097285706]
We identify a set of conditions on the data under which surrogate loss minimization algorithms provably learn the correct classifier.
In particular, we show that if the data is linearly classifiable with a slightly non-trivial margin, then surrogate loss minimization has negligible error on the uncorrupted data.
arXiv Detail & Related papers (2020-12-02T06:49:24Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Matrix Completion with Quantified Uncertainty through Low Rank Gaussian
Copula [30.84155327760468]
This paper proposes a framework for missing value imputation with quantified uncertainty.
The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.
Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types.
arXiv Detail & Related papers (2020-06-18T19:51:42Z) - Fitting Laplacian Regularized Stratified Gaussian Models [0.0]
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data.
We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
arXiv Detail & Related papers (2020-05-04T18:00:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.