Related papers: Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality

Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality

URL: http://arxiv.org/abs/2507.16953v1
Date: Tue, 22 Jul 2025 18:50:02 GMT
Title: Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality
Authors: Mohammad Reza Rahmani, Mohammad Hossein Yassaee, Mohammad Reza Aref,
Abstract summary: We study a scenario in which multiple agents each observe different components of i.i.d. samples drawn from a sub-Gaussian random vector.<n>A central server seeks to estimate the complete covariance matrix using a limited number of bits communicated by each agent.<n>Our main technical tool is a novel generalization of the strong data processing inequality (SDPI), termed the Conditional Strong Data Processing Inequality (C-SDPI) coefficient.
Score: 5.134222224728979
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Estimating high-dimensional covariance matrices is a key task across many fields. This paper explores the theoretical limits of distributed covariance estimation in a feature-split setting, where communication between agents is constrained. Specifically, we study a scenario in which multiple agents each observe different components of i.i.d. samples drawn from a sub-Gaussian random vector. A central server seeks to estimate the complete covariance matrix using a limited number of bits communicated by each agent. We obtain a nearly tight minimax lower bound for covariance matrix estimation under operator norm and Frobenius norm. Our main technical tool is a novel generalization of the strong data processing inequality (SDPI), termed the Conditional Strong Data Processing Inequality (C-SDPI) coefficient, introduced in this work. The C-SDPI coefficient shares key properties such as tensorization with the conventional SDPI. Crucially, it quantifies the average contraction in a state-dependent channel and can be significantly lower than the worst-case SDPI coefficient over the state input. Utilizing the doubling trick of Geng-Nair and an operator Jensen inequality, we compute this coefficient for Gaussian mixture channels. We then employ it to establish minimax lower bounds on estimation error, capturing the trade-offs among sample size, communication cost, and data dimensionality. Building on this, we present a nearly optimal estimation protocol whose sample and communication requirements match the lower bounds up to logarithmic factors. Unlike much of the existing literature, our framework does not assume infinite samples or Gaussian distributions, making it broadly applicable. Finally, we extend our analysis to interactive protocols, showing interaction can significantly reduce communication requirements compared to non-interactive schemes.

Related papers

Efficient Covariance Estimation for Sparsified Functional Data [51.69796254617083]
proposed Random-knots (Random-knots-Spatial) and B-spline (Bspline-Spatial) estimators of the covariance function are computationally efficient.<n>Asymptotic pointwise of the covariance are obtained for sparsified individual trajectories under some regularity conditions.
arXiv Detail & Related papers (2025-11-23T00:50:33Z)
Semiparametric conformal prediction [79.6147286161434]
We construct a conformal prediction set accounting for the joint correlation structure of the vector-valued non-conformity scores.<n>We flexibly estimate the joint cumulative distribution function (CDF) of the scores.<n>Our method yields desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
Generalization error of min-norm interpolators in transfer learning [2.7309692684728617]
Min-norm interpolators emerge naturally as implicit regularized limits of modern machine learning algorithms. In many applications, a limited amount of test data may be available during training, yet properties of min-norm in this setting are not well-understood. We establish a novel anisotropic local law to achieve these characterizations.
arXiv Detail & Related papers (2024-06-20T02:23:28Z)
Optimal Differentially Private PCA and Estimation for Spiked Covariance Matrices [10.377683220196873]
Estimating a covariance matrix and its associated principal components is a fundamental problem in contemporary statistics. We study optimal differentially private Principal Component Analysis (PCA) and covariance estimation within the spiked covariance model. We propose computationally efficient differentially private estimators and prove their minimax optimality for sub-Gaussian distributions.
arXiv Detail & Related papers (2024-01-08T11:18:14Z)
A Parameter-Free Two-Bit Covariance Estimator with Improved Operator Norm Error Rate [23.116373524718988]
We propose a new 2-bit covariance matrix estimator that simultaneously addresses both issues. By employing dithering scales varying across entries, our estimator enjoys an improved operator norm error rate. Our proposed method eliminates the need of any tuning parameter, as the dithering scales are entirely determined by the data.
arXiv Detail & Related papers (2023-08-30T14:31:24Z)
Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture. It can model the feature space more comprehensively and reduce the dominance of head classes. The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
Distributed Nonparametric Estimation under Communication Constraints [0.0]
We provide a general framework for understanding the behavior of distributed estimation under communication constraints. We derive minimax lower and matching upper bounds in the distributed regression, density estimation, classification, Poisson regression and volatility estimation models.
arXiv Detail & Related papers (2022-04-21T19:04:50Z)
A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms. We develop a general framework from the perspective of Bregman minimization divergence. We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes. An $varepsilon$-solution is achieved in $tildemathcalrhoObig(sqrtfracbeta/mu (1-)log1/varepsilonbig)$ number of communications steps. This rate matches (up to poly-log factors) for the first time lower complexity communication bounds of distributed gossip-algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z)
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserve relative distances between data instances in the feature spaces of each layer. With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices. We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores) For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training. We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
New Bounds For Distributed Mean Estimation and Variance Reduction [25.815612182815702]
We consider the problem of distributed mean estimation (DME) in which $n$ machines are each given a local $d$-dimensional vector $x_v in mathbbRd$. We show that our method yields practical improvements for common applications, relative to prior approaches.
arXiv Detail & Related papers (2020-02-21T13:27:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.