Estimating Ising Models in Total Variation Distance
- URL: http://arxiv.org/abs/2511.21008v1
- Date: Wed, 26 Nov 2025 03:15:41 GMT
- Title: Estimating Ising Models in Total Variation Distance
- Authors: Constantinos Daskalakis, Vardis Kandiros, Rui Yao,
- Abstract summary: We consider the problem of estimating Ising models over $n$ variables in Total Variation (TV) distance, given $l$ independent samples from the model.<n>Our main contribution is a unified analysis of the Maximum Pseudo-Likelihood Estorimator (MPLE) for two general classes of Ising models.<n>Our results yield optimal or near-time algorithms and optimal or near-time sample complexity guarantees in a variety of settings.
- Score: 23.343281561400033
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We consider the problem of estimating Ising models over $n$ variables in Total Variation (TV) distance, given $l$ independent samples from the model. While the statistical complexity of the problem is well-understood [DMR20], identifying computationally and statistically efficient algorithms has been challenging. In particular, remarkable progress has occurred in several settings, such as when the underlying graph is a tree [DP21, BGPV21], when the entries of the interaction matrix follow a Gaussian distribution [GM24, CK24], or when the bulk of its eigenvalues lie in a small interval [AJK+24, KLV24], but no unified framework for polynomial-time estimation in TV exists so far. Our main contribution is a unified analysis of the Maximum Pseudo-Likelihood Estimator (MPLE) for two general classes of Ising models. The first class includes models that have bounded operator norm and satisfy the Modified Log-Sobolev Inequality (MLSI), a functional inequality that was introduced to study the convergence of the associated Glauber dynamics to stationarity. In the second class of models, the interaction matrix has bounded infinity norm (or bounded width), which is the most common assumption in the literature for structure learning of Ising models. We show how our general results for these classes yield polynomial-time algorithms and optimal or near-optimal sample complexity guarantees in a variety of settings. Our proofs employ a variety of tools from tensorization inequalities to measure decompositions and concentration bounds.
Related papers
- Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit.
Our focus is on the generalization error and variable selection properties of the estimators.
arXiv Detail & Related papers (2024-10-25T19:58:36Z) - Partial-Multivariate Model for Forecasting [28.120094495344453]
We propose a Transformer-based partial-multivariate model, PMformer, for forecasting problems.
We demonstrate that PMformer outperforms various univariate and complete-multivariate models.
We also highlight other advantages of PMformer: efficiency and robustness under missing features.
arXiv Detail & Related papers (2024-08-19T05:18:50Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - Sampling Approximately Low-Rank Ising Models: MCMC meets Variational
Methods [35.24886589614034]
We consider quadratic definite Ising models on the hypercube with a general interaction $J$.
Our general result implies the first time sampling algorithms for low-rank Ising models.
arXiv Detail & Related papers (2022-02-17T21:43:50Z) - Optimal regularizations for data generation with probabilistic graphical
models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of L 2 and L 1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
arXiv Detail & Related papers (2021-12-02T14:45:16Z) - Generalized Matrix Factorization: efficient algorithms for fitting
generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on interplay between the deterministic convergence rate of the algorithm at the population level, and its degree of (instability) when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - Learning Ising models from one or multiple samples [26.00403702328348]
We provide guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices.
Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak.
arXiv Detail & Related papers (2020-04-20T15:17:05Z) - Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.