Learning Ising models from one or multiple samples
- URL: http://arxiv.org/abs/2004.09370v3
- Date: Thu, 10 Dec 2020 16:27:23 GMT
- Title: Learning Ising models from one or multiple samples
- Authors: Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Anthimos Vardis Kandiros
- Abstract summary: We provide guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices.
Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak.
- Score: 26.00403702328348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There have been two separate lines of work on estimating Ising models: (1)
estimating them from multiple independent samples under minimal assumptions
about the model's interaction matrix; and (2) estimating them from one sample
in restrictive settings. We propose a unified framework that smoothly
interpolates between these two settings, enabling significantly richer
estimation guarantees from one, a few, or many samples.
Our main theorem provides guarantees for one-sample estimation, quantifying
the estimation error in terms of the metric entropy of a family of interaction
matrices. As corollaries of our main theorem, we derive bounds when the model's
interaction matrix is a (sparse) linear combination of known matrices, or it
belongs to a finite set, or to a high-dimensional manifold. In fact, our main
result handles multiple independent samples by viewing them as one sample from
a larger model, and can be used to derive estimation bounds that are
qualitatively similar to those obtained in the aforementioned multiple-sample
literature. Our technical approach benefits from sparsifying a model's
interaction network, conditioning on subsets of variables that make the
dependencies in the resulting conditional distribution sufficiently weak. We
use this sparsification technique to prove strong concentration and
anti-concentration results for the Ising model, which we believe have
applications beyond the scope of this paper.
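The one-sample estimation literature this abstract builds on centers on pseudo-likelihood-type estimators. Below is a minimal sketch, not the paper's exact estimator or guarantees, for the simplest setting covered by the corollaries: the interaction matrix is beta*A with a known base matrix A, and a single scalar beta is estimated from one sample by maximum pseudo-likelihood (MPLE). The cycle graph, Gibbs sampler, and beta value are toy assumptions.

```python
# A minimal MPLE sketch, assuming J = beta * A with A known and zero external
# field; not the paper's exact estimator or its guarantees.
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_pseudolikelihood(beta, x, A):
    """-sum_i log P(x_i | x_{-i}), with P(x_i = s | rest) =
    exp(s * beta * m_i) / (2 cosh(beta * m_i)) and m_i = sum_j A[i, j] * x_j."""
    m = A @ x
    return -np.sum(beta * x * m - np.log(2.0 * np.cosh(beta * m)))

def mple_one_sample(x, A, beta_max=5.0):
    """Estimate beta from a single configuration x in {-1,+1}^n."""
    res = minimize_scalar(neg_log_pseudolikelihood, args=(x, A),
                          bounds=(-beta_max, beta_max), method="bounded")
    return res.x

# Toy usage: one (approximate) sample from a cycle graph at beta = 0.3 via Gibbs.
rng = np.random.default_rng(0)
n = 200
A = np.zeros((n, n))
for i in range(n):                       # nearest-neighbour cycle
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
beta_true = 0.3
x = rng.choice([-1.0, 1.0], size=n)
for _ in range(200 * n):                 # long Gibbs run to approach stationarity
    i = rng.integers(n)
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta_true * (A[i] @ x)))
    x[i] = 1.0 if rng.random() < p_plus else -1.0
print("estimated beta from one sample:", mple_one_sample(x, A))
```

With a few hundred spins the one-sample estimate is noisy but concentrates around the true beta as n grows, which is the regime the paper's one-sample guarantees address.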
Related papers
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Bridging the inference gap in Multimodal Variational Autoencoders [6.246098300155483]
Multimodal Variational Autoencoders offer versatile and scalable methods for generating unobserved modalities from observed ones.
Recent models using mixtures-of-experts aggregation suffer from theoretically grounded limitations that restrict their generation quality on complex datasets.
We propose a novel interpretable model able to learn both joint and conditional distributions without introducing mixture aggregation.
arXiv Detail & Related papers (2025-02-06T10:43:55Z)
- Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution [2.1146241717926664]
We show that the Wasserstein GAN, constrained to left-invertible push-forward maps, generates distributions that avoid replication and significantly deviate from the empirical distribution.
Our most important contribution provides a finite-sample lower bound on the Wasserstein-1 distance between the generative distribution and the empirical one.
We also establish a finite-sample upper bound on the distance between the generative distribution and the true data-generating one.
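A quick one-dimensional illustration of the quantity these bounds concern, the Wasserstein-1 distance between a generator's output and the empirical distribution: a memorizing generator sits at nearly zero distance, which is the behavior the lower bound is designed to rule out. The Gaussian toy data and scipy helper are assumptions of this sketch, not part of the paper.

```python
# A 1-D toy, not the paper's setting: W1 distance from the empirical
# distribution for a memorizing generator versus a deviating one.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
data = rng.normal(0.0, 1.0, size=1000)               # the "empirical" sample
memorizer = rng.choice(data, size=1000)              # generator that replicates data points
deviating = data + rng.normal(0.0, 0.3, size=1000)   # generator pushed away from the data

print("W1(memorizer, data):", wasserstein_distance(memorizer, data))  # close to 0
print("W1(deviating, data):", wasserstein_distance(deviating, data))  # bounded away from 0
```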
arXiv Detail & Related papers (2023-07-31T06:11:57Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
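One common representation-space distance in this evaluation literature is the Fréchet (FID-style) distance between Gaussians fitted to two feature sets. The sketch below computes it with random features standing in for a real feature extractor; it illustrates the distance calculation only, not the paper's specific protocol.

```python
# Frechet (FID-style) distance between Gaussians fitted to two feature sets.
# Random features stand in for a real extractor; illustration only.
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """||mu_a - mu_b||^2 + Tr(Sa + Sb - 2 (Sa Sb)^{1/2})."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):          # discard tiny imaginary numerical noise
        covmean = covmean.real
    return float(np.sum((mu_a - mu_b) ** 2)
                 + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(2)
real_feats = rng.normal(size=(500, 16))            # stand-in "real" features
fake_feats = rng.normal(loc=0.5, size=(500, 16))   # stand-in "generated" features
print("Frechet distance:", frechet_distance(real_feats, fake_feats))
```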
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Bayesian predictive modeling of multi-source multi-way data [0.0]
We consider molecular data from multiple 'omics sources as predictors of early-life iron deficiency (ID) in a rhesus monkey model.
We use a linear model with a low-rank structure on the coefficients to capture multi-way dependence.
We show that our model performs as expected in terms of misclassification rates and correlation of estimated coefficients with true coefficients.
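A hedged sketch of the core modeling idea, not the paper's Bayesian formulation: a linear model whose coefficient array over two "ways" is constrained to rank one, B = u v^T, fit here by alternating least squares on simulated data. The shapes, the rank, and the fitting method are illustrative assumptions.

```python
# Rank-1 coefficient structure B = u v^T over two "ways", fit by alternating
# least squares on simulated data; not the paper's Bayesian model.
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 300, 8, 6
X = rng.normal(size=(n, p, q))                     # one p-by-q predictor array per case
u_true, v_true = rng.normal(size=p), rng.normal(size=q)
y = np.einsum("ipq,p,q->i", X, u_true, v_true) + 0.1 * rng.normal(size=n)

u, v = rng.normal(size=p), rng.normal(size=q)
for _ in range(50):
    Zu = np.einsum("ipq,q->ip", X, v)              # with v fixed, y is linear in u
    u = np.linalg.lstsq(Zu, y, rcond=None)[0]
    Zv = np.einsum("ipq,p->iq", X, u)              # with u fixed, y is linear in v
    v = np.linalg.lstsq(Zv, y, rcond=None)[0]

B_hat, B_true = np.outer(u, v), np.outer(u_true, v_true)
print("corr(estimated, true coefficients):",
      np.corrcoef(B_hat.ravel(), B_true.ravel())[0, 1])
```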
arXiv Detail & Related papers (2022-08-05T21:58:23Z)
- CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator [60.799183326613395]
We propose an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples.
CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling.
We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find it to outperform competing methods including a strong self-control baseline.
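CARMS's copula-based antithetic construction is beyond a short sketch, but the estimator family it improves on is easy to show: a multi-sample REINFORCE gradient for a categorical variable with a leave-one-out baseline. The sketch below uses independent samples rather than CARMS's negatively correlated ones; the logits, rewards, and sample count are toy assumptions.

```python
# Multi-sample REINFORCE for a categorical with a leave-one-out baseline.
# Independent samples stand in for CARMS's antithetic ones; toy throughout.
import numpy as np

rng = np.random.default_rng(4)
theta = np.array([0.2, -0.1, 0.4])               # logits of a 3-way categorical
f = np.array([1.0, 3.0, -2.0])                   # arbitrary per-category reward

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reinforce_grad(theta, K=8):
    """Estimate d/dtheta E_{z ~ Cat(softmax(theta))}[f(z)] from K samples."""
    p = softmax(theta)
    z = rng.choice(len(p), size=K, p=p)
    fz = f[z]
    loo = (fz.sum() - fz) / (K - 1)              # leave-one-out baseline per sample
    grad = np.zeros_like(theta)
    for k in range(K):
        score = -p.copy()
        score[z[k]] += 1.0                       # grad of log p(z_k) w.r.t. the logits
        grad += (fz[k] - loo[k]) * score
    return grad / K

p = softmax(theta)
exact = p * (f - p @ f)                          # closed-form gradient for comparison
est = np.mean([reinforce_grad(theta) for _ in range(2000)], axis=0)
print("estimate:", est)
print("exact:   ", exact)
```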
arXiv Detail & Related papers (2021-10-26T20:14:30Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
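A hedged one-dimensional illustration of the two-step recipe, not of PSD models themselves: fit a nonnegative surrogate density from a few function evaluations (a squared interpolant loosely stands in for a PSD model's squared form), then sample from it via the discretized inverse CDF. The target density and grid sizes are toy assumptions.

```python
# Two-step recipe in 1-D: (1) fit a nonnegative surrogate density from a few
# evaluations, then (2) sample via the discretized inverse CDF. Toy throughout.
import numpy as np

def target(x):                                   # unnormalized density to sample from
    return np.exp(-x ** 2) * (1.2 + np.sin(3.0 * x))

xs = np.linspace(-3.0, 3.0, 40)                  # "few evaluations" of the target
root_vals = np.sqrt(target(xs))                  # target is strictly positive here
grid = np.linspace(-3.0, 3.0, 4000)
density = np.interp(grid, xs, root_vals) ** 2    # nonnegative by construction

cdf = np.cumsum(density)                         # discretized inverse-CDF sampling
cdf /= cdf[-1]
rng = np.random.default_rng(5)
samples = np.interp(rng.random(10000), cdf, grid)
print("sample mean/std:", samples.mean(), samples.std())
```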
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
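A toy version of the training objective, not the paper's architecture: directly maximize an expected reward over discrete structures with REINFORCE. Here the "structure" is a string of digits generated by a factorized categorical policy, and the made-up reward prefers a target digit sum.

```python
# REINFORCE on a factorized categorical policy over digit strings; the reward
# prefers sequences whose digits sum to a target. Toy task and policy.
import numpy as np

rng = np.random.default_rng(6)
L, V, target_sum = 5, 10, 30                     # 5 slots, digits 0..9, goal sum 30

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

logits = np.zeros((L, V))                        # one categorical per slot
lr, batch = 0.5, 64
for step in range(300):
    p = softmax(logits)
    seqs = np.stack([rng.choice(V, size=batch, p=p[i]) for i in range(L)], axis=1)
    reward = -np.abs(seqs.sum(axis=1) - target_sum).astype(float)
    adv = reward - reward.mean()                 # batch-mean baseline
    for i in range(L):
        onehot = np.eye(V)[seqs[:, i]]           # grad log p(seq) w.r.t. logits[i]
        logits[i] += lr * (adv[:, None] * (onehot - p[i])).mean(axis=0)

best = softmax(logits).argmax(axis=1)
print("most likely sequence:", best, "digit sum:", best.sum())
```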
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
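The estimator class being compared is standard, so a compact sketch is possible: a doubly-robust (AIPW) estimate of the average causal effect with two-fold cross-fitting, where nuisance models are fit on one fold and evaluated on the other. The random forests and simulated data below are stand-ins, not the study's exact setup.

```python
# Two-fold cross-fit AIPW (doubly-robust) estimate of the average causal
# effect on simulated data; random forests stand in for any ML learners.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(7)
n = 2000
X = rng.normal(size=(n, 3))
e_true = 1.0 / (1.0 + np.exp(-X[:, 0]))          # true propensity score
A = rng.binomial(1, e_true)
Y = 2.0 * A + X[:, 1] + rng.normal(size=n)       # true ACE = 2

psi = np.zeros(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    # cross-fitting: nuisances are fit on one fold, evaluated on the other
    ps = RandomForestClassifier(random_state=0).fit(X[train], A[train])
    e_hat = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
    XA = np.column_stack([X, A])
    om = RandomForestRegressor(random_state=0).fit(XA[train], Y[train])
    m1 = om.predict(np.column_stack([X[test], np.ones(len(test))]))
    m0 = om.predict(np.column_stack([X[test], np.zeros(len(test))]))
    a, y = A[test], Y[test]
    psi[test] = m1 - m0 + a * (y - m1) / e_hat - (1 - a) * (y - m0) / (1.0 - e_hat)

print("cross-fit AIPW estimate of the ACE:", psi.mean())  # should be near 2
```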
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Automated extraction of mutual independence patterns using Bayesian comparison of partition models [7.6146285961466]
Mutual independence is a key concept in statistics that characterizes the structural relationships between variables.
Existing methods to investigate mutual independence rely on the definition of two competing models.
We propose a general Markov chain Monte Carlo (MCMC) algorithm to numerically approximate the posterior distribution on the space of all patterns of mutual independence.
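The paper's posterior over partition models is involved, so the sketch below is only a loose stand-in: a Gibbs sampler over block labelings of variables with a made-up score that rewards grouping empirically correlated variables. It illustrates MCMC over a partition-like space, not the paper's model or algorithm.

```python
# Gibbs sampler over block labelings with a made-up score rewarding the
# grouping of empirically correlated variables; a loose stand-in only.
import numpy as np

rng = np.random.default_rng(8)
n_var, K = 6, 6
data = rng.normal(size=(500, n_var))
data[:, 1] += data[:, 0]                         # plant dependence: (0,1) and (2,3)
data[:, 3] += data[:, 2]
C = np.abs(np.corrcoef(data, rowvar=False))

def log_score(labels):
    """Toy unnormalized log posterior: same-block pairs score (|corr| - 0.2),
    so weakly correlated pairs are pushed into different blocks."""
    same = labels[:, None] == labels[None, :]
    pairs = same & ~np.eye(n_var, dtype=bool)
    return 3.0 * (C[pairs] - 0.2).sum() / 2.0

labels = rng.integers(K, size=n_var)
for sweep in range(200):                         # Gibbs sweeps over the variables
    for i in range(n_var):
        logp = np.empty(K)
        for k in range(K):
            labels[i] = k
            logp[k] = log_score(labels)
        w = np.exp(logp - logp.max())
        labels[i] = rng.choice(K, p=w / w.sum())
print("final block labeling:", labels)           # (0,1) and (2,3) should co-cluster
```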
arXiv Detail & Related papers (2020-01-15T16:21:48Z)