Learning Ising models from one or multiple samples
- URL: http://arxiv.org/abs/2004.09370v3
- Date: Thu, 10 Dec 2020 16:27:23 GMT
- Title: Learning Ising models from one or multiple samples
- Authors: Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Anthimos Vardis Kandiros
- Abstract summary: We provide guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices.
Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak.
- Score: 26.00403702328348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There have been two separate lines of work on estimating Ising models: (1)
estimating them from multiple independent samples under minimal assumptions
about the model's interaction matrix; and (2) estimating them from one sample
in restrictive settings. We propose a unified framework that smoothly
interpolates between these two settings, enabling significantly richer
estimation guarantees from one, a few, or many samples.
Our main theorem provides guarantees for one-sample estimation, quantifying
the estimation error in terms of the metric entropy of a family of interaction
matrices. As corollaries of our main theorem, we derive bounds when the model's
interaction matrix is a (sparse) linear combination of known matrices, or it
belongs to a finite set, or to a high-dimensional manifold. In fact, our main
result handles multiple independent samples by viewing them as one sample from
a larger model, and can be used to derive estimation bounds that are
qualitatively similar to those obtained in the aforementioned multiple-sample
literature. Our technical approach benefits from sparsifying a model's
interaction network, conditioning on subsets of variables that make the
dependencies in the resulting conditional distribution sufficiently weak. We
use this sparsification technique to prove strong concentration and
anti-concentration results for the Ising model, which we believe have
applications beyond the scope of this paper.
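To make the estimation setting concrete, here is a minimal sketch fitting the interaction matrix by maximum pseudo-likelihood, a standard estimator in this literature (not necessarily the paper's exact procedure), in the corollary setting where the matrix is a linear combination of known matrices. The basis matrices, Gibbs sampler, and optimizer below are illustrative choices.

```python
# A minimal sketch, assuming J(beta) = sum_k beta_k * A_k with known,
# symmetric, zero-diagonal basis matrices A_k: estimate beta from a single
# sample x in {-1,+1}^n by maximum pseudo-likelihood.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, K = 50, 3

def random_basis(n):
    A = rng.normal(size=(n, n)) / np.sqrt(n)
    A = (A + A.T) / 2.0
    np.fill_diagonal(A, 0.0)
    return A

basis = [random_basis(n) for _ in range(K)]   # hypothetical known matrices

def neg_log_pseudolikelihood(beta, x):
    # -sum_i log P(x_i | x_{-i}), with P(x_i | x_{-i}) = sigmoid(2 x_i (Jx)_i)
    J = sum(b * A for b, A in zip(beta, basis))
    h = J @ x
    return np.sum(np.logaddexp(0.0, -2.0 * x * h))

# Draw one sample from a ground-truth model via Gibbs sampling.
beta_true = np.array([0.8, -0.5, 0.3])
J_true = sum(b * A for b, A in zip(beta_true, basis))
x = rng.choice([-1.0, 1.0], size=n)
for _ in range(500):                           # Gibbs sweeps
    for i in range(n):
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * (J_true[i] @ x)))  # diag is zero
        x[i] = 1.0 if rng.random() < p_plus else -1.0

res = minimize(neg_log_pseudolikelihood, np.zeros(K), args=(x,),
               method="L-BFGS-B")
print("true beta:     ", beta_true)
print("estimated beta:", res.x)
```

A single sample is informative here because the dimension n, not the number of samples, grows; this is the regime the one-sample guarantees above quantify.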
Related papers
- Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution [2.1146241717926664]
We show that the Wasserstein GAN, constrained to left-invertible push-forward maps, generates distributions that avoid replication and significantly deviate from the empirical distribution.
Our most important contribution provides a finite-sample lower bound on the Wasserstein-1 distance between the generative distribution and the empirical one.
We also establish a finite-sample upper bound on the distance between the generative distribution and the true data-generating one.
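The quantity these bounds control can at least be measured empirically; below is a one-dimensional sketch using SciPy's closed-form Wasserstein-1 distance, with synthetic stand-ins for the data and the generator.

```python
# Measuring the Wasserstein-1 distance the bounds above concern, in 1-D
# where it reduces to a comparison of sorted samples.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
data = rng.normal(0.0, 1.0, size=1000)          # "empirical" sample
generated = rng.normal(0.1, 1.1, size=1000)     # stand-in "generative" sample

print("W1(generated, empirical):", wasserstein_distance(generated, data))
# A generator that merely replicates the data would drive this to zero --
# exactly the behavior the paper's finite-sample lower bound rules out.
print("W1(data, data):", wasserstein_distance(data, data))
```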
arXiv Detail & Related papers (2023-07-31T06:11:57Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
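One common instantiation of these choices (representation, distance, and sample count) is the Fréchet distance between Gaussian fits of embedded samples, as in FID; in the sketch below, random vectors stand in for real network embeddings.

```python
# Frechet distance between Gaussian fits of two feature sets (the FID
# recipe); the random "embeddings" are placeholders for a real extractor.
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):      # drop tiny imaginary numerical noise
        covmean = covmean.real
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean)

rng = np.random.default_rng(2)
real_feats = rng.normal(size=(500, 16))
fake_feats = rng.normal(loc=0.2, size=(500, 16))
print("Frechet distance:", frechet_distance(real_feats, fake_feats))
```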
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Bayesian predictive modeling of multi-source multi-way data [0.0]
We consider molecular data from multiple 'omics sources as predictors of early-life iron deficiency (ID) in a rhesus monkey model.
We use a linear model with a low-rank structure on the coefficients to capture multi-way dependence.
We show that our model performs as expected in terms of misclassification rates and correlation of estimated coefficients with true coefficients.
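A stripped-down, non-Bayesian sketch of the low-rank idea: a rank-1 coefficient matrix B = u vᵀ for matrix-valued predictors, fit by alternating least squares. The data, rank, and sizes are illustrative.

```python
# A rank-1 coefficient matrix B = u v^T for matrix-valued predictors X_i,
# fit by alternating least squares on y_i = <B, X_i> = u^T X_i v.
import numpy as np

rng = np.random.default_rng(3)
N, p, q = 200, 10, 8
X = rng.normal(size=(N, p, q))
u_true, v_true = rng.normal(size=p), rng.normal(size=q)
y = np.einsum("npq,p,q->n", X, u_true, v_true) + 0.1 * rng.normal(size=N)

u, v = rng.normal(size=p), rng.normal(size=q)
for _ in range(50):
    Zu = np.einsum("npq,q->np", X, v)       # with v fixed, y is linear in u
    u = np.linalg.lstsq(Zu, y, rcond=None)[0]
    Zv = np.einsum("npq,p->nq", X, u)       # with u fixed, y is linear in v
    v = np.linalg.lstsq(Zv, y, rcond=None)[0]

B_hat, B_true = np.outer(u, v), np.outer(u_true, v_true)
print("correlation of estimated vs. true coefficients:",
      np.corrcoef(B_hat.ravel(), B_true.ravel())[0, 1])
```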
arXiv Detail & Related papers (2022-08-05T21:58:23Z)
- CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator [60.799183326613395]
We propose an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples.
CARMS combines REINFORCE with copula based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling.
We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find it to outperform competing methods including a strong self-control baseline.
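CARMS itself couples the samples antithetically via a copula; the sketch below shows only its plainer ingredient, multi-sample REINFORCE with a leave-one-out baseline for a categorical variable (the objective and sizes are toy choices).

```python
# Multi-sample REINFORCE with a leave-one-out baseline for a categorical
# z ~ Categorical(softmax(logits)), estimating d E[f(z)] / d logits.
# The baseline is independent of sample k, so the estimator stays unbiased.
import numpy as np

rng = np.random.default_rng(4)
C, K = 5, 8                           # categories, samples per estimate
logits = rng.normal(size=C)
f = lambda z: float((z - 2) ** 2)     # toy objective E[f(z)]

def grad_estimate(logits, K):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    z = rng.choice(C, size=K, p=p)
    fz = np.array([f(zi) for zi in z])
    grad = np.zeros(C)
    for k in range(K):
        baseline = (fz.sum() - fz[k]) / (K - 1)   # leave-one-out baseline
        score = -p.copy()             # d log p(z_k) / d logits = onehot - p
        score[z[k]] += 1.0
        grad += (fz[k] - baseline) * score
    return grad / K

print("gradient estimate:", grad_estimate(logits, K))
```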
arXiv Detail & Related papers (2021-10-26T20:14:30Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
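The two-step recipe can be illustrated with a kernel density estimate standing in for the paper's PSD models; this does not reproduce their few-evaluation guarantees, only the model-then-sample structure.

```python
# Step 1: fit a density model; step 2: sample from it. A Gaussian KDE
# stands in here for the paper's PSD models.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)
observed = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(1, 1.0, 700)])

kde = gaussian_kde(observed)                   # step 1: model the density
new_samples = kde.resample(1000, seed=6)[0]    # step 2: sample from the model
print("model density at 0:", kde(np.array([0.0]))[0])
print("mean of new samples:", new_samples.mean())
```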
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Explaining predictive models using Shapley values and non-parametric vine copulas [2.6774008509840996]
We propose two new approaches for modelling the dependence between the features.
The performance of the proposed methods is evaluated on simulated data sets and a real data set.
Experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
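For contrast, here is the standard permutation-sampling Shapley estimator, which imputes off-coalition features from the data marginals, i.e. under exactly the feature-independence assumption that the vine-copula approaches above are designed to remove. The predictor and data are toy stand-ins.

```python
# Permutation-sampling Shapley values; off-coalition features are drawn
# from a background sample, implicitly assuming feature independence.
import numpy as np

rng = np.random.default_rng(7)
N, d = 1000, 4
X = rng.normal(size=(N, d))
model = lambda Z: Z[:, 0] + 2.0 * Z[:, 1] * Z[:, 2]   # toy predictor

def shapley_values(x, X_background, n_perm=500):
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)
        z = X_background[rng.integers(len(X_background))].copy()
        prev = model(z[None, :])[0]
        for j in order:
            z[j] = x[j]                 # feature j joins the coalition
            cur = model(z[None, :])[0]
            phi[j] += cur - prev
            prev = cur
    return phi / n_perm

phi = shapley_values(X[0], X)
print("Shapley values:", phi)
print("efficiency check, phi.sum() ~ f(x) - E[f(X)]:", phi.sum())
```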
arXiv Detail & Related papers (2021-02-12T09:43:28Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
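A plain finite mixture of linear regressions, fit by EM, is the simplest relative of the model above (no robustness, a single continuous target; all settings below are toy).

```python
# EM for a two-component mixture of linear regressions.
import numpy as np

rng = np.random.default_rng(8)
N, d, K = 600, 3, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, d - 1))])
true_w = np.array([[1.0, 2.0, -1.0], [-2.0, 0.5, 1.5]])
z = rng.integers(K, size=N)
y = np.einsum("nd,nd->n", X, true_w[z]) + 0.3 * rng.normal(size=N)

w = rng.normal(size=(K, d))
pi, sigma2 = np.full(K, 1.0 / K), np.ones(K)
for _ in range(100):
    # E-step: responsibilities r[n, k] ~ pi_k * N(y_n | x_n w_k, sigma2_k)
    resid = y[:, None] - X @ w.T
    logr = np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2) - resid**2 / (2 * sigma2)
    logr -= logr.max(axis=1, keepdims=True)
    r = np.exp(logr)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: weighted least squares per component
    for k in range(K):
        Wk = r[:, k]
        A = X.T @ (Wk[:, None] * X)
        w[k] = np.linalg.solve(A, X.T @ (Wk * y))
        sigma2[k] = (Wk * (y - X @ w[k]) ** 2).sum() / Wk.sum()
    pi = r.mean(axis=0)

print("estimated weights:\n", w)
```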
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
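A toy version of the objective: directly maximize expected reward over discrete structures with REINFORCE. Here the "structure" is a fixed-length token string and the reward counts matches with a target; the factorized policy is a deliberate simplification of a conditional generative model.

```python
# REINFORCE over fixed-length token strings; reward = matches with target.
import numpy as np

rng = np.random.default_rng(9)
vocab, length = 5, 6
target = rng.integers(vocab, size=length)
logits = np.zeros((length, vocab))            # independent categorical policy

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

lr, batch = 0.5, 32
for step in range(300):
    p = softmax(logits)
    samples = np.array([[rng.choice(vocab, p=p[t]) for t in range(length)]
                        for _ in range(batch)])
    rewards = (samples == target).sum(axis=1).astype(float)
    adv = rewards - rewards.mean()            # simple mean baseline
    grad = np.zeros_like(logits)
    for s, a in zip(samples, adv):
        for t in range(length):
            g = -p[t].copy()
            g[s[t]] += 1.0                    # d log p(s_t) / d logits_t
            grad[t] += a * g
    logits += lr * grad / batch

print("target:       ", target)
print("argmax policy:", softmax(logits).argmax(axis=1))
```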
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
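A sketch of a cross-fit doubly-robust (AIPW) estimator of the ACE, with generic scikit-learn learners as the machine-learning nuisance models; the simulated data and learner choices are illustrative.

```python
# Cross-fit AIPW: nuisances fit on training folds, the influence-function
# terms evaluated on the held-out fold, then averaged across all units.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(10)
N = 2000
X = rng.normal(size=(N, 4))
propensity = 1 / (1 + np.exp(-X[:, 0]))
A = rng.binomial(1, propensity)
Y = 2.0 * A + X[:, 1] + rng.normal(size=N)    # true ACE = 2.0

psi = np.zeros(N)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    e = GradientBoostingClassifier().fit(X[train], A[train])
    m1 = GradientBoostingRegressor().fit(X[train][A[train] == 1],
                                         Y[train][A[train] == 1])
    m0 = GradientBoostingRegressor().fit(X[train][A[train] == 0],
                                         Y[train][A[train] == 0])
    eh = np.clip(e.predict_proba(X[test])[:, 1], 0.01, 0.99)
    m1h, m0h = m1.predict(X[test]), m0.predict(X[test])
    psi[test] = (m1h - m0h
                 + A[test] * (Y[test] - m1h) / eh
                 - (1 - A[test]) * (Y[test] - m0h) / (1 - eh))

print("cross-fit AIPW estimate of the ACE:", psi.mean())
```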
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Automated extraction of mutual independence patterns using Bayesian comparison of partition models [7.6146285961466]
Mutual independence is a key concept in statistics that characterizes the structural relationships between variables.
Existing methods to investigate mutual independence rely on the definition of two competing models.
We propose a general Markov chain Monte Carlo (MCMC) algorithm to numerically approximate the posterior distribution on the space of all patterns of mutual independence.
arXiv Detail & Related papers (2020-01-15T16:21:48Z)
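A minimal Metropolis-Hastings sketch in this spirit: random-walk moves over block labelings of variables, scored by a Gaussian block-independence likelihood with a BIC-style penalty. The score and moves are simplified stand-ins for the paper's algorithm.

```python
# MH over partitions of variables into mutually independent blocks.
import numpy as np

rng = np.random.default_rng(11)
N, d = 500, 5
X = rng.normal(size=(N, d))
X[:, 1] += X[:, 0]                    # variables 0 and 1 are dependent

def blocks_of(labels):
    return [np.where(labels == g)[0] for g in np.unique(labels)]

def log_score(labels):
    # Gaussian log-likelihood under block independence, BIC-penalized.
    total = 0.0
    for b in blocks_of(labels):
        S = np.cov(X[:, b], rowvar=False).reshape(len(b), len(b))
        _, logdet = np.linalg.slogdet(S)
        total += -0.5 * N * logdet
        total -= 0.5 * np.log(N) * len(b) * (len(b) + 1) / 2.0
    return total

labels = np.arange(d)                 # start with all variables independent
cur = log_score(labels)
for _ in range(3000):
    prop = labels.copy()
    prop[rng.integers(d)] = rng.integers(d)   # relabel one variable
    new = log_score(prop)
    if np.log(rng.random()) < new - cur:      # symmetric proposal
        labels, cur = prop, new

print("inferred independence pattern:", [list(b) for b in blocks_of(labels)])
```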