Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators
with Massive Data
- URL: http://arxiv.org/abs/2005.10435v3
- Date: Mon, 5 Jul 2021 15:32:22 GMT
- Title: Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators
with Massive Data
- Authors: Jun Yu, HaiYing Wang, Mingyao Ai and Huiming Zhang
- Abstract summary: Existing methods mostly focus on subsampling with replacement due to its high computational efficiency.
We first derive optimal subsampling probabilities in the context of quasi-likelihood estimation.
We develop a distributed subsampling framework, in which statistics are computed simultaneously on smaller partitions of the full data.
- Score: 20.79270369203348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nonuniform subsampling methods are effective to reduce computational burden
and maintain estimation efficiency for massive data. Existing methods mostly
focus on subsampling with replacement due to its high computational efficiency.
If the data volume is so large that nonuniform subsampling probabilities cannot
be calculated all at once, then subsampling with replacement is infeasible to
implement. This paper solves this problem using Poisson subsampling. We first
derive optimal Poisson subsampling probabilities in the context of
quasi-likelihood estimation under the A- and L-optimality criteria. For a
practically implementable algorithm with approximated optimal subsampling
probabilities, we establish the consistency and asymptotic normality of the
resultant estimators. To deal with the situation that the full data are stored
in different blocks or at multiple locations, we develop a distributed
subsampling framework, in which statistics are computed simultaneously on
smaller partitions of the full data. Asymptotic properties of the resultant
aggregated estimator are investigated. We illustrate and evaluate the proposed
strategies through numerical experiments on simulated and real data sets.
Related papers
- On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates [5.13323375365494]
We provide theoretical guarantees for the convergence behaviour of diffusion-based generative models under strongly log-concave data.
Our class of functions used for score estimation is made of Lipschitz continuous functions avoiding any Lipschitzness assumption on the score function.
This approach yields the best known convergence rate for our sampling algorithm.
arXiv Detail & Related papers (2023-11-22T18:40:45Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation(NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study it a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the sample regime and in the finite regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Selection of the Most Probable Best [2.1095005405219815]
We consider an expected-value ranking and selection (R&S) problem where all k solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution.
We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution.
We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove the algorithms' sampling ratios achieve the conditions as the simulation budget increases.
arXiv Detail & Related papers (2022-07-15T15:27:27Z) - Near-optimal estimation of smooth transport maps with kernel
sums-of-squares [81.02564078640275]
Under smoothness conditions, the squared Wasserstein distance between two distributions could be efficiently computed with appealing statistical error upper bounds.
The object of interest for applications such as generative modeling is the underlying optimal transport map.
We propose the first tractable algorithm for which the statistical $L2$ error on the maps nearly matches the existing minimax lower-bounds for smooth map estimation.
arXiv Detail & Related papers (2021-12-03T13:45:36Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Maximum sampled conditional likelihood for informative subsampling [4.708378681950648]
Subsampling is a computationally effective approach to extract information from massive data sets when computing resources are limited.
We propose to use the maximum maximum conditional likelihood estimator (MSCLE) based on the sampled data.
arXiv Detail & Related papers (2020-11-11T16:01:17Z) - Scalable Distributed Approximation of Internal Measures for Clustering
Evaluation [5.144809478361603]
Internal measure for clustering evaluation is the silhouette coefficient, whose computation requires a quadratic number of distance calculations.
We present the first scalable algorithm to compute such a rigorous approximation for the evaluation of clusterings based on any metric distances.
We also prove that the algorithm can be adapted to obtain rigorous approximations of other internal measures of clustering quality, such as cohesion and separation.
arXiv Detail & Related papers (2020-03-03T10:28:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.