Distributed Nonparametric Estimation: from Sparse to Dense Samples per Terminal
- URL: http://arxiv.org/abs/2501.07879v1
- Date: Tue, 14 Jan 2025 06:41:55 GMT
- Title: Distributed Nonparametric Estimation: from Sparse to Dense Samples per Terminal
- Authors: Deheng Yuan, Tao Guo, Zhongyi Huang
- Abstract summary: We characterize the minimax optimal rates for all regimes, and identify phase transitions of the optimal rates as the samples per terminal vary from sparse to dense.
This fully solves the problem left open by previous works, whose scopes are limited to regimes with either dense samples or a single sample per terminal.
The optimal rates are immediate for various special cases such as density estimation, Gaussian, binary, Poisson and heteroskedastic regression models.
- Score: 9.766173684831324
- License:
- Abstract: Consider the communication-constrained problem of nonparametric function estimation, in which each distributed terminal holds multiple i.i.d. samples. Under certain regularity assumptions, we characterize the minimax optimal rates for all regimes, and identify phase transitions of the optimal rates as the samples per terminal vary from sparse to dense. This fully solves the problem left open by previous works, whose scopes are limited to regimes with either dense samples or a single sample per terminal. To achieve the optimal rates, we design a layered estimation protocol by exploiting protocols for the parametric density estimation problem. We show the optimality of the protocol using information-theoretic methods and strong data processing inequalities, and incorporating the classic balls and bins model. The optimal rates are immediate for various special cases such as density estimation, Gaussian, binary, Poisson and heteroskedastic regression models.
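For orientation only, and under generic notation that is an assumption of this summary rather than the paper's exact statement ($m$ terminals, $n$ i.i.d. samples per terminal, smoothness $s$, squared $L^2$ risk), the sketch below recalls the classical centralized benchmark against which the communication-constrained rates are measured.

```latex
% Generic notation (an assumption of this sketch, not the paper's statement):
%   m = number of terminals, n = i.i.d. samples per terminal,
%   s = smoothness of the target function, risk measured in squared L^2 norm.
% Classical centralized minimax benchmark with all mn samples pooled:
\[
  \inf_{\hat f}\ \sup_{f \in \mathcal{F}_s}\ \mathbb{E}\,\lVert \hat f - f \rVert_2^2
  \;\asymp\; (mn)^{-\frac{2s}{2s+1}} .
\]
% With a finite per-terminal communication budget the minimax risk can only be
% larger; the paper's contribution is to determine the exact rate in every
% regime of n and the budget, including the phase transitions between regimes.
```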
Related papers
- Conditional simulation via entropic optimal transport: Toward non-parametric estimation of conditional Brenier maps [13.355769319031184]
Conditional simulation is a fundamental task in statistical modeling.
One promising approach is to construct conditional Brenier maps, where the components of the map pushforward a reference distribution to conditionals of the target.
We propose a non-parametric estimator for conditional Brenier maps based on the computational scalability of entropic optimal transport.
arXiv Detail & Related papers (2024-11-11T17:32:47Z)
- Adaptive Refinement Protocols for Distributed Distribution Estimation under $\ell^p$-Losses [9.766173684831324]
Consider the communication-constrained estimation of discrete distributions under $\ell^p$ losses.
We obtain the minimax optimal rates of the problem in most parameter regimes.
arXiv Detail & Related papers (2024-10-09T13:46:08Z)
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Nearest Neighbor Sampling for Covariate Shift Adaptation [7.940293148084844]
We propose a new covariate shift adaptation method which avoids estimating the weights.
The basic idea is to directly work on unlabeled target data, labeled according to the $k$-nearest neighbors in the source dataset.
Our experiments show that it achieves a drastic reduction in running time with remarkable accuracy.
arXiv Detail & Related papers (2023-12-15T17:28:09Z)
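A minimal runnable sketch of the nearest-neighbor idea summarized in the entry above. The function name, the scikit-learn estimators, and the toy data are illustrative assumptions of this summary, not taken from the paper: each unlabeled target point is pseudo-labeled by its $k$ nearest neighbors in the labeled source data, and the downstream model is then fit directly on the pseudo-labeled target sample, so no importance weights are estimated.

```python
# Hypothetical sketch (not the paper's implementation): pseudo-label unlabeled
# target-domain points via their k nearest neighbors in the labeled source data,
# then fit a downstream model on the pseudo-labeled target sample.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

def knn_covariate_shift_adapt(X_source, y_source, X_target, k=5):
    # Label each target point by a majority vote among its k nearest
    # source neighbors; no importance weights are estimated.
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_source, y_source)
    y_target_pseudo = knn.predict(X_target)

    # Train the final predictor directly on the (pseudo-labeled) target data,
    # so it is fit under the target covariate distribution.
    model = LogisticRegression(max_iter=1000)
    model.fit(X_target, y_target_pseudo)
    return model

# Toy usage with synthetic data exhibiting covariate shift.
rng = np.random.default_rng(0)
X_s = rng.normal(0.0, 1.0, size=(500, 2))
y_s = (X_s[:, 0] + X_s[:, 1] > 0).astype(int)
X_t = rng.normal(1.0, 1.0, size=(300, 2))  # shifted covariate distribution
adapted_model = knn_covariate_shift_adapt(X_s, y_s, X_t, k=5)
```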
- On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates [5.13323375365494]
We provide theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data.
The class of functions used for score estimation consists of Lipschitz continuous functions, avoiding any Lipschitzness assumption on the score function itself.
This approach yields the best known convergence rate for our sampling algorithm.
arXiv Detail & Related papers (2023-11-22T18:40:45Z)
- Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
- Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z)
- A Non-Classical Parameterization for Density Estimation Using Sample Moments [0.0]
We propose a non-classical parametrization for density estimation using sample moments.
The proposed estimator is the first one in the literature for which the power moments up to an arbitrary even order exactly match the sample moments.
arXiv Detail & Related papers (2022-01-13T04:28:52Z)
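To make the moment-matching claim in the entry above concrete, the sketch below only computes the empirical power moments that a fitted density would be required to reproduce; it deliberately does not implement the paper's non-classical parameterization, and all names are illustrative.

```python
# Illustrative only: the empirical power moments m_k = (1/n) * sum_i x_i^k,
# up to an even order, are the quantities the estimator is claimed to match.
import numpy as np

def sample_power_moments(x, max_order):
    """Return the empirical power moments of orders 1..max_order."""
    x = np.asarray(x, dtype=float)
    return np.array([np.mean(x ** k) for k in range(1, max_order + 1)])

rng = np.random.default_rng(1)
x = rng.normal(loc=0.5, scale=2.0, size=1000)
m_hat = sample_power_moments(x, max_order=6)  # even max order, as in the claim
print(m_hat)
```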
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.