Isotonic regression with unknown permutations: Statistics, computation, and adaptation
- URL: http://arxiv.org/abs/2009.02609v2
- Date: Thu, 24 Jun 2021 13:58:37 GMT
- Title: Isotonic regression with unknown permutations: Statistics, computation, and adaptation
- Authors: Ashwin Pananjady, Richard J. Samworth
- Abstract summary: We study the minimax risk of estimation (in empirical $L_2$ loss) and the fundamental limits of adaptation (quantified by the adaptivity index).
We provide a computationally efficient Mirsky partition estimator that is minimax optimal while also achieving the smallest adaptivity index possible for polynomial-time procedures.
In a complementary direction, we show that natural modifications of existing estimators fail to satisfy at least one of the desiderata of optimal worst-case statistical performance, computational efficiency, and fast adaptation.
- Score: 13.96377843988598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by models for multiway comparison data, we consider the problem of
estimating a coordinate-wise isotonic function on the domain $[0, 1]^d$ from
noisy observations collected on a uniform lattice, but where the design points
have been permuted along each dimension. While the univariate and bivariate
versions of this problem have received significant attention, our focus is on
the multivariate case $d \geq 3$. We study both the minimax risk of estimation
(in empirical $L_2$ loss) and the fundamental limits of adaptation (quantified
by the adaptivity index) to a family of piecewise constant functions. We
provide a computationally efficient Mirsky partition estimator that is minimax
optimal while also achieving the smallest adaptivity index possible for
polynomial time procedures. Thus, from a worst-case perspective and in sharp
contrast to the bivariate case, the latent permutations in the model do not
introduce significant computational difficulties over and above vanilla
isotonic regression. On the other hand, the fundamental limits of adaptation
are significantly different with and without unknown permutations: Assuming a
hardness conjecture from average-case complexity theory, a
statistical-computational gap manifests in the former case. In a complementary
direction, we show that natural modifications of existing estimators fail to
satisfy at least one of the desiderata of optimal worst-case statistical
performance, computational efficiency, and fast adaptation. Along the way to
showing our results, we improve adaptation results in the special case $d = 2$
and establish some properties of estimators for vanilla isotonic regression,
both of which may be of independent interest.
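For intuition, the univariate analogue of the model can be simulated in a few lines: one observes $y_i = f(x_{\pi(i)}) + \epsilon_i$ on a grid with the permutation $\pi$ unknown. Below is a minimal Python sketch of a generic two-stage recipe (rank the design points using one sample split, then run vanilla isotonic regression on the other); it illustrates the problem setup only and is not the authors' Mirsky partition estimator.
```python
# A minimal sketch of the univariate analogue (NOT the paper's Mirsky
# partition estimator): estimate the unknown permutation from one sample
# split, then run vanilla isotonic regression on the other split.
import numpy as np

def pava(y):
    """Pool adjacent violators: L2 projection onto nondecreasing sequences."""
    vals, wts = [], []
    for v in np.asarray(y, dtype=float):
        vals.append(v)
        wts.append(1.0)
        # merge blocks from the right while monotonicity is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            w = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / w
            wts[-2] = w
            del vals[-1], wts[-1]
    return np.repeat(vals, np.asarray(wts, dtype=int))

rng = np.random.default_rng(0)
n = 200
f = np.sqrt(np.linspace(0.0, 1.0, n))       # true nondecreasing signal on the grid
pi = rng.permutation(n)                     # unknown permutation of the design
y1 = f[pi] + 0.3 * rng.standard_normal(n)   # split 1: used only to rank design points
y2 = f[pi] + 0.3 * rng.standard_normal(n)   # split 2: used for the isotonic fit

order = np.argsort(y1)                      # crude permutation estimate
f_hat = pava(y2[order])                     # vanilla isotonic regression after re-ordering
```
Sample splitting is what keeps the second stage honest: sorting and fitting on the same observations would make the isotonic projection a no-op.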
Related papers
- Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit.
Our focus is on the generalization error and variable selection properties of the estimators.
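As a concrete, purely illustrative companion to this summary, the following sketch fits an $\ell_1$-regularized logistic classifier to a synthetic high-dimensional Gaussian mixture and reports test error and the selected variables; the dimensions and regularization level are arbitrary choices, not taken from the paper.
```python
# Illustrative only: L1-regularized logistic classification of a sparse
# two-component Gaussian mixture in high dimension (all sizes arbitrary).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, d, s = 400, 1000, 10                 # samples, ambient dimension, sparsity
mu = np.zeros(d)
mu[:s] = 0.5                            # sparse difference between class means

def sample(m):
    labels = 2 * rng.integers(0, 2, size=m) - 1           # labels in {-1, +1}
    X = labels[:, None] * mu + rng.standard_normal((m, d))
    return X, labels

X_tr, y_tr = sample(n)
X_te, y_te = sample(2000)
clf = LogisticRegression(penalty="l1", C=0.1, solver="liblinear").fit(X_tr, y_tr)
print("generalization error:", np.mean(clf.predict(X_te) != y_te))
print("selected variables:", np.flatnonzero(clf.coef_[0]))
```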
arXiv Detail & Related papers (2024-10-25T19:58:36Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
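For readers unfamiliar with matching, here is a minimal one-nearest-neighbor matching estimator of an average treatment effect; it is a generic textbook construction, not the paper's modified estimators.
```python
# A generic 1-nearest-neighbor matching estimator of the average treatment
# effect (textbook construction; the paper's modifications are not shown).
import numpy as np
from scipy.spatial import cKDTree

def matching_ate(X, y, t):
    """X: covariates (n, p); y: outcomes (n,); t: boolean treatment flags (n,)."""
    idx_c = cKDTree(X[~t]).query(X)[1]     # nearest control for every unit
    idx_t = cKDTree(X[t]).query(X)[1]      # nearest treated unit for every unit
    y0 = np.where(t, y[~t][idx_c], y)      # impute missing control outcomes
    y1 = np.where(t, y, y[t][idx_t])       # impute missing treated outcomes
    return np.mean(y1 - y0)

rng = np.random.default_rng(2)
n = 1000
X = rng.standard_normal((n, 3))
t = rng.random(n) < 0.5
y = X @ np.array([1.0, -0.5, 0.2]) + 2.0 * t + rng.standard_normal(n)
print(matching_ate(X, y, t))               # should land near the true effect 2.0
```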
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Nearest Neighbor Sampling for Covariate Shift Adaptation [7.940293148084844]
We propose a new covariate shift adaptation method which avoids estimating importance weights.
The basic idea is to directly work on unlabeled target data, labeled according to the $k$-nearest neighbors in the source dataset.
Our experiments show that it achieves a drastic reduction in running time while maintaining remarkable accuracy.
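The stated idea translates almost directly into code. The sketch below, with arbitrary synthetic data, pseudo-labels target points by a $k$-NN vote over the source sample and then trains a downstream classifier on the target covariates; details such as the choice of $k$ and the final model are assumptions, not the paper's exact procedure.
```python
# Sketch of k-NN pseudo-labeling for covariate shift (choices of k and of
# the final classifier are illustrative assumptions).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

def nn_covariate_shift(X_src, y_src, X_tgt, k=5):
    """Pseudo-label target points by a k-NN vote over labeled source data."""
    pseudo = KNeighborsClassifier(n_neighbors=k).fit(X_src, y_src).predict(X_tgt)
    # the final classifier is trained on target covariates directly, so no
    # importance weights over the source distribution are ever estimated
    return LogisticRegression().fit(X_tgt, pseudo)

rng = np.random.default_rng(1)
X_src = rng.normal(0.0, 1.0, size=(500, 2))          # source design
y_src = (X_src.sum(axis=1) > 0).astype(int)
X_tgt = rng.normal(0.7, 1.0, size=(500, 2))          # shifted target design
model = nn_covariate_shift(X_src, y_src, X_tgt)
```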
arXiv Detail & Related papers (2023-12-15T17:28:09Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or data drawn from a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the induced tessellation of the input space and approximate the multiple-hypotheses target distribution.
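As a rough illustration of the multiple-hypotheses idea (and nothing more: the paper's structured model and its tessellation analysis are richer), the sketch below fits two RBF-feature regression heads with winner-takes-all assignments on bimodal synthetic targets.
```python
# Bare-bones multiple-hypotheses regression with RBF features and
# winner-takes-all training (illustrative; not the paper's structured model).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
X = rng.random((500, 1))
sign = np.where(rng.random(500) < 0.5, 1.0, -1.0)
y = sign * np.sin(6.0 * X[:, 0]) + 0.05 * rng.standard_normal(500)  # bimodal targets

centers = KMeans(n_clusters=15, n_init=10, random_state=0).fit(X).cluster_centers_
Phi = np.exp(-10.0 * (X - centers.T) ** 2)       # (500, 15) RBF design matrix

heads = [0.1 * rng.standard_normal(Phi.shape[1]) for _ in range(2)]
for _ in range(25):
    preds = np.stack([Phi @ w for w in heads], axis=1)
    assign = np.argmin((preds - y[:, None]) ** 2, axis=1)  # winner takes all
    for m in range(len(heads)):
        mask = assign == m
        if mask.any():                            # refit each head on its winners
            heads[m], *_ = np.linalg.lstsq(Phi[mask], y[mask], rcond=None)
```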
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
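A compact sketch of the NCE objective itself may help: for a one-dimensional unnormalized Gaussian model, classify data against known noise with the logistic loss, treating the log-normalizer as a free parameter. The model, noise distribution, and grid search are illustrative assumptions; the paper's compositional-optimization treatment is not reproduced here.
```python
# Illustrative NCE for a 1-D unnormalized Gaussian: the logistic loss of
# real data vs. artificial noise, with the log-normalizer c as a free
# parameter (model and noise choices are assumptions for this sketch).
import numpy as np

rng = np.random.default_rng(4)
x_data = rng.normal(2.0, 1.0, size=5000)     # samples from the unknown model
x_noise = rng.normal(0.0, 3.0, size=5000)    # samples from a known noise law

def log_noise(x):
    return -0.5 * (x / 3.0) ** 2 - np.log(3.0 * np.sqrt(2.0 * np.pi))

def nce_loss(params):
    mu, c = params                           # c stands in for -log(partition)
    def log_model(x):
        return -0.5 * (x - mu) ** 2 + c
    # logistic loss of classifying data vs. noise via the log-density ratio
    loss_d = np.logaddexp(0.0, -(log_model(x_data) - log_noise(x_data))).mean()
    loss_n = np.logaddexp(0.0, log_model(x_noise) - log_noise(x_noise)).mean()
    return loss_d + loss_n

grid = [(mu, c) for mu in np.linspace(0.0, 4.0, 41) for c in np.linspace(-3.0, 1.0, 41)]
mu_hat, c_hat = min(grid, key=nce_loss)      # crude grid search, for illustration
print("estimated mean and log-normalizer:", mu_hat, c_hat)
```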
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Adaptive LASSO estimation for functional hidden dynamic geostatistical
model [69.10717733870575]
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hidden dynamic geostatistical models (f-HDGM).
The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selection operator (LASSO) penalty function, wherein the weights are obtained from the unpenalised f-HDGM maximum-likelihood estimators.
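The adaptive-weighting mechanism is easy to demonstrate outside the f-HDGM setting. The sketch below applies it to plain sparse linear regression: weights come from an unpenalised preliminary fit, so coordinates with large preliminary estimates are penalized less; all model details here are placeholders.
```python
# Adaptive LASSO outside the f-HDGM setting: plain sparse linear regression,
# with penalty weights taken from an unpenalised preliminary fit.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(5)
n, d = 200, 20
beta = np.zeros(d)
beta[:3] = [2.0, -1.5, 1.0]                      # sparse truth
X = rng.standard_normal((n, d))
y = X @ beta + 0.5 * rng.standard_normal(n)

w = 1.0 / (np.abs(LinearRegression().fit(X, y).coef_) + 1e-8)  # adaptive weights
# lasso on rescaled columns is equivalent to a weighted-L1 penalty on beta
gamma = Lasso(alpha=0.05).fit(X / w, y).coef_
beta_hat = gamma / w
print("selected coordinates:", np.flatnonzero(np.abs(beta_hat) > 1e-6))
```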
arXiv Detail & Related papers (2022-08-10T19:17:45Z) - Instance-optimality in optimal value estimation: Adaptivity via
variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that an instance-dependent functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
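The variance-reduction mechanism can be sketched in tabular form: recenter each sampled Bellman update around a reference point at which the Bellman operator has been estimated from a large batch. This is an illustration of the recentering idea under assumed names and stepsizes, not the paper's exact algorithm.
```python
# A tabular sketch of the recentering (control-variate) idea behind
# variance-reduced Q-learning; names, sizes, and stepsizes are assumptions.
import numpy as np

def vr_q_update(Q, Qbar, TQbar_batch, s, a, r, s_next, gamma=0.9, lr=0.1):
    """One update from transition (s, a, r, s_next).

    TQbar_batch is a batch (low-noise) estimate of the Bellman operator
    applied to the reference point Qbar.
    """
    t_q = r + gamma * Q[s_next].max()        # sampled Bellman target at Q
    t_qbar = r + gamma * Qbar[s_next].max()  # sampled Bellman target at Qbar
    # recentred target: sampled(Q) - sampled(Qbar) + batch-estimate(Qbar)
    target = t_q - t_qbar + TQbar_batch[s, a]
    Q[s, a] += lr * (target - Q[s, a])
    return Q

S, A = 5, 2
Q = np.zeros((S, A))
Qbar = Q.copy()                              # reference point (e.g., a past iterate)
TQbar_batch = np.zeros((S, A))               # would be a Monte Carlo estimate at Qbar
Q = vr_q_update(Q, Qbar, TQbar_batch, s=0, a=1, r=1.0, s_next=2)
```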
arXiv Detail & Related papers (2021-06-28T00:38:54Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields guarantees on statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - Amortized variance reduction for doubly stochastic objectives [17.064916635597417]
Approximate inference in complex probabilistic models requires optimisation of doubly stochastic objective functions.
Current approaches do not take into account how mini-batch stochasticity affects sampling stochasticity, resulting in sub-optimal variance reduction.
We propose a new approach in which we use a recognition network to cheaply approximate the optimal control variate for each mini-batch, with no additional gradient computations.
arXiv Detail & Related papers (2020-03-09T13:23:14Z)