Estimation Beyond Data Reweighting: Kernel Method of Moments
- URL: http://arxiv.org/abs/2305.10898v2
- Date: Tue, 13 Jun 2023 12:35:33 GMT
- Title: Estimation Beyond Data Reweighting: Kernel Method of Moments
- Authors: Heiner Kremer, Yassine Nemmour, Bernhard Schölkopf, Jia-Jie Zhu
- Abstract summary: We provide an empirical likelihood estimator based on maximum mean discrepancy, which we term the kernel method of moments (KMM).
We show that our method achieves competitive performance on several conditional moment restriction tasks.
- Score: 9.845144212844662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Moment restrictions and their conditional counterparts emerge in many areas
of machine learning and statistics ranging from causal inference to
reinforcement learning. Estimators for these tasks, generally called methods of
moments, include the prominent generalized method of moments (GMM) which has
recently gained attention in causal inference. GMM is a special case of the
broader family of empirical likelihood estimators which are based on
approximating a population distribution by means of minimizing a
$\varphi$-divergence to an empirical distribution. However, the use of
$\varphi$-divergences effectively limits the candidate distributions to
reweightings of the data samples. We lift this long-standing limitation and
provide a method of moments that goes beyond data reweighting. This is achieved
by defining an empirical likelihood estimator based on maximum mean discrepancy
which we term the kernel method of moments (KMM). We provide a variant of our
estimator for conditional moment restrictions and show that it is
asymptotically first-order optimal for such problems. Finally, we show that our
method achieves competitive performance on several conditional moment
restriction tasks.
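The central quantity behind KMM is the maximum mean discrepancy (MMD) between distributions. As a rough, self-contained illustration of how MMD compares two samples under a Gaussian kernel (an illustrative estimator with hypothetical names, not the authors' implementation), consider:

```python
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y
    under a Gaussian (RBF) kernel with the given bandwidth."""
    def k(a, b):
        # Pairwise squared Euclidean distances, then the RBF kernel.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 1))
y = rng.normal(0.0, 1.0, size=(200, 1))  # same distribution: MMD near 0
z = rng.normal(3.0, 1.0, size=(200, 1))  # shifted distribution: larger MMD
```

Whereas a φ-divergence to the empirical distribution is finite only for reweightings of the observed samples, an MMD like the one above remains well defined between the empirical sample and distributions supported elsewhere, which is what permits candidates beyond data reweighting.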
Related papers
- DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting [14.390842560217743]
We propose a novel approach called DistPred for regression and forecasting tasks.
We transform proper scoring rules that measure the discrepancy between the predicted distribution and the target distribution into a differentiable discrete form.
This allows the model to draw many samples in a single forward pass to estimate the potential distribution of the response variable.
arXiv Detail & Related papers (2024-06-17T10:33:00Z)
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
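Standard quantile regression, as described above, estimates each quantile by minimizing the pinball (check) loss, and an interval is formed from a lower and an upper quantile. A minimal sketch (a textbook construction with hypothetical variable names, not RQR itself):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Pinball (check) loss for quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# A 90% prediction interval from the 0.05- and 0.95-quantile predictions:
y = np.array([1.0, 2.0, 3.0])
lo = np.array([0.5, 1.5, 2.5])   # predicted 0.05-quantiles
hi = np.array([1.5, 2.5, 3.5])   # predicted 0.95-quantiles
coverage = np.mean((y >= lo) & (y <= hi))
```

RQR's point is that tying the interval endpoints to two fixed quantile levels, as done here, is an arbitrary constraint that can be relaxed.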
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- Pseudo-Observations and Super Learner for the Estimation of the Restricted Mean Survival Time [0.0]
We propose a flexible and easy-to-use ensemble algorithm that combines pseudo-observations and super learner.
We complement the predictions obtained from our method with our RMST-adapted risk measure, prediction intervals and variable importance measures.
arXiv Detail & Related papers (2024-04-26T07:38:10Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints (the bias constrained estimator, BCE).
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Minimax Off-Policy Evaluation for Multi-Armed Bandits [58.7013651350436]
We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards.
We develop minimax rate-optimal procedures under three settings.
arXiv Detail & Related papers (2021-01-19T18:55:29Z)
- The Variational Method of Moments [65.91730154730905]
The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables.
Motivated by a variational minimax reformulation of optimally weighted GMM (OWGMM), we define a very general class of estimators for the conditional moment problem.
We provide algorithms for valid statistical inference based on the same kind of variational reformulations.
arXiv Detail & Related papers (2020-12-17T07:21:06Z)
- Asymptotics of the Empirical Bootstrap Method Beyond Asymptotic Normality [25.402400996745058]
We show that the limiting distribution of the empirical bootstrap estimator is consistent under stability conditions.
We propose three alternative ways to use the bootstrap method to build confidence intervals with coverage guarantees.
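For context, the plain percentile bootstrap confidence interval that such results build on resamples the data with replacement and takes quantiles of the resampled statistic. A minimal sketch (a textbook construction with hypothetical names, not the paper's proposed variants):

```python
import numpy as np

def percentile_bootstrap_ci(data, stat=np.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample with replacement, then take the
    alpha/2 and 1 - alpha/2 quantiles of the resampled statistic."""
    rng = np.random.default_rng(seed)
    boots = np.array([
        stat(rng.choice(data, size=len(data), replace=True))
        for _ in range(n_boot)
    ])
    return np.quantile(boots, alpha / 2), np.quantile(boots, 1 - alpha / 2)

data = np.random.default_rng(1).normal(10.0, 2.0, size=500)
lo, hi = percentile_bootstrap_ci(data)  # interval around the sample mean
```

The coverage guarantees studied in the paper concern regimes where the classical normal approximation underlying intervals like this one breaks down.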
arXiv Detail & Related papers (2020-11-23T07:14:30Z)
- Optimal Off-Policy Evaluation from Multiple Logging Policies [77.62012545592233]
We study off-policy evaluation from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.
We derive the OPE estimator for multiple loggers with minimum variance for any instance, i.e., the efficient estimator.
arXiv Detail & Related papers (2020-10-21T13:43:48Z)
- Distributionally Robust Parametric Maximum Likelihood Estimation [13.09499764232737]
We propose a distributionally robust maximum likelihood estimator that minimizes the worst-case expected log-loss uniformly over a parametric nominal distribution.
Our novel robust estimator also enjoys statistical consistency and delivers promising empirical results in both regression and classification tasks.
arXiv Detail & Related papers (2020-10-11T19:05:49Z)
- Robust subgaussian estimation with VC-dimension [0.0]
This work proposes a new general way to bound the excess risk of median-of-means (MOM) estimators.
The core technique is the use of VC-dimension (instead of Rademacher complexity) to measure the statistical complexity.
arXiv Detail & Related papers (2020-04-24T13:21:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.