Nearest Neighbor Matching as Least Squares Density Ratio Estimation and Riesz Regression
- URL: http://arxiv.org/abs/2510.24433v1
- Date: Tue, 28 Oct 2025 14:01:51 GMT
- Title: Nearest Neighbor Matching as Least Squares Density Ratio Estimation and Riesz Regression
- Authors: Masahiro Kato
- Abstract summary: Nearest Neighbor (NN) matching can be interpreted as an instance of Riesz regression for automatic debiased machine learning. Lin et al. (2023) show that NN matching is an instance of density-ratio estimation with their new density-ratio estimator. Chernozhukov et al. (2024) develop Riesz regression for automatic debiased machine learning.
- Score: 6.44705221140412
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This study proves that Nearest Neighbor (NN) matching can be interpreted as an instance of Riesz regression for automatic debiased machine learning. Lin et al. (2023) show that NN matching is an instance of density-ratio estimation with their new density-ratio estimator. Chernozhukov et al. (2024) develop Riesz regression for automatic debiased machine learning, which directly estimates the Riesz representer (or, equivalently, the bias-correction term) by minimizing the mean squared error. In this study, we first prove that the density-ratio estimation method proposed in Lin et al. (2023) is essentially equivalent to Least-Squares Importance Fitting (LSIF), proposed in Kanamori et al. (2009) for direct density-ratio estimation. Furthermore, we derive Riesz regression using the LSIF framework. Based on these results, we derive NN matching from Riesz regression. This study is based on our works Kato (2025a) and Kato (2025b).
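To make the connection in the abstract concrete, here is a minimal sketch of Least-Squares Importance Fitting (LSIF, Kanamori et al., 2009) for a linear-in-features model. LSIF estimates the density ratio r(x) = p(x)/q(x) by minimizing (1/2) E_q[r(x)^2] - E_p[r(x)], which is the same least-squares objective that Riesz regression minimizes for the Riesz representer. The function names and the feature map below are illustrative assumptions, not code from any of the papers listed here.

```python
import numpy as np

def lsif_linear(X_num, X_den, feats, lam=1e-3):
    """Sketch of LSIF for a linear-in-features model r(x) = feats(x) @ alpha.

    Minimizes (1/2) E_q[r(x)^2] - E_p[r(x)] + (lam/2) ||alpha||^2,
    where p is the numerator density (samples X_num) and q the
    denominator density (samples X_den). The closed-form solution is
    alpha = (H + lam I)^{-1} h with H ~ E_q[phi phi'] and h ~ E_p[phi].
    """
    Phi_num = feats(X_num)                      # features under p
    Phi_den = feats(X_den)                      # features under q
    H = Phi_den.T @ Phi_den / len(X_den)        # empirical E_q[phi phi']
    h = Phi_num.mean(axis=0)                    # empirical E_p[phi]
    alpha = np.linalg.solve(H + lam * np.eye(len(h)), h)
    return lambda x: feats(x) @ alpha           # estimated ratio r(x)
```

A sanity check: if the numerator and denominator samples come from the same distribution, the true ratio is identically 1, and with a constant feature included the estimator recovers it (up to the small regularization shrinkage).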
Related papers
- Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning [6.44705221140412]
Estimating the Riesz representer is central to machine learning for causal and structural parameter estimation. We propose a unified framework that estimates the Riesz representer by fitting a representer model via Bregman divergence minimization.
arXiv Detail & Related papers (2026-01-12T17:36:33Z) - Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback [50.89125374999765]
We provide the first convergence guarantee for Optimistic Multiplicative Weights Update ($\mathtt{OMWU}$) in NLHF. Our analysis identifies a novel marginal convergence behavior, where the probability of rarely played actions grows exponentially from exponentially small values.
arXiv Detail & Related papers (2025-12-31T12:08:29Z) - Riesz Regression As Direct Density Ratio Estimation [6.44705221140412]
This study shows that Riesz regression is closely related to direct density-ratio estimation (DRE) in important cases. Specifically, the idea and objective in Riesz regression coincide with those of least-squares importance fitting in DRE.
arXiv Detail & Related papers (2025-11-06T17:25:05Z) - Learning density ratios in causal inference using Bregman-Riesz regression [0.0]
Naively estimating the numerator and denominator densities separately using kernel density estimators can lead to unstable performance. Several methods have been developed for estimating the density ratio directly, based on (a) Bregman divergences or (b) recasting the density ratio as the odds. In this paper we show that these methods can be unified in a common framework, which we call Bregman-Riesz regression.
arXiv Detail & Related papers (2025-10-17T18:10:41Z) - DDPM Score Matching and Distribution Learning [24.341062891949953]
Score estimation is the backbone of score-based generative models (SGMs). This paper introduces a framework that reduces score estimation to tasks of parameter and density estimation. We provide minimax rates for density estimation over Hölder classes and a quasi-polynomial PAC density estimation algorithm.
arXiv Detail & Related papers (2025-04-07T15:07:19Z) - Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We analyze the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We demonstrate that in this setting, the generalized cross-validation estimator (GCV) fails to correctly predict the out-of-sample risk. We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Sample Complexity Bounds for Score-Matching: Causal Discovery and Generative Modeling [82.36856860383291]
We demonstrate that accurate estimation of the score function is achievable by training a standard deep ReLU neural network.
We establish bounds on the error rate of recovering causal relationships using the score-matching-based causal discovery method.
arXiv Detail & Related papers (2023-10-27T13:09:56Z) - Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the IPW-learner's risk converges to zero if the propensity score is known.
arXiv Detail & Related papers (2022-02-10T18:51:52Z) - Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression [35.78863301525758]
We study a localized notion of uniform convergence known as an "optimistic rate".
Our refined analysis avoids the hidden constant and logarithmic factor in existing results.
arXiv Detail & Related papers (2021-12-08T18:55:00Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The accuracy of this information is not guaranteed, and the site assumes no responsibility for any consequences of its use.