A Unified Theory for Causal Inference: Direct Debiased Machine Learning via Bregman-Riesz Regression
- URL: http://arxiv.org/abs/2510.26783v1
- Date: Thu, 30 Oct 2025 17:56:47 GMT
- Title: A Unified Theory for Causal Inference: Direct Debiased Machine Learning via Bregman-Riesz Regression
- Authors: Masahiro Kato
- Abstract summary: This note introduces a unified theory for causal inference that integrates Riesz regression, covariate balancing, density-ratio estimation (DRE), and the matching estimator in average treatment effect (ATE) estimation. In ATE estimation, the balancing weights and the regression functions of the outcome play important roles, where the balancing weights are referred to as the Riesz representer.
- Score: 6.44705221140412
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This note introduces a unified theory for causal inference that integrates Riesz regression, covariate balancing, density-ratio estimation (DRE), targeted maximum likelihood estimation (TMLE), and the matching estimator in average treatment effect (ATE) estimation. In ATE estimation, the balancing weights and the regression functions of the outcome play important roles, where the balancing weights are referred to as the Riesz representer, bias-correction term, and clever covariates, depending on the context. Riesz regression, covariate balancing, DRE, and the matching estimator are methods for estimating the balancing weights, where Riesz regression is essentially equivalent to DRE in the ATE context, the matching estimator is a special case of DRE, and DRE is in a dual relationship with covariate balancing. TMLE is a method for constructing regression function estimators such that the leading bias term becomes zero. Nearest Neighbor Matching is equivalent to Least Squares Density Ratio Estimation and Riesz Regression.
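The balancing-weight view described in the abstract can be sketched numerically. The following is a minimal, self-contained illustration, not the paper's algorithm: nuisance functions are estimated by simple stratum means in place of machine-learning regressions, the Riesz representer for the ATE takes its known closed form alpha(D, X) = D/e(X) - (1-D)/(1-e(X)), and the debiased (AIPW-style) estimator combines the outcome regressions with the bias-correction term. All data and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
X = rng.integers(0, 2, n)               # single binary covariate
e = np.where(X == 1, 0.7, 0.3)          # true propensity P(D=1|X)
D = rng.binomial(1, e)
Y = 2.0 * D + X + rng.normal(0, 1, n)   # true ATE = 2.0

# Nuisance estimates by stratum (stand-ins for ML regressions)
e_hat = np.array([D[X == x].mean() for x in (0, 1)])[X]
mu1 = np.array([Y[(X == x) & (D == 1)].mean() for x in (0, 1)])[X]
mu0 = np.array([Y[(X == x) & (D == 0)].mean() for x in (0, 1)])[X]

# Riesz representer (balancing weights) for the ATE functional
alpha = D / e_hat - (1 - D) / (1 - e_hat)

# Debiased estimator: plug-in term + bias-correction term
ate_hat = np.mean(mu1 - mu0 + alpha * (Y - np.where(D == 1, mu1, mu0)))
print(round(ate_hat, 2))  # close to the true ATE of 2.0
```

The bias-correction term `alpha * (Y - fitted)` is exactly the role the abstract assigns to the balancing weights: it cancels the leading bias of the plug-in regression estimate.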
Related papers
- Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning [6.44705221140412]
Estimating the Riesz representer is central to machine learning for causal and structural parameter estimation. We propose a unified framework that estimates the Riesz representer by fitting a representer model via Bregman divergence minimization.
arXiv Detail & Related papers (2026-01-12T17:36:33Z)
- ScoreMatchingRiesz: Auto-DML with Infinitesimal Classification [6.44705221140412]
The Riesz representer is a key component in machine learning for constructing $\sqrt{n}$-consistent and efficient estimators. We extend score-matching-based DRE methods to Riesz representer estimation.
arXiv Detail & Related papers (2025-12-23T17:14:14Z)
- Riesz Regression As Direct Density Ratio Estimation [6.44705221140412]
This study shows that Riesz regression is closely related to direct density-ratio estimation (DRE) in important cases. Specifically, the idea and objective of Riesz regression coincide with those of least-squares importance fitting in DRE.
arXiv Detail & Related papers (2025-11-06T17:25:05Z)
- Direct Debiased Machine Learning via Bregman Divergence Minimization [6.44705221140412]
We develop a direct debiased machine learning framework with an end-to-end algorithm. We formulate estimation of the nuisance parameters: the regression function and the Riesz representer. Neyman targeted estimation includes Riesz representer estimation, and we measure discrepancies using the Bregman divergence.
arXiv Detail & Related papers (2025-10-27T17:10:43Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We analyze the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We demonstrate that in this setting, the generalized cross-validation (GCV) estimator fails to correctly predict the out-of-sample risk. We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate. We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs. We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes this arbitrary constraint. We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- TIC-TAC: A Framework for Improved Covariance Estimation in Deep Heteroscedastic Regression [109.69084997173196]
Deep heteroscedastic regression involves jointly optimizing the mean and covariance of the predicted distribution using the negative log-likelihood.
Recent works show that this may result in sub-optimal convergence due to the challenges associated with covariance estimation.
We study, among other questions, whether the predicted covariance truly captures the randomness of the predicted mean.
Our results show that not only does TIC accurately learn the covariance, it additionally facilitates an improved convergence of the negative log-likelihood.
arXiv Detail & Related papers (2023-10-29T09:54:03Z)
- Double Debiased Covariate Shift Adaptation Robust to Density-Ratio Estimation [7.8856737627153874]
We propose a doubly robust estimator for covariate shift adaptation via importance weighting.
Our estimator reduces the bias arising from the density ratio estimation errors.
Notably, our estimator remains consistent if either the density ratio estimator or the regression function is consistent.
arXiv Detail & Related papers (2023-10-25T13:38:29Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z)
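The main abstract's identification of Riesz regression with least-squares density-ratio estimation (least-squares importance fitting, LSIF) can be made concrete on a toy discrete example. Everything below (the binary distributions, the indicator basis, the sample size) is an illustrative assumption; the point is only that the LSIF objective, (1/2) E_tr[r(x)^2] - E_te[r(x)], has a closed-form minimizer over linear models that recovers the density ratio.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Numerator ("test") and denominator ("train") samples over {0, 1}
x_tr = rng.binomial(1, 0.3, n)   # p_tr(1) = 0.3
x_te = rng.binomial(1, 0.6, n)   # p_te(1) = 0.6

# Linear model r(x) = theta[0]*1{x=0} + theta[1]*1{x=1}
def phi(x):
    return np.stack([x == 0, x == 1], axis=1).astype(float)

# LSIF objective: (1/2) E_tr[r(x)^2] - E_te[r(x)]; closed-form minimizer
H = phi(x_tr).T @ phi(x_tr) / n   # E_tr[phi phi^T]
h = phi(x_te).mean(axis=0)        # E_te[phi]
theta = np.linalg.solve(H, h)

print(np.round(theta, 2))  # true ratios: p_te(0)/p_tr(0) = 4/7, p_te(1)/p_tr(1) = 2.0
```

The same quadratic-minus-linear loss structure appears in Riesz regression, with the linear term replaced by the target functional applied to the model; this is the equivalence the abstract states.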
This list is automatically generated from the titles and abstracts of the papers in this site.