Direct Debiased Machine Learning via Bregman Divergence Minimization
- URL: http://arxiv.org/abs/2510.23534v2
- Date: Thu, 30 Oct 2025 17:55:38 GMT
- Title: Direct Debiased Machine Learning via Bregman Divergence Minimization
- Authors: Masahiro Kato
- Abstract summary: We develop a direct debiased machine learning framework with an end-to-end algorithm. We formulate estimation of the nuisance parameters: the regression function and the Riesz representer. Neyman targeted estimation includes Riesz representer estimation, and we measure discrepancies using the Bregman divergence.
- Score: 6.44705221140412
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We develop a direct debiased machine learning framework comprising Neyman targeted estimation and generalized Riesz regression. Our framework unifies Riesz regression for automatic debiased machine learning, covariate balancing, targeted maximum likelihood estimation (TMLE), and density-ratio estimation. In many problems involving causal effects or structural models, the parameters of interest depend on regression functions. Plugging regression functions estimated by machine learning methods into the identifying equations can yield poor performance because of first-stage bias. To reduce such bias, debiased machine learning employs Neyman orthogonal estimating equations. Debiased machine learning typically requires estimation of the Riesz representer and the regression function. For this problem, we develop a direct debiased machine learning framework with an end-to-end algorithm. We formulate estimation of the nuisance parameters, the regression function and the Riesz representer, as minimizing the discrepancy between Neyman orthogonal scores computed with known and unknown nuisance parameters, which we refer to as Neyman targeted estimation. Neyman targeted estimation includes Riesz representer estimation, and we measure discrepancies using the Bregman divergence. The Bregman divergence encompasses various loss functions as special cases, where the squared loss yields Riesz regression and the Kullback-Leibler divergence yields entropy balancing. We refer to this Riesz representer estimation as generalized Riesz regression. Neyman targeted estimation also yields TMLE as a special case for regression function estimation. Furthermore, for specific pairs of models and Riesz representer estimation methods, we can automatically obtain the covariate balancing property without explicitly solving the covariate balancing objective.
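The squared-loss special case mentioned in the abstract (Riesz regression) admits a closed-form estimator for linear-in-parameters models. The following is a minimal numpy sketch, not the paper's implementation, for the ATE functional with m(W; alpha) = alpha(1, X) - alpha(0, X) and loss E[alpha(W)^2 - 2 m(W; alpha)]; the function names and the feature basis are illustrative assumptions:

```python
import numpy as np

def riesz_features(d, x):
    """Illustrative linear-in-treatment basis phi(d, x)."""
    return np.column_stack([d, 1.0 - d, d * x, (1.0 - d) * x])

def fit_riesz_ate(d, x):
    """Riesz regression with squared loss for the ATE functional.

    Minimizes (1/n) sum_i [ alpha(W_i)^2 - 2 m(W_i; alpha) ] with
    m(W; alpha) = alpha(1, X) - alpha(0, X); for alpha(d, x) = theta' phi(d, x)
    the objective is quadratic in theta and solved in closed form.
    """
    phi = riesz_features(d, x)
    G = phi.T @ phi / len(d)  # (1/n) sum_i phi_i phi_i'
    ones, zeros = np.ones_like(d), np.zeros_like(d)
    g = (riesz_features(ones, x) - riesz_features(zeros, x)).mean(axis=0)
    theta = np.linalg.solve(G + 1e-8 * np.eye(G.shape[0]), g)
    return lambda dd, xx: riesz_features(dd, xx) @ theta

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
d = rng.binomial(1, 0.5, size=n).astype(float)  # constant propensity e(x) = 0.5
alpha_hat = fit_riesz_ate(d, x)
# True Riesz representer here: d/e(x) - (1-d)/(1-e(x)) = 2d - 2(1-d)
print(alpha_hat(np.ones(1), np.zeros(1)))   # should be close to 2
print(alpha_hat(np.zeros(1), np.zeros(1)))  # should be close to -2
```

Because D is drawn with constant probability 0.5 here, the true representer 2D - 2(1 - D) lies in the span of the basis, so the fit recovers it approximately.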
Related papers
- genriesz: A Python Package for Automatic Debiased Machine Learning with Generalized Riesz Regression [6.44705221140412]
We present genriesz, an open-source Python package that implements automatic DML and generalized Riesz regression. genriesz automatically constructs a compatible link function so that the generalized Riesz regression estimator satisfies balancing (moment-matching) optimality conditions.
arXiv Detail & Related papers (2026-02-19T16:58:40Z) - Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning [6.44705221140412]
Estimating the Riesz representer is central to machine learning for causal and structural parameter estimation. We propose a unified framework that estimates the Riesz representer by fitting a representer model via Bregman divergence minimization.
arXiv Detail & Related papers (2026-01-12T17:36:33Z) - Riesz Regression As Direct Density Ratio Estimation [6.44705221140412]
This study shows that Riesz regression is closely related to direct density-ratio estimation (DRE) in important cases. Specifically, the idea and objective in Riesz regression coincide with those of least-squares importance fitting in DRE.
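As a sketch of the connection noted above, least-squares importance fitting estimates a density ratio by minimizing a quadratic objective of the same form as Riesz regression. The code below is a minimal numpy version under illustrative assumptions (function names and the polynomial basis are my choices, not the paper's):

```python
import numpy as np

def lsif_fit(x_nu, x_de, basis):
    """Least-squares importance fitting: estimate r(x) = p_nu(x) / p_de(x)
    by minimizing (1/2) E_de[r_theta(x)^2] - E_nu[r_theta(x)], the squared-loss
    Bregman objective, which has a closed-form solution for
    linear-in-parameters models r_theta(x) = theta' phi(x)."""
    H = basis(x_de).T @ basis(x_de) / len(x_de)  # (1/n_de) sum phi phi'
    h = basis(x_nu).mean(axis=0)                 # (1/n_nu) sum phi
    theta = np.linalg.solve(H + 1e-8 * np.eye(len(h)), h)
    return lambda x: basis(x) @ theta

basis = lambda x: np.column_stack([np.ones_like(x), x, x**2])
rng = np.random.default_rng(1)
x_de = rng.normal(0.0, 1.0, size=20000)  # denominator sample
x_nu = rng.normal(0.5, 1.0, size=20000)  # numerator sample
r_hat = lsif_fit(x_nu, x_de, basis)
# The true ratio exp(0.5x - 0.125) is increasing, so the fit should be too
print(r_hat(np.array([1.0])), r_hat(np.array([-1.0])))
```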
arXiv Detail & Related papers (2025-11-06T17:25:05Z) - A Unified Theory for Causal Inference: Direct Debiased Machine Learning via Bregman-Riesz Regression [6.44705221140412]
This note introduces a unified theory for causal inference that integrates Riesz regression, covariate balancing, density-ratio estimation (DRE), and the matching estimator in average treatment effect (ATE) estimation. In ATE estimation, the balancing weights and the regression functions of the outcome play important roles, where the balancing weights are referred to as the Riesz representer.
arXiv Detail & Related papers (2025-10-30T17:56:47Z) - Uncertainty Quantification for Regression using Proper Scoring Rules [76.24649098854219]
We introduce a unified UQ framework for regression based on proper scoring rules, such as CRPS, logarithmic, squared error, and quadratic scores. We derive closed-form expressions for the uncertainty measures under practical parametric assumptions and show how to estimate them using ensembles of models. Our broad evaluation on synthetic and real-world regression datasets provides guidance for selecting reliable UQ measures.
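For a Gaussian predictive distribution, the CRPS mentioned above has a well-known closed form. A minimal sketch (the function name is an illustrative choice):

```python
import math

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian predictive N(mu, sigma^2) at observation y:
    CRPS = sigma * ( z*(2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi) ),  z = (y - mu)/sigma,
    where phi and Phi are the standard normal pdf and cdf."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# Lower CRPS is better: a sharper forecast centered on the truth scores lower
print(crps_gaussian(0.0, 0.5, 0.0), crps_gaussian(0.0, 1.0, 0.0))
```

At y = mu the expression reduces to sigma * (sqrt(2) - 1) / sqrt(pi), so the score scales linearly with the predictive spread.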
arXiv Detail & Related papers (2025-09-30T17:52:12Z) - Model-free Online Learning for the Kalman Filter: Forgetting Factor and Logarithmic Regret [2.313314525234138]
We consider the problem of online prediction for an unknown, non-explosive linear system. With a known system model, the optimal predictor is the celebrated Kalman filter. We tackle this problem by injecting an inductive bias into the regression model via exponential forgetting.
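One standard way to realize exponential forgetting in an online regression model, in the spirit of the summary above, is recursive least squares with a forgetting factor. A minimal sketch under a linear-model assumption (the function name and hyperparameters are illustrative, not the paper's algorithm):

```python
import numpy as np

def rls_forgetting(phi_seq, y_seq, lam=0.98, delta=100.0):
    """Recursive least squares with forgetting factor lam: at step t it
    minimizes sum_s lam^(t-s) * (y_s - theta' phi_s)^2, so old samples are
    exponentially down-weighted. Returns one-step-ahead predictions and the
    final parameter estimate."""
    p = phi_seq.shape[1]
    theta = np.zeros(p)
    P = delta * np.eye(p)  # large initial covariance = weak prior
    preds = []
    for phi, y in zip(phi_seq, y_seq):
        preds.append(theta @ phi)              # predict before updating
        k = P @ phi / (lam + phi @ P @ phi)    # gain vector
        theta = theta + k * (y - theta @ phi)  # parameter update
        P = (P - np.outer(k, phi @ P)) / lam   # covariance update
    return np.array(preds), theta

rng = np.random.default_rng(2)
phi = rng.normal(size=(2000, 2))
theta_true = np.array([1.0, -2.0])
y = phi @ theta_true + 0.1 * rng.normal(size=2000)
_, theta_hat = rls_forgetting(phi, y)
print(theta_hat)  # should be close to [1., -2.]
```

With lam = 0.98 the effective memory is roughly 1/(1 - lam) = 50 samples, trading tracking speed for estimation variance.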
arXiv Detail & Related papers (2025-05-13T21:49:56Z) - RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form. We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z) - Automatic debiasing of neural networks via moment-constrained learning [0.0]
Naively learning the regression function and taking a sample mean of the target functional results in biased estimators. We propose moment-constrained learning as a new RR learning approach that addresses some shortcomings in automatic debiasing.
arXiv Detail & Related papers (2024-09-29T20:56:54Z) - Beyond the Norms: Detecting Prediction Errors in Regression Models [26.178065248948773]
This paper tackles the challenge of detecting unreliable behavior in regression algorithms.
We introduce the notion of unreliability in regression, which arises when the regressor's output exceeds a specified discrepancy (or error) threshold.
We show empirical improvements in error detection for multiple regression tasks, consistently outperforming popular baseline approaches.
arXiv Detail & Related papers (2024-06-11T05:51:44Z) - Value-Distributional Model-Based Reinforcement Learning [59.758009422067]
Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks.
We study the problem from a model-based Bayesian reinforcement learning perspective.
We propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function.
arXiv Detail & Related papers (2023-08-12T14:59:19Z) - Transfer Learning with Random Coefficient Ridge Regression [2.0813318162800707]
Ridge regression with random coefficients provides an important alternative to fixed-coefficient regression in the high-dimensional setting.
This paper considers estimation and prediction of random coefficient ridge regression in the setting of transfer learning.
arXiv Detail & Related papers (2023-06-28T04:36:37Z) - Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces [52.35063796758121]
We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system.
We link the risk with the estimation of the spectral decomposition of the Koopman operator.
Our results suggest that reduced-rank regression (RRR) might be beneficial over other widely used estimators.
arXiv Detail & Related papers (2022-05-27T14:57:48Z) - Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.