Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning
- URL: http://arxiv.org/abs/2601.07752v2
- Date: Thu, 15 Jan 2026 17:55:37 GMT
- Title: Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning
- Authors: Masahiro Kato
- Abstract summary: Estimating the Riesz representer is central to machine learning for causal and structural parameter estimation. We propose a unified framework that estimates the Riesz representer by fitting a representer model via Bregman divergence minimization.
- Score: 6.44705221140412
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Estimating the Riesz representer is central to debiased machine learning for causal and structural parameter estimation. We propose generalized Riesz regression, a unified framework that estimates the Riesz representer by fitting a representer model via Bregman divergence minimization. This framework includes the squared loss and the Kullback--Leibler (KL) divergence as special cases: the former recovers Riesz regression, while the latter recovers tailored loss minimization. Under suitable model specifications, the dual problems correspond to covariate balancing, which we call automatic covariate balancing. Moreover, under the same specifications, outcome averages weighted by the estimated Riesz representer satisfy Neyman orthogonality even without estimating the regression function, a property we call automatic Neyman orthogonalization. This property not only reduces the estimation error of Neyman orthogonal scores but also clarifies a key distinction between debiased machine learning and targeted maximum likelihood estimation. Our framework can also be viewed as a generalization of density ratio fitting under Bregman divergences to Riesz representer estimation, and it applies beyond density ratio estimation. We provide convergence analyses for both reproducing kernel Hilbert space (RKHS) and neural network model classes. A Python package for generalized Riesz regression is available at https://github.com/MasaKat0/grr.
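As a concrete illustration of the squared-loss special case (Riesz regression), the representer for the ATE functional $m(W;\rho)=\rho(1,W)-\rho(0,W)$ can be fit by minimizing the empirical loss $\frac{1}{n}\sum_i[\rho(X_i)^2 - 2\,m(W_i;\rho)]$, which has a closed form for a linear-in-basis model. The sketch below is illustrative only: the basis functions, simulation design, and direct solve are our assumptions, not the grr package's API.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
W = rng.normal(size=n)                            # covariate
e = 1.0 / (1.0 + np.exp(-W))                      # propensity score e(W)
D = rng.binomial(1, e)                            # binary treatment
Y = 2.0 * D + W + rng.normal(size=n)              # outcome; true ATE = 2

def basis(d, w):
    # hand-picked basis functions in (d, w); an illustrative choice
    return np.column_stack([np.ones_like(w), d, w, d * w, w**2, d * w**2])

B = basis(D, W)                                    # b(D_i, W_i)
M = basis(np.ones(n), W) - basis(np.zeros(n), W)   # ATE functional applied to the basis
G = B.T @ B / n
h = M.mean(axis=0)
theta = np.linalg.solve(G, h)    # argmin of (1/n) sum[(b'theta)^2 - 2 m(W; b'theta)]
alpha_hat = B @ theta            # estimated Riesz representer at the sample points
ate_hat = np.mean(alpha_hat * Y) # representer-weighted outcome average
```

The first-order condition $G\theta = h$ is the moment-matching property the abstract calls automatic covariate balancing: representer-weighted sample averages of any function in the basis span equal the averaged functional applied to that function.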
Related papers
- genriesz: A Python Package for Automatic Debiased Machine Learning with Generalized Riesz Regression [6.44705221140412]
We present genriesz, an open-source Python package that implements automatic DML and generalized Riesz regression. genriesz automatically constructs a compatible link function so that the generalized Riesz regression estimator satisfies balancing (moment-matching) optimality conditions.
arXiv Detail & Related papers (2026-02-19T16:58:40Z) - Riesz Regression As Direct Density Ratio Estimation [6.44705221140412]
This study shows that Riesz regression is closely related to direct density-ratio estimation (DRE) in important cases. Specifically, the idea and objective of Riesz regression coincide with those of least-squares importance fitting in DRE.
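The connection can be made concrete: least-squares importance fitting estimates a ratio $r = p_{\mathrm{num}}/p_{\mathrm{den}}$ by minimizing $\tfrac12\,\mathbb{E}_{\mathrm{den}}[\hat r^2] - \mathbb{E}_{\mathrm{num}}[\hat r]$, the squared-loss objective with the numerator expectation playing the role of the linear functional. A toy sketch, where the Gaussian-kernel basis and regularization are illustrative choices rather than the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
x_den = rng.normal(0.0, 1.0, size=2000)   # denominator sample ~ N(0, 1)
x_num = rng.normal(0.5, 1.0, size=2000)   # numerator sample   ~ N(0.5, 1)

def phi(x):
    # Gaussian bumps at fixed centers as a simple linear-in-parameters model
    centers = np.linspace(-3.0, 3.0, 15)
    return np.exp(-0.5 * (x[:, None] - centers[None, :]) ** 2)

H = phi(x_den).T @ phi(x_den) / len(x_den)   # (1/n) sum phi phi' over denominator
h = phi(x_num).mean(axis=0)                  # mean phi over numerator
lam = 1e-3                                   # small ridge penalty for stability
theta = np.linalg.solve(H + lam * np.eye(len(h)), h)
r_hat = phi(x_den) @ theta                   # estimated ratio at denominator points
```

As in Riesz regression, the linear-in-parameters model makes the minimizer available in closed form from two sample moments.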
arXiv Detail & Related papers (2025-11-06T17:25:05Z) - A Unified Theory for Causal Inference: Direct Debiased Machine Learning via Bregman-Riesz Regression [6.44705221140412]
This note introduces a unified theory for causal inference that integrates Riesz regression, covariate balancing, density-ratio estimation (DRE), and the matching estimator in average treatment effect (ATE) estimation. In ATE estimation, the balancing weights and the regression functions of the outcome play important roles, where the balancing weights are referred to as the Riesz representer.
arXiv Detail & Related papers (2025-10-30T17:56:47Z) - Direct Debiased Machine Learning via Bregman Divergence Minimization [6.44705221140412]
We develop a direct debiased machine learning framework with an end-to-end algorithm. We formulate estimation of the nuisance parameters, namely the regression function and the Riesz representer. Neyman targeted estimation includes Riesz representer estimation, and we measure discrepancies using the Bregman divergence.
arXiv Detail & Related papers (2025-10-27T17:10:43Z) - Bayesian Double Descent [0.0]
We show that deep neural networks have a re-descending property in their risk function. As the complexity of the model increases, the risk exhibits a U-shaped region. When the number of parameters equals the number of observations, the risk can be unbounded; as complexity grows further, it re-descends.
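The re-descending risk curve can be reproduced even with min-norm least squares on random features. The design below (dimensions, noise level, medians over replications) is an illustrative choice, not the paper's construction:

```python
import numpy as np

n, n_test, d = 40, 2000, 200
rng = np.random.default_rng(0)
beta = rng.normal(size=d) / np.sqrt(d)   # true coefficients, ||beta||^2 ~ 1
Xt = rng.normal(size=(n_test, d))
yt = Xt @ beta                           # noiseless test targets

def median_risk(p, reps=20):
    """Median test risk of min-norm least squares using the first p features."""
    errs = []
    for r in range(reps):
        rs = np.random.default_rng(100 + r)
        X = rs.normal(size=(n, d))
        y = X @ beta + 0.5 * rs.normal(size=n)
        # lstsq returns the minimum-norm solution when p > n
        w, *_ = np.linalg.lstsq(X[:, :p], y, rcond=None)
        errs.append(np.mean((Xt[:, :p] @ w - yt) ** 2))
    return float(np.median(errs))

# under-parameterized, critically parameterized (p = n), over-parameterized
risks = {p: median_risk(p) for p in (10, 40, 200)}
```

The risk peaks where the number of parameters equals the number of observations (p = n = 40) and re-descends in the overparameterized regime (p = 200).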
arXiv Detail & Related papers (2025-07-09T23:47:26Z) - Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [53.03951222945921]
We analyze smoothed (perturbed) policies, adding controlled random perturbations to the direction used by the linear oracle. Our main contribution is a generalization bound that decomposes the excess risk into perturbation bias, statistical estimation error, and optimization error. We illustrate the scope of the results on applications such as vehicle scheduling, highlighting how smoothing enables both tractable training and controlled generalization.
arXiv Detail & Related papers (2024-07-24T12:00:30Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate. We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs. We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes this arbitrary constraint. We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Policy Evaluation in Distributional LQR [70.63903506291383]
We provide a closed-form expression of the distribution of the random return.
We show that this distribution can be approximated by a finite number of random variables.
Using the approximate return distribution, we propose a zeroth-order policy gradient algorithm for risk-averse LQR.
arXiv Detail & Related papers (2023-03-23T20:27:40Z) - Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces [52.35063796758121]
We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system.
We link the risk with the estimation of the spectral decomposition of the Koopman operator.
Our results suggest RRR might be beneficial over other widely used estimators.
arXiv Detail & Related papers (2022-05-27T14:57:48Z) - Optimally tackling covariate shift in RKHS-based nonparametric regression [43.457497490211985]
We show that a kernel ridge regression estimator with a carefully chosen regularization parameter is minimax rate-optimal.
We also show that a naive estimator, which minimizes the empirical risk over the function class, is strictly sub-optimal.
We propose a reweighted KRR estimator that weights samples based on a careful truncation of the likelihood ratios.
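A likelihood-ratio-reweighted kernel ridge regression can be sketched in a few lines. The bandwidth, penalty, truncation cap, and simulation below are illustrative assumptions, not the paper's exact estimator or tuning:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x_tr = rng.normal(0.0, 1.0, size=n)              # training covariates ~ p = N(0, 1)
y_tr = np.sin(x_tr) + 0.1 * rng.normal(size=n)   # noisy observations of sin(x)

def k(a, b, h=0.5):
    # Gaussian kernel with bandwidth h
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / h**2)

# likelihood ratio q/p for a shifted test distribution q = N(1, 1),
# truncated at a cap to control variance
w = np.minimum(np.exp(x_tr - 0.5), 10.0)
lam = 1e-2
K = k(x_tr, x_tr)
# weighted KRR normal equations: (W K + n*lam*I) alpha = W y
alpha = np.linalg.solve(np.diag(w) @ K + n * lam * np.eye(n), w * y_tr)

def f_hat(x):
    return k(x, x_tr) @ alpha
```

The weights upweight training points that are likely under the test distribution; the truncation cap mirrors the careful likelihood-ratio truncation the summary mentions.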
arXiv Detail & Related papers (2022-05-06T02:33:24Z) - Random Forest Weighted Local Fréchet Regression with Random Objects [18.128663071848923]
We propose a novel random forest weighted local Fréchet regression paradigm. Our first method uses these weights as the local average to solve the conditional Fréchet mean. The second method performs local linear Fréchet regression; both significantly improve on existing Fréchet regression methods.
arXiv Detail & Related papers (2022-02-10T09:10:59Z) - Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.
We establish non-asymptotic bounds for both the operator defect and the estimation error.
arXiv Detail & Related papers (2022-01-21T02:46:57Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective which is simple to optimize and prevents posterior collapse.
We show that the Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
arXiv Detail & Related papers (2020-04-30T13:27:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.