Nonparametric Additive Value Functions: Interpretable Reinforcement Learning with an Application to Surgical Recovery
- URL: http://arxiv.org/abs/2308.13135v2
- Date: Fri, 30 May 2025 14:29:27 GMT
- Title: Nonparametric Additive Value Functions: Interpretable Reinforcement Learning with an Application to Surgical Recovery
- Authors: Patrick Emedom-Nnamdi, Timothy R. Smith, Jukka-Pekka Onnela, Junwei Lu
- Abstract summary: We propose a nonparametric additive model for estimating interpretable value functions in reinforcement learning. This method bridges the gap between flexible machine learning techniques and the interpretability required in healthcare applications.
- Score: 8.890206493793878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a nonparametric additive model for estimating interpretable value functions in reinforcement learning, with an application in optimizing postoperative recovery through personalized, adaptive recommendations. While reinforcement learning has achieved significant success in various domains, recent methods often rely on black-box approaches such as neural networks, which hinder the examination of individual feature contributions to a decision-making policy. Our novel method offers a flexible technique for estimating action-value functions without explicit parametric assumptions, overcoming the limitations of the linearity assumption of classical algorithms. By incorporating local kernel regression and basis expansion, we obtain a sparse, additive representation of the action-value function, enabling local approximation and retrieval of nonlinear, independent contributions of select state features and the interactions between joint feature pairs. We validate our approach through a simulation study and apply it to spine disease recovery, uncovering recommendations aligned with clinical knowledge. This method bridges the gap between flexible machine learning techniques and the interpretability required in healthcare applications, paving the way for more personalized interventions.
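The sparse additive structure described in the abstract can be illustrated with a short, hypothetical sketch. This is not the authors' implementation: it substitutes a per-feature spline basis expansion with a Lasso penalty for the paper's local kernel regression, and the function names (`fit_additive_q`, `q_values`) and hyperparameters are assumptions made only for illustration. It shows how a fitted-Q-style loop can yield an action-value function that decomposes into separate contributions of individual state features.

```python
# Hypothetical sketch: sparse additive Q-function via basis expansion.
# Stand-in for the paper's local kernel regression; not the authors' code.
import numpy as np
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import Lasso

def fit_additive_q(S, A, R, S_next, n_actions, gamma=0.95, n_iters=20):
    """Fitted-Q iteration with an additive spline basis per state feature.

    S, S_next : (n, d) state matrices; A : (n,) integer actions; R : (n,) rewards.
    Returns a (basis, models) pair giving Q(s, a) as a sum of per-feature terms.
    """
    basis = SplineTransformer(n_knots=6, degree=3).fit(S)    # per-feature B-splines
    Phi, Phi_next = basis.transform(S), basis.transform(S_next)
    models = [Lasso(alpha=0.01, max_iter=5000) for _ in range(n_actions)]
    q_next = np.zeros((len(S), n_actions))
    for _ in range(n_iters):
        target = R + gamma * q_next.max(axis=1)              # Bellman backup
        for a in range(n_actions):
            mask = (A == a)
            models[a].fit(Phi[mask], target[mask])           # sparse additive fit
        q_next = np.column_stack([m.predict(Phi_next) for m in models])
    return basis, models

def q_values(basis, models, s):
    """Evaluate the additive Q(s, .) for a single state s of shape (d,)."""
    phi = basis.transform(s.reshape(1, -1))
    return np.array([m.predict(phi)[0] for m in models])
```

Because the basis expands each state feature separately and the Lasso zeroes out unneeded coefficients, the fitted Q-function can be inspected feature by feature, which is the property the paper exploits for interpretability; pairwise interaction terms could be added analogously by expanding products of feature pairs.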
Related papers
- Double Debiased Machine Learning for Mediation Analysis with Continuous Treatments [38.70412001488559]
We propose a machine learning algorithm for mediation analysis that supports continuous treatments.
We provide a numerical evaluation of our approach on a simulation along with an application to real-world medical data to analyze the effect of glycemic control on cognitive functions.
arXiv Detail & Related papers (2025-03-08T10:46:47Z)
- RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form.
We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z)
- Flexible Nonparametric Inference for Causal Effects under the Front-Door Model [2.6900047294457683]
We develop novel one-step and targeted minimum loss-based estimators for both the average treatment effect and the average treatment effect on the treated under front-door assumptions.
Our estimators are built on multiple parameterizations of the observed data distribution, including approaches that avoid mediator density entirely.
We show how these constraints can be leveraged to improve the efficiency of causal effect estimators.
arXiv Detail & Related papers (2023-12-15T22:04:53Z)
- Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation.
We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z)
- Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims to optimize decision-making strategies from historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Neuroevolutionary Feature Representations for Causal Inference [0.0]
We propose a novel approach for learning feature representations to aid the estimation of the conditional average treatment effect or CATE.
Our method focuses on an intermediate layer in a neural network trained to predict the outcome from the features.
arXiv Detail & Related papers (2022-05-21T09:13:04Z)
- Bellman Residual Orthogonalization for Offline Reinforcement Learning [53.17258888552998]
We introduce a new reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along a test function space.
We exploit this principle to derive confidence intervals for off-policy evaluation, as well as to optimize over policies within a prescribed policy class.
arXiv Detail & Related papers (2022-03-24T01:04:17Z)
- A Novel Tropical Geometry-based Interpretable Machine Learning Method: Application in Prognosis of Advanced Heart Failure [4.159216572695661]
A model's interpretability is essential to many practical applications such as clinical decision support systems.
A novel interpretable machine learning method is presented, which can model the relationship between input variables and responses using human-understandable rules.
arXiv Detail & Related papers (2021-12-09T17:53:12Z)
- Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation [53.83642844626703]
We provide a unifying framework for estimating higher-order derivatives of value functions, based on off-policy evaluation.
Our framework interprets a number of prior approaches as special cases and elucidates the bias and variance trade-off of Hessian estimates.
arXiv Detail & Related papers (2021-06-24T15:58:01Z)
- Variance-Aware Off-Policy Evaluation with Linear Function Approximation [85.75516599931632]
We study the off-policy evaluation problem in reinforcement learning with linear function approximation.
We propose an algorithm, VA-OPE, which uses the estimated variance of the value function to reweight the Bellman residual in Fitted Q-Iteration; a generic sketch of this reweighting idea appears after this list.
arXiv Detail & Related papers (2021-06-22T17:58:46Z)
- Autonomous Learning of Features for Control: Experiments with Embodied and Situated Agents [0.0]
We introduce a method that allows the training of the feature-extraction module to continue during the training of the policy network.
We show that sequence-to-sequence learning yields better results than the methods considered in previous studies.
arXiv Detail & Related papers (2020-09-15T14:34:42Z)
- Robust Q-learning [0.0]
We propose a robust Q-learning approach which allows estimating nuisance parameters using data-adaptive techniques.
We study the behavior of our estimators and provide simulation studies that highlight the need for and usefulness of the proposed method.
arXiv Detail & Related papers (2020-03-27T14:10:38Z)
- Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions [48.91284724066349]
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education.
Traditional measures such as confidence intervals may be insufficient due to noise, limited data and confounding.
We develop a method that could serve as a hybrid human-AI system, enabling human experts to analyze the validity of policy evaluation estimates.
arXiv Detail & Related papers (2020-02-10T00:26:43Z)
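Several entries above build on Fitted Q-Iteration. As a companion to the Variance-Aware Off-Policy Evaluation entry, here is a minimal, generic sketch of inverse-variance reweighting of the Bellman residual. It is not the VA-OPE algorithm: the variance estimate `var_hat`, the ridge regularizer, and all function names are illustrative assumptions.

```python
# Generic illustration of variance-weighted fitted Q-iteration for off-policy
# evaluation; not the VA-OPE algorithm. `var_hat` is an assumed, externally
# supplied estimate of the Bellman-target variance for each transition.
import numpy as np

def weighted_fqe(Phi, A, R, Phi_next, A_next, n_actions, var_hat,
                 gamma=0.99, n_iters=50, ridge=1e-3):
    """Phi, Phi_next: (n, p) state features; A: (n,) behavior actions;
    A_next: (n,) actions the target policy would take at the next state;
    R: (n,) rewards; var_hat: (n,) variance estimates used as sample weights."""
    n, p = Phi.shape
    w = np.zeros((n_actions, p))                         # one weight vector per action
    sw = 1.0 / np.clip(var_hat, 1e-3, None)              # inverse-variance sample weights
    for _ in range(n_iters):
        q_next = (Phi_next @ w.T)[np.arange(n), A_next]  # Q(s', pi(s'))
        target = R + gamma * q_next                      # Bellman backup for evaluation
        for a in range(n_actions):
            m = A == a
            X, y, v = Phi[m], target[m], sw[m]
            XtW = X.T * v                                 # X^T W with W = diag(v)
            w[a] = np.linalg.solve(XtW @ X + ridge * np.eye(p), XtW @ y)
    return w                                              # Q(s, a) is approximated by Phi(s) @ w[a]
```

Transitions whose Bellman targets are estimated to be noisier contribute less to each regression, which is the intuition the VA-OPE entry describes; the actual algorithm derives its weights and guarantees differently.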