Nonparametric Additive Value Functions: Interpretable Reinforcement
Learning with an Application to Surgical Recovery
- URL: http://arxiv.org/abs/2308.13135v1
- Date: Fri, 25 Aug 2023 02:05:51 GMT
- Title: Nonparametric Additive Value Functions: Interpretable Reinforcement
Learning with an Application to Surgical Recovery
- Authors: Patrick Emedom-Nnamdi, Timothy R. Smith, Jukka-Pekka Onnela, and
Junwei Lu
- Abstract summary: We propose a nonparametric additive model for estimating interpretable value functions in reinforcement learning.
We validate the proposed approach with a simulation study, and, in an application to spine disease, uncover recovery recommendations that are in line with related clinical knowledge.
- Score: 8.890206493793878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a nonparametric additive model for estimating interpretable value
functions in reinforcement learning. Learning effective adaptive clinical
interventions that rely on digital phenotyping features is a major concern for
medical practitioners. With respect to spine surgery, different post-operative
recovery recommendations concerning patient mobilization can lead to
significant variation in patient recovery. While reinforcement learning has
achieved widespread success in domains such as games, recent methods heavily
rely on black-box methods, such as neural networks. Unfortunately, these methods
hinder the ability to examine the contribution each feature makes in
producing the final suggested decision. While such interpretations are easily
provided in classical algorithms such as Least Squares Policy Iteration, basic
linearity assumptions prevent learning higher-order flexible interactions
between features. In this paper, we present a novel method that offers a
flexible technique for estimating action-value functions without making
explicit parametric assumptions regarding their additive functional form. This
nonparametric estimation strategy relies on incorporating local kernel
regression and basis expansion to obtain a sparse, additive representation of
the action-value function. Under this approach, we are able to locally
approximate the action-value function and retrieve the nonlinear, independent
contribution of select features as well as joint feature pairs. We validate the
proposed approach with a simulation study, and, in an application to spine
disease, uncover recovery recommendations that are in line with related clinical
knowledge.
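To make the estimation strategy concrete, below is a minimal, hedged sketch of a sparse additive action-value estimator in the spirit of the abstract. It is not the authors' estimator: it substitutes a global B-spline basis expansion with an L1 penalty for the paper's local kernel regression, assumes a discrete action space, and omits pairwise interaction terms; all function names and parameters are illustrative.

```python
# Hedged sketch, NOT the authors' method: approximates Q(s, a) ~= sum_j f_{a,j}(s_j)
# with one sparse additive model per discrete action, fit by fitted Q-iteration.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import SplineTransformer

def fit_additive_q(states, actions, rewards, next_states, n_actions,
                   gamma=0.95, n_iters=25, n_knots=8, alpha=1e-3):
    """Sparse additive Q-function via basis expansion + L1 penalty."""
    # Each state feature gets its own spline expansion, so every coefficient
    # block maps back to exactly one interpretable feature.
    basis = SplineTransformer(n_knots=n_knots, degree=3)
    X = basis.fit_transform(states)
    X_next = basis.transform(next_states)
    models = [Lasso(alpha=alpha, max_iter=50_000) for _ in range(n_actions)]
    q_next = np.zeros((len(states), n_actions))
    for _ in range(n_iters):
        # Bellman backup: r + gamma * max_a' Q(s', a')
        targets = rewards + gamma * q_next.max(axis=1)
        for a in range(n_actions):
            mask = actions == a
            if mask.any():
                models[a].fit(X[mask], targets[mask])
        q_next = np.column_stack([m.predict(X_next) for m in models])
    return basis, models

def feature_contribution(basis, model, states, j):
    """Recover the additive contribution f_{a,j}(s_j) of feature j: keep only
    the basis columns generated from feature j (SplineTransformer groups its
    output columns by input feature)."""
    X = basis.transform(states)
    per_feat = X.shape[1] // states.shape[1]
    cols = slice(j * per_feat, (j + 1) * per_feat)
    return X[:, cols] @ model.coef_[cols]
```

Because every coefficient block maps back to a single state feature, the learned per-feature curves can be plotted directly, which is the interpretability property the abstract emphasizes; augmenting the basis with products of feature pairs would, under the same sparsity penalty, recover the joint feature-pair contributions the paper describes.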
Related papers
- Offline Reinforcement Learning with Differentiable Function
Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of
Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Neuroevolutionary Feature Representations for Causal Inference [0.0]
We propose a novel approach for learning feature representations to aid the estimation of the conditional average treatment effect (CATE).
Our method focuses on an intermediate layer in a neural network trained to predict the outcome from the features.
arXiv Detail & Related papers (2022-05-21T09:13:04Z)
- A Novel Tropical Geometry-based Interpretable Machine Learning Method:
Application in Prognosis of Advanced Heart Failure [4.159216572695661]
A model's interpretability is essential to many practical applications such as clinical decision support systems.
A novel interpretable machine learning method is presented, which can model the relationship between input variables and responses in human-understandable rules.
arXiv Detail & Related papers (2021-12-09T17:53:12Z)
- Unifying Gradient Estimators for Meta-Reinforcement Learning via
Off-Policy Evaluation [53.83642844626703]
We provide a unifying framework for estimating higher-order derivatives of value functions, based on off-policy evaluation.
Our framework interprets a number of prior approaches as special cases and elucidates the bias and variance trade-off of Hessian estimates.
arXiv Detail & Related papers (2021-06-24T15:58:01Z)
- Variance-Aware Off-Policy Evaluation with Linear Function Approximation [85.75516599931632]
We study the off-policy evaluation problem in reinforcement learning with linear function approximation.
We propose an algorithm, VA-OPE, which uses the estimated variance of the value function to reweight the Bellman residual in Fitted Q-Iteration (a sketch of this reweighting idea follows this list).
arXiv Detail & Related papers (2021-06-22T17:58:46Z)
- Autonomous Learning of Features for Control: Experiments with Embodied
and Situated Agents [0.0]
We introduce a method that allows the training of the feature-extraction module to continue during the training of the policy network.
We show that sequence-to-sequence learning yields better results than the methods considered in previous studies.
arXiv Detail & Related papers (2020-09-15T14:34:42Z)
- Interpretable Off-Policy Evaluation in Reinforcement Learning by
Highlighting Influential Transitions [48.91284724066349]
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education.
Traditional measures such as confidence intervals may be insufficient due to noise, limited data and confounding.
We develop a method that could serve as a hybrid human-AI system, to enable human experts to analyze the validity of policy evaluation estimates.
arXiv Detail & Related papers (2020-02-10T00:26:43Z)
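As referenced in the variance-aware off-policy evaluation entry above, here is a minimal, hypothetical illustration of the reweighting idea that entry summarizes. It is not the published VA-OPE algorithm; it simply shows, under an assumed linear feature map, how transitions with high-variance Bellman targets can be down-weighted in the least-squares regression step.

```python
# Hypothetical sketch of variance-weighted Bellman regression (illustration
# of the idea only, not the VA-OPE algorithm), assuming linear features.
import numpy as np

def variance_weighted_q_step(phi, rewards, phi_next, w_prev, var_est,
                             gamma=0.95, ridge=1e-8):
    """One weighted least-squares step of fitted Q-iteration.

    phi      : (n, d) features of observed (s, a) pairs
    phi_next : (n, d) features of (s', pi(s')) under the target policy
    var_est  : (n,) estimated variance of each Bellman target
    """
    targets = rewards + gamma * phi_next @ w_prev   # Bellman targets
    weights = 1.0 / np.maximum(var_est, 1e-6)       # precision weights
    # Weighted normal equations: (phi^T W phi) w = phi^T W targets
    A = phi.T @ (weights[:, None] * phi) + ridge * np.eye(phi.shape[1])
    b = phi.T @ (weights * targets)
    return np.linalg.solve(A, b)
```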