Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation
- URL: http://arxiv.org/abs/2508.20942v1
- Date: Thu, 28 Aug 2025 16:03:06 GMT
- Title: Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation
- Authors: Xiaohan Wang, Yang Ning
- Abstract summary: We propose a novel methodology for modeling posterior drift through Bayes decision rules. Under mild regularity conditions, we establish the consistency of our estimators and derive the risk bounds. We illustrate the broad applicability of our method by adapting it to the estimation of optimal individualized treatment rules.
- Score: 50.34670342434884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we extend the transfer learning classification framework from regression function-based methods to decision rules. We propose a novel methodology for modeling posterior drift through Bayes decision rules. By exploiting the geometric transformation of the Bayes decision boundary, our method reformulates the problem as a low-dimensional empirical risk minimization problem. Under mild regularity conditions, we establish the consistency of our estimators and derive the risk bounds. Moreover, we illustrate the broad applicability of our method by adapting it to the estimation of optimal individualized treatment rules. Extensive simulation studies and analyses of real-world data further demonstrate both superior performance and robustness of our approach.
Related papers
- On the System Theoretic Offline Learning of Continuous-Time LQR with Exogenous Disturbances [3.701656361145375]
We analyze offline designs of linear quadratic regulator (LQR) strategies with uncertain disturbances.
Our approach builds on the fundamental learning-based framework of adaptive dynamic programming.
arXiv Detail & Related papers (2025-09-20T17:14:27Z)
- Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models [56.92178753201331]
We tackle average-reward infinite-horizon POMDPs with an unknown transition model.
We present a novel and simple estimator that overcomes this barrier.
arXiv Detail & Related papers (2025-01-30T22:29:41Z)
- Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems [10.404992912881601]
We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions.
We apply a model-free approach that relies neither on knowledge of model parameters nor on their estimations, and devise an RL algorithm to learn the optimal policy parameter directly.
arXiv Detail & Related papers (2024-07-24T12:26:21Z)
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Globally-Optimal Greedy Experiment Selection for Active Sequential Estimation [1.1530723302736279]
We study the problem of active sequential estimation, which involves adaptively selecting experiments for sequentially collected data.
The goal is to design experiment selection rules for more accurate model estimation.
We propose a class of greedy experiment selection methods and provide statistical analysis for the maximum likelihood.
arXiv Detail & Related papers (2024-02-13T17:09:29Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Offline Policy Optimization with Eligible Actions [34.4530766779594]
Offline policy optimization could have a large impact on many real-world decision-making problems.
Importance sampling and its variants are a commonly used type of estimator in offline policy evaluation.
We propose an algorithm to avoid this overfitting through a new per-state-neighborhood normalization constraint.
arXiv Detail & Related papers (2022-07-01T19:18:15Z)
- Limitations of a proposed correction for slow drifts in decision criterion [0.0]
We propose a model-based approach for disambiguating systematic updates from random drifts.
We show that this approach accurately recovers the latent trajectory of drifts in decision criterion.
Our results highlight the advantages of incorporating assumptions about the generative process directly into models of decision-making.
arXiv Detail & Related papers (2022-05-22T19:33:19Z)
- Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z)
- Sparse Methods for Automatic Relevance Determination [0.0]
We first review automatic relevance determination (ARD) and analytically demonstrate the need for additional regularization or thresholding to achieve sparse models.
We then discuss two classes of methods, regularization based and thresholding based, which build on ARD to learn parsimonious solutions to linear problems.
arXiv Detail & Related papers (2020-05-18T14:08:49Z)