Deep Spectral Q-learning with Application to Mobile Health
- URL: http://arxiv.org/abs/2301.00927v1
- Date: Tue, 3 Jan 2023 01:55:17 GMT
- Title: Deep Spectral Q-learning with Application to Mobile Health
- Authors: Yuhe Gao, Chengchun Shi and Rui Song
- Abstract summary: We propose a deep spectral Q-learning algorithm to handle mixed frequency data.
In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence.
- Score: 11.736014576781903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic treatment regimes assign personalized treatments to patients
sequentially over time based on their baseline information and time-varying
covariates. In mobile health applications, these covariates are typically
collected at different frequencies over a long time horizon. In this paper, we
propose a deep spectral Q-learning algorithm, which integrates principal
component analysis (PCA) with deep Q-learning to handle the mixed frequency
data. In theory, we prove that the mean return under the estimated optimal
policy converges to that under the optimal one and establish its rate of
convergence. The usefulness of our proposal is further illustrated via
simulations and an application to a diabetes dataset.
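The idea in the abstract, compressing densely sampled covariates with PCA before Q-learning, can be illustrated with a minimal sketch. This is not the authors' algorithm: a linear fitted Q-iteration stands in for the deep Q-network, the data are synthetic, and every name and parameter below (fit_pca, fitted_q_iteration, the dimensions, gamma) is a hypothetical choice made for illustration only.

```python
import numpy as np

def fit_pca(x, n_components):
    """Return the column means and top principal directions of x (n, m)."""
    mean = x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(x, mean, components):
    """Project rows of x onto the fitted principal directions."""
    return (x - mean) @ components.T

def fitted_q_iteration(states, actions, rewards, next_states,
                       n_actions, gamma=0.5, n_iters=20):
    """Linear fitted Q-iteration with one weight vector per action.

    Assumption: a linear model replaces the deep Q-network to keep
    the sketch short and dependency-free.
    """
    n = states.shape[0]
    feats = np.hstack([states, np.ones((n, 1))])        # add intercept
    next_feats = np.hstack([next_states, np.ones((n, 1))])
    weights = np.zeros((n_actions, feats.shape[1]))
    for _ in range(n_iters):
        # Bellman targets computed from the current Q estimate.
        targets = rewards + gamma * (next_feats @ weights.T).max(axis=1)
        for a in range(n_actions):
            mask = actions == a
            if mask.any():
                weights[a] = np.linalg.lstsq(feats[mask], targets[mask],
                                             rcond=None)[0]
    return weights

# Synthetic mixed-frequency transitions (all values are made up).
rng = np.random.default_rng(0)
n, m, k = 500, 24, 3           # transitions, high-freq readings, PCs kept
loadings = rng.normal(size=(k, m))
low = rng.normal(size=(n, 2))              # low-frequency covariates
high = rng.normal(size=(n, k)) @ loadings + 0.1 * rng.normal(size=(n, m))
next_low = rng.normal(size=(n, 2))
next_high = rng.normal(size=(n, k)) @ loadings + 0.1 * rng.normal(size=(n, m))
actions = rng.integers(0, 2, size=n)
# Toy reward: action 1 helps exactly when the first low-freq covariate is positive.
rewards = low[:, 0] * (2 * actions - 1) + rng.normal(scale=0.1, size=n)

# Compress the high-frequency block, then concatenate with the low-frequency one.
mean, components = fit_pca(high, k)
scores = pca_transform(high, mean, components)
states = np.hstack([low, scores])
next_states = np.hstack([next_low, pca_transform(next_high, mean, components)])

weights = fitted_q_iteration(states, actions, rewards, next_states, n_actions=2)
q_values = np.hstack([states, np.ones((n, 1))]) @ weights.T
policy = q_values.argmax(axis=1)           # greedy action per state
```

The same PCA projection is fitted once and reused for the next-state covariates, so current and next states live in the same reduced space. How many components to keep, and how the paper's method chooses them, is not stated in the abstract, so k here is arbitrary.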
Related papers
- Online Statistical Inference for Time-varying Sample-averaged Q-learning [2.2374171443798034]
This paper introduces a time-varying batch-averaged Q-learning algorithm, termed sample-averaged Q-learning.
We develop a novel framework that provides insights into the normality of the sample-averaged algorithm under mild conditions.
Numerical experiments conducted on classic OpenAI Gym environments show that the time-varying sample-averaged Q-learning method consistently outperforms both single-sample and constant-batch Q-learning.
arXiv Detail & Related papers (2024-10-14T17:17:19Z) - Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN).
CMMN consists in filtering the signals in order to adapt their power spectrum density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z) - BCQQ: Batch-Constraint Quantum Q-Learning with Cyclic Data Re-uploading [2.502222151305252]
Recent advancements in quantum computing suggest that quantum models might require less data for training compared to classical methods.
We propose a batch RL algorithm that utilizes VQC as function approximators within the discrete batch-constraint deep Q-learning algorithm.
We evaluate the efficiency of our algorithm on the OpenAI CartPole environment and compare its performance to the classical neural network-based discrete BCQ.
arXiv Detail & Related papers (2023-04-27T16:43:01Z) - Continuous-Time Modeling of Counterfactual Outcomes Using Neural
Controlled Differential Equations [84.42837346400151]
Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare.
Existing causal inference approaches consider regular, discrete-time intervals between observations and treatment decisions.
We propose a controllable simulation environment based on a model of tumor growth for a range of scenarios.
arXiv Detail & Related papers (2022-06-16T17:15:15Z) - Federated Offline Reinforcement Learning [55.326673977320574]
We propose a multi-site Markov decision process model that allows for both homogeneous and heterogeneous effects across sites.
We design the first federated policy optimization algorithm for offline RL with sample complexity.
We give a theoretical guarantee for the proposed algorithm, where the suboptimality of the learned policies is comparable to the rate achieved when the data are not distributed.
arXiv Detail & Related papers (2022-06-11T18:03:26Z) - Conformal Prediction with Temporal Quantile Adjustments [40.282423098764404]
We develop a method to construct efficient and valid prediction intervals (PIs) for regression on cross-sectional time series data.
We validate TQA's performance through extensive experimentation.
arXiv Detail & Related papers (2022-05-20T03:31:03Z) - LSTM-Autoencoder based Anomaly Detection for Indoor Air Quality Time
Series Data [6.642599588462097]
Anomaly detection for indoor air quality (IAQ) data has become an important area of research as the quality of air is closely related to human health and well-being.
Traditional statistics and machine learning-based approaches in anomaly detection in the IAQ area could not detect anomalies involving the observation of correlations across several data points.
We propose a hybrid deep learning model that combines LSTM with Autoencoder for anomaly detection tasks in IAQ to address this issue.
arXiv Detail & Related papers (2022-04-14T01:57:46Z) - Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity.
We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class.
We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - Large-scale Augmented Granger Causality (lsAGC) for Connectivity
Analysis in Complex Systems: From Computer Simulations to Functional MRI
(fMRI) [0.0]
We introduce large-scale Augmented Granger Causality (lsAGC) as a method for connectivity analysis in complex systems.
The lsAGC algorithm combines dimension reduction with source time-series augmentation.
We quantitatively evaluate the performance of lsAGC on synthetic directional time-series networks with known ground truth.
arXiv Detail & Related papers (2021-01-10T01:44:48Z) - DeepRite: Deep Recurrent Inverse TreatmEnt Weighting for Adjusting
Time-varying Confounding in Modern Longitudinal Observational Data [68.29870617697532]
We propose Deep Recurrent Inverse TreatmEnt weighting (DeepRite) for time-varying confounding in longitudinal data.
DeepRite is shown to recover the ground truth from synthetic data, and estimate unbiased treatment effects from real data.
arXiv Detail & Related papers (2020-10-28T15:05:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.