Designing Reinforcement Learning Algorithms for Digital Interventions:
Pre-implementation Guidelines
- URL: http://arxiv.org/abs/2206.03944v1
- Date: Wed, 8 Jun 2022 15:05:28 GMT
- Title: Designing Reinforcement Learning Algorithms for Digital Interventions:
Pre-implementation Guidelines
- Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty,
Finale Doshi-Velez, Susan A. Murphy
- Abstract summary: Online reinforcement learning algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education.
Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints.
We extend the PCS (Predictability, Computability, Stability) framework, a data science framework that incorporates best practices from machine learning and statistics in supervised learning, to the design of RL algorithms for the digital intervention setting.
- Score: 24.283342018185028
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online reinforcement learning (RL) algorithms are increasingly used to
personalize digital interventions in the fields of mobile health and online
education. Common challenges in designing and testing an RL algorithm in these
settings include ensuring the RL algorithm can learn and run stably under
real-time constraints, and accounting for the complexity of the environment,
e.g., a lack of accurate mechanistic models for the user dynamics. To guide how
one can tackle these challenges, we extend the PCS (Predictability,
Computability, Stability) framework, a data science framework that incorporates
best practices from machine learning and statistics in supervised learning (Yu
and Kumbier, 2020), to the design of RL algorithms for the digital
interventions setting. Further, we provide guidelines on how to design
simulation environments, a crucial tool for evaluating RL candidate algorithms
using the PCS framework. We illustrate the use of the PCS framework for
designing an RL algorithm for Oralytics, a mobile health study aiming to
improve users' tooth-brushing behaviors through the personalized delivery of
intervention messages. Oralytics will go into the field in late 2022.
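To make the simulation-based evaluation concrete, below is a minimal, hypothetical sketch of the kind of harness the abstract describes: simulated users, a candidate RL algorithm (here a Thompson-sampling bandit), and a repeated-run stability check in the PCS spirit. The user model, effect sizes, and trial sizes are illustrative assumptions, not the actual Oralytics design.

```python
# Hypothetical sketch of a simulation environment for vetting a candidate
# RL algorithm before deployment. All dynamics here are assumptions for
# illustration, not the Oralytics study's actual environment or algorithm.
import numpy as np

rng = np.random.default_rng(0)

class SimulatedUser:
    """Toy user model: brushing probability responds to intervention messages."""
    def __init__(self, base_rate, treatment_effect):
        self.base_rate = base_rate
        self.treatment_effect = treatment_effect

    def respond(self, send_message):
        p = self.base_rate + (self.treatment_effect if send_message else 0.0)
        return float(rng.random() < min(max(p, 0.0), 1.0))  # 1.0 = brushed

class BetaBernoulliThompson:
    """Candidate algorithm: Thompson sampling over {no message, message}."""
    def __init__(self):
        self.alpha = np.ones(2)  # pseudo-count of successes per action
        self.beta = np.ones(2)   # pseudo-count of failures per action

    def act(self):
        return int(np.argmax(rng.beta(self.alpha, self.beta)))

    def update(self, action, reward):
        self.alpha[action] += reward
        self.beta[action] += 1.0 - reward

def run_trial(n_users=50, n_decision_points=140):
    users = [SimulatedUser(rng.uniform(0.2, 0.5), rng.uniform(0.0, 0.2))
             for _ in range(n_users)]
    algo = BetaBernoulliThompson()
    rewards = []
    for _ in range(n_decision_points):
        for user in users:
            action = algo.act()
            reward = user.respond(send_message=bool(action))
            algo.update(action, reward)
            rewards.append(reward)
    return np.mean(rewards)

# Stability check: rerun the simulated trial several times and inspect
# how much the average outcome varies across runs.
outcomes = [run_trial() for _ in range(10)]
print(f"mean reward {np.mean(outcomes):.3f} +/- {np.std(outcomes):.3f}")
```

A real pre-implementation study would sweep this kind of check over multiple plausible environment variants (e.g., different treatment-effect distributions), since the abstract's point is precisely that no single accurate mechanistic user model is available.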
Related papers
- Monitoring Fidelity of Online Reinforcement Learning Algorithms in Clinical Trials [20.944037982124037]
This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in clinical trials.
We present a framework for pre-deployment planning and real-time monitoring to help algorithm developers and clinical researchers ensure algorithm fidelity.
arXiv Detail & Related papers (2024-02-26T20:19:14Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions [2.4541568670428915]
Machine scheduling aims to optimize job assignments to machines while adhering to manufacturing rules and job specifications.
Deep Reinforcement Learning (DRL), a key component of artificial general intelligence, has shown promise in various domains like gaming and robotics.
This paper offers a comprehensive review and comparison of DRL-based approaches, highlighting their methodology, applications, advantages, and limitations.
arXiv Detail & Related papers (2023-10-04T22:45:09Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [65.57123249246358]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling [9.745543921550748]
Reinforcement learning (RL) can be used to personalize sequences of treatments in digital health to support users in adopting healthier behaviors.
Online RL is a promising data-driven approach for this problem as it learns based on each user's historical responses.
We assess whether the RL algorithm should be included in an "optimized" intervention for real-world deployment.
arXiv Detail & Related papers (2023-04-11T17:20:37Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning [43.562783189118]
We introduce a practical algorithm for incorporating human insight to speed learning.
Our algorithm, Constraint Sampling Reinforcement Learning (CSRL), incorporates prior domain knowledge as constraints/restrictions on the RL policy.
In all cases, CSRL learns a good policy faster than baselines.
arXiv Detail & Related papers (2021-12-30T22:02:42Z)
- Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
To date, no learning-based algorithm has shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
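The last entry above rests on the entropy-regularized (soft) Bellman backup, which is compact enough to illustrate directly. The sketch below runs it on a randomly generated toy MDP; the MDP, temperature, and iteration count are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch of an entropy-regularized (soft) Bellman backup, the
# building block behind the information-theoretic MPC / RL connection.
# The tiny random MDP here is an illustrative assumption.
import numpy as np

n_states, n_actions, gamma, temperature = 3, 2, 0.9, 1.0
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] = next-state dist
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # reward table

Q = np.zeros((n_states, n_actions))
for _ in range(200):
    # Soft value: V(s) = tau * log sum_a exp(Q(s, a) / tau)
    V = temperature * np.log(np.exp(Q / temperature).sum(axis=1))
    Q = R + gamma * P @ V  # soft Bellman backup

policy = np.exp(Q / temperature)
policy /= policy.sum(axis=1, keepdims=True)  # Boltzmann (max-entropy) policy
print(policy)
```

As `temperature` tends to zero the soft backup recovers the standard hard-max value iteration; the entropy bonus is what lets such methods tolerate biased models.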
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.