Related papers: Optimizing Algorithms for Mobile Health Interventions with Active Querying Optimization

URL: http://arxiv.org/abs/2512.08950v1
Date: Thu, 27 Nov 2025 14:21:47 GMT
Title: Optimizing Algorithms for Mobile Health Interventions with Active Querying Optimization
Authors: Aseel Rawashdeh
Abstract summary: Reinforcement learning in mobile health interventions requires balancing intervention efficacy with user burden. The Act-Then-Measure (ATM) algorithm relies on a temporal-difference-inspired Q-learning method, which is prone to instability in sparse and noisy environments. We propose a Bayesian extension to ATM that replaces standard Q-learning with a Kalman filter-style Bayesian update, maintaining uncertainty-aware estimates of Q-values.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Reinforcement learning in mobile health (mHealth) interventions requires balancing intervention efficacy with user burden, particularly when state measurements (for example, user surveys or feedback) are costly yet essential. The Act-Then-Measure (ATM) heuristic addresses this challenge by decoupling control and measurement actions within the Action-Contingent Noiselessly Observable Markov Decision Process (ACNO-MDP) framework. However, the standard ATM algorithm relies on a temporal-difference-inspired Q-learning method, which is prone to instability in sparse and noisy environments. In this work, we propose a Bayesian extension to ATM that replaces standard Q-learning with a Kalman filter-style Bayesian update, maintaining uncertainty-aware estimates of Q-values and enabling more stable and sample-efficient learning. We evaluate our method in both toy environments and clinically motivated testbeds. In small, tabular environments, Bayesian ATM achieves comparable or improved scalarized returns with substantially lower variance and more stable policy behavior. In contrast, in larger and more complex mHealth settings, both the standard and Bayesian ATM variants perform poorly, suggesting a mismatch between ATM's modeling assumptions and the structural challenges of real-world mHealth domains. These findings highlight the value of uncertainty-aware methods in low-data settings while underscoring the need for new RL algorithms that explicitly model causal structure, continuous states, and delayed feedback under observation cost constraints.
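Illustration: the abstract describes replacing the standard Q-learning step with a Kalman filter-style Bayesian update that tracks uncertainty over Q-values. The sketch below shows one common way such an update can look in a tabular setting. It is not the paper's implementation: the class name `KalmanQTable`, the scalar Gaussian belief per state-action pair, the fixed observation noise `obs_noise_var`, and the prior values are all assumptions made for illustration, and the ACNO-MDP measurement decision is omitted entirely.

```python
import numpy as np


class KalmanQTable:
    """Hypothetical tabular Q-table holding a Gaussian belief (mean, variance) per (state, action)."""

    def __init__(self, n_states, n_actions, prior_mean=0.0, prior_var=1.0,
                 obs_noise_var=1.0, gamma=0.95):
        # Assumed priors: every Q-value starts with the same mean and variance.
        self.mean = np.full((n_states, n_actions), prior_mean, dtype=float)
        self.var = np.full((n_states, n_actions), prior_var, dtype=float)
        self.obs_noise_var = obs_noise_var  # assumed fixed noise of the TD target
        self.gamma = gamma

    def update(self, s, a, reward, s_next, done=False):
        """Treat the TD target as a noisy observation of Q(s, a) and apply a scalar Kalman update."""
        bootstrap = 0.0 if done else self.gamma * self.mean[s_next].max()
        target = reward + bootstrap
        # The Kalman gain acts as an adaptive learning rate: large while the
        # belief is uncertain, shrinking as the posterior variance falls.
        gain = self.var[s, a] / (self.var[s, a] + self.obs_noise_var)
        self.mean[s, a] += gain * (target - self.mean[s, a])
        self.var[s, a] *= (1.0 - gain)
        return gain


q = KalmanQTable(n_states=2, n_actions=2)
first_gain = q.update(s=0, a=1, reward=1.0, s_next=1, done=True)
second_gain = q.update(s=0, a=1, reward=1.0, s_next=1, done=True)
print(first_gain, second_gain, q.mean[0, 1], q.var[0, 1])
```

Unlike a fixed step size, the gain here decreases with each observation of the same state-action pair, which is one plausible reason an update of this kind would give lower-variance estimates in sparse, noisy settings.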

Related papers

AgentNoiseBench: Benchmarking Robustness of Tool-Using LLM Agents Under Noisy Condition [72.24180896265192]
We introduce AgentNoiseBench, a framework for evaluating robustness of agentic models under noisy environments. We first conduct an in-depth analysis of biases and uncertainties in real-world scenarios. We then categorize environmental noise into two primary types: user-noise and tool-noise. Building on this analysis, we develop an automated pipeline that injects controllable noise into existing agent-centric benchmarks.
arXiv Detail & Related papers (2026-02-11T20:33:10Z)
Enhancing Trustworthiness with Mixed Precision: Benchmarks, Opportunities, and Challenges [12.438306093697]
Large language models (LLMs) have shown promising performance across various tasks. LLMs' autoregressive decoding process poses significant challenges for efficient deployment on existing AI hardware.
arXiv Detail & Related papers (2025-11-27T14:17:43Z)
FAIM: Frequency-Aware Interactive Mamba for Time Series Classification [87.84511960413715]
Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition. We propose FAIM, a lightweight Frequency-Aware Interactive Mamba model. We show that FAIM consistently outperforms existing state-of-the-art (SOTA) methods, achieving a superior trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2025-11-26T08:36:33Z)
Time-Aware Feature Selection: Adaptive Temporal Masking for Stable Sparse Autoencoder Training [0.47745223151611654]
We introduce Adaptive Temporal Masking (ATM), a novel training approach that adjusts feature selection by tracking activation magnitudes, frequencies, and reconstruction contributions to compute importance scores that evolve over time. ATM achieves substantially lower absorption scores compared to existing methods like TopK and JumpReLU SAEs, while maintaining excellent reconstruction quality.
arXiv Detail & Related papers (2025-10-09T23:12:51Z)
Dynamic Uncertainty-aware Multimodal Fusion for Outdoor Health Monitoring [14.465453649354531]
Multimodal large language models (MLLMs) emerge as a promising alternative. MLLMs fail to capture subtle health status changes due to input and fluctuation noise. We propose a multimodal fusion framework, named multimodal-Health, for outdoor health monitoring in dynamic and noisy environments.
arXiv Detail & Related papers (2025-08-12T17:07:27Z)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
Distribution-Free Uncertainty Quantification in Mechanical Ventilation Treatment: A Conformal Deep Q-Learning Framework [2.5070297884580874]
This study introduces ConformalDQN, a distribution-free conformal deep Q-learning approach for optimizing mechanical ventilation in intensive care units. We trained and evaluated our model using ICU patient records from the MIMIC-IV database.
arXiv Detail & Related papers (2024-12-17T06:55:20Z)
Robust Reinforcement Learning under Diffusion Models for Data with Jumps [40.2559197706778]
We introduce the Mean-Square Bipower Variation Error (MSBVE) algorithm, which enhances robustness and convergence in scenarios involving significant noise and jumps. We first revisit the Mean-Square TD Error (MSTDE) algorithm, commonly used in continuous-time RL, and highlight its limitations in handling jumps in state dynamics. The proposed MSBVE algorithm minimizes the mean-square quadratic variation error, offering improved performance over MSTDE in environments characterized by SDEs with jumps.
arXiv Detail & Related papers (2024-11-18T16:17:34Z)
Bisimulation metric for Model Predictive Control [44.301098448479195]
Bisimulation Metric for Model Predictive Control (BS-MPC) is a novel approach that incorporates bisimulation metric loss in its objective function to directly optimize the encoder. BS-MPC improves training stability, robustness against input noise, and computational efficiency by reducing training time. We evaluate BS-MPC on both continuous control and image-based tasks from the DeepMind Control Suite.
arXiv Detail & Related papers (2024-10-06T17:12:10Z)
Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes [44.974100402600165]
We study the evaluation of a policy under best- and worst-case perturbations to a Markov decision process (MDP). We use transition observations from the original MDP, whether they are generated under the same or a different policy. Our estimator also supports statistical inference using Wald confidence intervals.
arXiv Detail & Related papers (2024-03-29T18:11:49Z)
Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online oracle that combines the certainty-equivalence principle and polytopic tubes. We analyze the regret of the algorithm, when compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z)
On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591]
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data. It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
arXiv Detail & Related papers (2021-07-01T17:59:07Z)
Robust Value Iteration for Continuous Control Tasks [99.00362538261972]
When transferring a control policy from simulation to a physical system, the policy needs to be robust to variations in the dynamics to perform well. We present Robust Fitted Value Iteration, which uses dynamic programming to compute the optimal value function on the compact state domain. We show that Robust Fitted Value Iteration is more robust than deep reinforcement learning algorithms and the non-robust version of the algorithm.
arXiv Detail & Related papers (2021-05-25T19:48:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.