Related papers: AMUSE: Adaptive Model Updating using a Simulated Environment

AMUSE: Adaptive Model Updating using a Simulated Environment

URL: http://arxiv.org/abs/2412.10119v1
Date: Fri, 13 Dec 2024 13:04:46 GMT
Title: AMUSE: Adaptive Model Updating using a Simulated Environment
Authors: Louis Chislett, Catalina A. Vallejos, Timothy I. Cannings, James Liley,
Abstract summary: Prediction models frequently face the challenge of concept drift, in which the underlying data distribution changes over time, weakening performance.<n>We present AMUSE, a novel method leveraging reinforcement learning trained within a simulated data generating environment.<n>As a result, AMUSE proactively recommends updates based on estimated performance improvements.
Score: 1.6124402884077915
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Prediction models frequently face the challenge of concept drift, in which the underlying data distribution changes over time, weakening performance. Examples can include models which predict loan default, or those used in healthcare contexts. Typical management strategies involve regular model updates or updates triggered by concept drift detection. However, these simple policies do not necessarily balance the cost of model updating with improved classifier performance. We present AMUSE (Adaptive Model Updating using a Simulated Environment), a novel method leveraging reinforcement learning trained within a simulated data generating environment, to determine update timings for classifiers. The optimal updating policy depends on the current data generating process and ongoing drift process. Our key idea is that we can train an arbitrarily complex model updating policy by creating a training environment in which possible episodes of drift are simulated by a parametric model, which represents expectations of possible drift patterns. As a result, AMUSE proactively recommends updates based on estimated performance improvements, learning a policy that balances maintaining model performance with minimizing update costs. Empirical results confirm the effectiveness of AMUSE in simulated data.

Related papers

$V_0$: A Generalist Value Model for Any Policy at State Zero [80.7505802128501]
Policy methods rely on a baseline to measure the relative advantage of an action.<n>This baseline is typically estimated by a Value Model (Critic) often as large as the policy model itself.<n>We propose a Generalist Value Model capable of estimating the expected performance of any model on unseen prompts.
arXiv Detail & Related papers (2026-02-03T14:35:23Z)
Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving [54.46325690390831]
We propose Model-based Policy Adaptation (MPA), a general framework that enhances the robustness and safety of pretrained E2E driving agents during deployment.<n>MPA first generates diverse counterfactual trajectories using a geometry-consistent simulation engine.<n>MPA trains a diffusion-based policy adapter to refine the base policy's predictions and a multi-step Q value model to evaluate long-term outcomes.
arXiv Detail & Related papers (2025-11-26T17:01:41Z)
Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution [0.35932002706017546]
We investigate the use of Reinforcement Learning for the optimal execution of meta-orders.<n>The objective is to execute incrementally large orders while minimizing implementation shortfall and market impact.<n>We employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations.
arXiv Detail & Related papers (2025-11-19T09:26:23Z)
Reinforcement Learning for Machine Learning Model Deployment: Evaluating Multi-Armed Bandits in ML Ops Environments [0.0]
We investigate whether reinforcement learning (RL)-based model management can manage deployment decisions more effectively. Our approach enables more adaptive production environments by continuously evaluating deployed models and rolling back underperforming ones in real-time. Our findings suggest that RL-based model management can improve automation, reduce reliance on manual interventions, and mitigate risks associated with post-deployment model failures.
arXiv Detail & Related papers (2025-03-28T16:42:21Z)
On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations. We propose an autoregressive sampling approach that significantly improves performance in forecasting. We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z)
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment [104.18002641195442]
We introduce Self-Augmented Preference Optimization (SAPO), an effective and scalable training paradigm that does not require existing paired data. Building on the self-play concept, which autonomously generates negative responses, we further incorporate an off-policy learning pipeline to enhance data exploration and exploitation.
arXiv Detail & Related papers (2024-05-31T14:21:04Z)
Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation [67.18144414660681]
We propose a Fast-Slow Test-Time Adaptation (FSTTA) approach for online Vision-and-Language Navigation (VLN) Our method obtains impressive performance gains on four popular benchmarks.
arXiv Detail & Related papers (2023-11-22T07:47:39Z)
STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning [82.03481509373037]
Recently, model-based reinforcement learning algorithms have demonstrated remarkable efficacy in visual input environments. We introduce Transformer-based wORld Model (STORM), an efficient world model architecture that combines strong modeling and generation capabilities. Storm achieves a mean human performance of $126.7%$ on the Atari $100$k benchmark, setting a new record among state-of-the-art methods.
arXiv Detail & Related papers (2023-10-14T16:42:02Z)
How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization [13.440645736306267]
This paper develops an algorithm for model-based reinforcement learning. It unifies model shift and model bias and then formulates a fine-tuning process. It achieves state-of-the-art performance on several challenging benchmark tasks.
arXiv Detail & Related papers (2023-09-22T07:27:32Z)
End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control [45.84205238554709]
We present a method for reinforcement learning of Koopman surrogate models for optimal performance as part of (e)NMPC. We show that the end-to-end trained models outperform those trained using system identification in (e)NMPC.
arXiv Detail & Related papers (2023-08-03T10:21:53Z)
Federated Privacy-preserving Collaborative Filtering for On-Device Next App Prediction [52.16923290335873]
We propose a novel SeqMF model to solve the problem of predicting the next app launch during mobile device usage. We modify the structure of the classical matrix factorization model and update the training procedure to sequential learning. One more ingredient of the proposed approach is a new privacy mechanism that guarantees the protection of the sent data from the users to the remote server.
arXiv Detail & Related papers (2023-02-05T10:29:57Z)
When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL) Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. A further example demonstrates that learning models from a dynamically-varying number of explorations benefit the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
How do I update my model? On the resilience of Predictive Process Monitoring models to change [15.29342790344802]
Predictive Process Monitoring techniques typically construct a predictive model based on past process executions, and then use it to predict the future of new ongoing cases. This can make Predictive Process Monitoring too rigid to deal with the variability of processes working in real environments. We evaluate the use of three different strategies that allow the periodic rediscovery or incremental construction of the predictive model.
arXiv Detail & Related papers (2021-09-08T08:50:56Z)
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization [60.73540999409032]
We show that expressive autoregressive dynamics models generate different dimensions of the next state and reward sequentially conditioned on previous dimensions. We also show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer.
arXiv Detail & Related papers (2021-04-28T16:48:44Z)
Model-based Policy Optimization with Unsupervised Model Adaptation [37.09948645461043]
We investigate how to bridge the gap between real and simulated data due to inaccurate model estimation for better policy optimization. We propose a novel model-based reinforcement learning framework AMPO, which introduces unsupervised model adaptation. Our approach achieves state-of-the-art performance in terms of sample efficiency on a range of continuous control benchmark tasks.
arXiv Detail & Related papers (2020-10-19T14:19:42Z)
Reinforcement Learning based dynamic weighing of Ensemble Models for Time Series Forecasting [0.8399688944263843]
It is known that if models selected for data modelling are distinct (linear/non-linear, static/dynamic) and independent (minimally correlated) models, the accuracy of the predictions is improved. Various approaches suggested in the literature to weigh the ensemble models use a static set of weights. To address this issue, a Reinforcement Learning (RL) approach to dynamically assign and update weights of each of the models at different time instants.
arXiv Detail & Related papers (2020-08-20T10:40:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.