Related papers: Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models

URL: http://arxiv.org/abs/2510.04020v3
Date: Fri, 10 Oct 2025 02:54:02 GMT
Title: Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models
Authors: Hao Wu, Yuan Gao, Xingjian Shi, Shuaipeng Li, Fan Xu, Fan Zhang, Zhihong Zhu, Weiyan Wang, Xiao Luo, Kun Wang, Xian Wu, Xiaomeng Huang
Abstract summary: We propose Spatiotemporal Forecasting as Planning (SFP), a new paradigm grounded in Model-Based Reinforcement Learning. SFP constructs a novel Generative World Model to simulate diverse, high-fidelity future states, enabling an "imagination-based" environmental simulation.
Score: 45.523937630646394
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To address the dual challenges of inherent stochasticity and non-differentiable metrics in physical spatiotemporal forecasting, we propose Spatiotemporal Forecasting as Planning (SFP), a new paradigm grounded in Model-Based Reinforcement Learning. SFP constructs a novel Generative World Model to simulate diverse, high-fidelity future states, enabling an "imagination-based" environmental simulation. Within this framework, a base forecasting model acts as an agent, guided by a beam search-based planning algorithm that leverages non-differentiable domain metrics as reward signals to explore high-return future sequences. These identified high-reward candidates then serve as pseudo-labels to continuously optimize the agent's policy through iterative self-training, significantly reducing prediction error and demonstrating exceptional performance on critical domain metrics like capturing extreme events.
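
The planning loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `world_model`, `reward`, and `beam_search_plan` are hypothetical names, states are single floats rather than spatiotemporal fields, the world model is plain Gaussian noise, and the reward is an assumed threshold-agreement count standing in for a non-differentiable domain metric. The sketch only shows how beam search can rank imagined futures by such a metric and return the best sequence as a pseudo-label.

```python
import random

def world_model(state, rng, n_samples=4):
    """Hypothetical generative world model: propose several noisy next states."""
    return [state + rng.gauss(0.5, 1.0) for _ in range(n_samples)]

def reward(sequence, target, threshold=2.0):
    """Assumed non-differentiable metric: number of steps where prediction and
    target agree on whether the threshold (an 'extreme event') is exceeded."""
    return sum((p > threshold) == (t > threshold) for p, t in zip(sequence, target))

def beam_search_plan(init_state, target, beam_width=3, seed=0):
    """Keep the beam_width highest-reward partial sequences at every step."""
    rng = random.Random(seed)
    beams = [[]]  # each beam is a list of imagined future states
    for step in range(len(target)):
        candidates = []
        for seq in beams:
            last = seq[-1] if seq else init_state
            for nxt in world_model(last, rng):
                candidates.append(seq + [nxt])
        # Rank partial sequences by reward against the matching target prefix.
        candidates.sort(key=lambda s: reward(s, target[:step + 1]), reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

# Usage: the returned sequence would serve as a pseudo-label for self-training.
target = [0.2, 1.0, 2.5, 3.1]
pseudo_label = beam_search_plan(0.0, target)
print(pseudo_label, reward(pseudo_label, target))
```

In a full system the candidates would come from the forecasting agent together with a learned world model, and the selected sequence would be fed back as a training target for the agent. Both steps are omitted here.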

Related papers

Position: Beyond Model-Centric Prediction -- Agentic Time Series Forecasting [49.05788441962762]
We argue for agentic time series forecasting (ATSF), which reframes forecasting as an agentic process composed of perception, planning, action, reflection, and memory. We outline three representative implementation paradigms -- workflow-based design, agentic reinforcement learning, and a hybrid agentic workflow paradigm -- and discuss the opportunities and challenges that arise when shifting from model-centric prediction to agentic forecasting.
arXiv Detail & Related papers (2026-02-02T08:01:11Z)
Generative Actor Critic [74.04971271003869]
Generative Actor Critic (GAC) is a novel framework that decouples sequential decision-making by reframing policy evaluation as learning a generative model of the joint distribution over trajectories and returns. Experiments on Gym-MuJoCo and Maze2D benchmarks demonstrate GAC's strong offline performance and significantly enhanced offline-to-online improvement compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-12-25T06:31:11Z)
Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving [54.46325690390831]
We propose Model-based Policy Adaptation (MPA), a general framework that enhances the robustness and safety of pretrained E2E driving agents during deployment. MPA first generates diverse counterfactual trajectories using a geometry-consistent simulation engine. MPA trains a diffusion-based policy adapter to refine the base policy's predictions and a multi-step Q value model to evaluate long-term outcomes.
arXiv Detail & Related papers (2025-11-26T17:01:41Z)
Next Interest Flow: A Generative Pre-training Paradigm for Recommender Systems by Modeling All-domain Movelines [8.895768051554162]
We propose a novel generative pre-training paradigm for e-commerce recommender systems. Our model learns to predict the Next Interest Flow, a dense vector sequence representing a user's future intent. We present the All-domain Moveline Evolution Network (AMEN), a unified framework implementing our entire pipeline.
arXiv Detail & Related papers (2025-10-13T12:13:17Z)
How to model Human Actions distribution with Event Sequence Data [22.25731364559209]
We study the forecasting of the future distribution of events in human action sequences. We find that a simple explicit distribution forecasting objective consistently surpasses complex implicit baselines. This work provides a principled framework for selecting modeling strategies and offers practical guidance for building more accurate and robust forecasting systems.
arXiv Detail & Related papers (2025-10-07T12:24:54Z)
ScenGAN: Attention-Intensive Generative Model for Uncertainty-Aware Renewable Scenario Forecasting [11.600987173982107]
This paper explores uncertainties in the realms of renewable power and deep learning. An uncertainty-aware model is meticulously designed for renewable scenario forecasting. The integration of meteorological information, forecasts, and historical trajectories in the processing layer improves the synergistic forecasting capability.
arXiv Detail & Related papers (2025-09-21T15:18:51Z)
Adaptive Conformal Prediction Intervals Over Trajectory Ensembles [50.31074512684758]
Future trajectories play an important role across domains such as autonomous driving, hurricane forecasting, and epidemic modeling. We propose a unified framework based on conformal prediction that transforms sampled trajectories into calibrated prediction intervals with theoretical coverage guarantees.
arXiv Detail & Related papers (2025-08-18T21:14:07Z)
Deep Active Inference Agents for Delayed and Long-Horizon Environments [1.693200946453174]
AIF agents rely on accurate immediate predictions and exhaustive planning, a limitation that is exacerbated in delayed environments. We propose a generative-policy architecture featuring a multi-step latent transition that lets the generative model predict an entire horizon in a single look-ahead. We evaluate our agent in an environment that mimics a realistic industrial scenario with delayed and long-horizon settings.
arXiv Detail & Related papers (2025-05-26T11:50:22Z)
On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations. We propose an autoregressive sampling approach that significantly improves performance in forecasting. We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z)
GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks [24.323017830938394]
This work aims to address challenges by introducing a pre-training framework that seamlessly integrates with baselines and enhances their performance. The framework is built upon two key designs: (i) We propose a spatio-temporal mask autoencoder as a pre-training model for learning spatio-temporal dependencies. These modules are specifically designed to capture intra-temporal customized representations and semantic- and inter-cluster relationships.
arXiv Detail & Related papers (2023-11-07T02:36:24Z)
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent. Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models. We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network [5.000272778136268]
This study shows that the predictive coding (PC) and active inference (AIF) frameworks can develop better generalization by learning a prior distribution in a low dimensional latent state space. In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights for maximizing the evidence lower bound. Our proposed model was evaluated with both simple and complex robotic tasks in simulation, which demonstrated sufficient generalization in learning with limited training data.
arXiv Detail & Related papers (2020-05-27T06:43:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.