Related papers: In-Context Planning with Latent Temporal Abstractions

In-Context Planning with Latent Temporal Abstractions

URL: http://arxiv.org/abs/2602.18694v1
Date: Sat, 21 Feb 2026 02:29:19 GMT
Title: In-Context Planning with Latent Temporal Abstractions
Authors: Baiting Luo, Yunuo Zhang, Nathaniel S. Keplinger, Samir Gupta, Abhishek Dubey, Ayan Mukhopadhyay,
Abstract summary: I-TAP is an offline RL framework that unifies in-context adaptation with online planning in a learned temporal-abstraction space.<n>I-TAP consistently matches or outperforms strong model-free and model-conditioned offline baselines.
Score: 7.0210868165244875
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Planning-based reinforcement learning for continuous control is bottlenecked by two practical issues: planning at primitive time scales leads to prohibitive branching and long horizons, while real environments are frequently partially observable and exhibit regime shifts that invalidate stationary, fully observed dynamics assumptions. We introduce I-TAP (In-Context Latent Temporal-Abstraction Planner), an offline RL framework that unifies in-context adaptation with online planning in a learned discrete temporal-abstraction space. From offline trajectories, I-TAP learns an observation-conditioned residual-quantization VAE that compresses each observation-macro-action segment into a coarse-to-fine stack of discrete residual tokens, and a temporal Transformer that autoregressively predicts these token stacks from a short recent history. The resulting sequence model acts simultaneously as a context-conditioned prior over abstract actions and a latent dynamics model. At test time, I-TAP performs Monte Carlo Tree Search directly in token space, using short histories for implicit adaptation without gradient update, and decodes selected token stacks into executable actions. Across deterministic MuJoCo, stochastic MuJoCo with per-episode latent dynamics regimes, and high-dimensional Adroit manipulation, including partially observable variants, I-TAP consistently matches or outperforms strong model-free and model-based offline baselines, demonstrating efficient and robust in-context planning under stochastic dynamics and partial observability.

Related papers

LatentTrack: Sequential Weight Generation via Latent Filtering [0.0]
LatentTrack (LT) is a sequential neural architecture for online probabilistic prediction under nonstationary dynamics.<n> LT performs causal Bayesian filtering in a low-dimensional latent space and uses a lightweight hypernetwork to generate predictive model parameters at each time step.<n> LT consistently calibrated lower negative log-likelihood and mean squared error than stateful sequential and static uncertainty-aware baselines.
arXiv Detail & Related papers (2026-01-31T02:22:59Z)
Temporal Graph Pattern Machine [17.352525018007473]
Temporal Graph Pattern Machine (TGPM) conceptualizes each interaction as an interaction patch synthesized via temporally-biased random walks.<n>TGPM consistently achieves state-of-the-art performance in both transductive and inductive link prediction.
arXiv Detail & Related papers (2026-01-30T01:46:13Z)
Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
contexts drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns.<n>Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics.<n>We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z)
Parallel Test-Time Scaling for Latent Reasoning Models [58.428340345068214]
Parallel test-time scaling (TTS) is a pivotal approach for enhancing large language models (LLMs)<n>Recent advances in latent reasoning, where intermediate reasoning unfolds in continuous vector spaces, offer a more efficient alternative to explicit Chain-of-Thought.<n>This work enables parallel TTS for latent reasoning models by addressing the above issues.
arXiv Detail & Related papers (2025-10-09T03:33:00Z)
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction [7.918703013303246]
We present Latent Macro Action Planner (L-MAP), which addresses the challenge of learning to make decisions in high-dimensional continuous action spaces.<n>L-MAP learns a set of temporally extended macro-actions through a state-conditional Vector Quantized Variational Autoencoder (VQ-VAE)<n>In offline RL settings, including continuous control tasks, L-MAP efficiently searches over discrete latent actions to yield high expected returns.
arXiv Detail & Related papers (2025-02-28T16:02:23Z)
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series [14.400596021890863]
Many real-world datasets, such as healthcare, climate, and economics, are often collected as irregular time series.<n>We propose the Amortized Control of continuous State Space Model (ACSSM) for continuous dynamical modeling of time series.
arXiv Detail & Related papers (2024-10-08T01:27:46Z)
A Poisson-Gamma Dynamic Factor Model with Time-Varying Transition Dynamics [51.147876395589925]
A non-stationary PGDS is proposed to allow the underlying transition matrices to evolve over time. A fully-conjugate and efficient Gibbs sampler is developed to perform posterior simulation. Experiments show that, in comparison with related models, the proposed non-stationary PGDS achieves improved predictive performance.
arXiv Detail & Related papers (2024-02-26T04:39:01Z)
Continuous Latent Process Flows [47.267251969492484]
Partial observations of continuous time-series dynamics at arbitrary time stamps exist in many disciplines. Fitting this type of data using statistical models with continuous dynamics is not only promising at an intuitive level but also has practical benefits. We tackle these challenges with continuous latent process flows (CLPF), a principled architecture decoding continuous latent processes into continuous observable processes using a time-dependent normalizing flow driven by a differential equation. Our ablation studies demonstrate the effectiveness of our contributions in various inference tasks on irregular time grids.
arXiv Detail & Related papers (2021-06-29T17:16:04Z)
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization [60.73540999409032]
We show that expressive autoregressive dynamics models generate different dimensions of the next state and reward sequentially conditioned on previous dimensions. We also show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer.
arXiv Detail & Related papers (2021-04-28T16:48:44Z)
Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system. We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony. Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.