Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer
- URL: http://arxiv.org/abs/2512.04282v1
- Date: Wed, 03 Dec 2025 21:47:30 GMT
- Title: Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer
- Authors: Tasmiah Haque, Srinjoy Das,
- Abstract summary: Real-time video motion transfer applications such as immersive gaming require accurate yet diverse future predictions to support realistic synthesis and robust downstream decision making under uncertainty.<n>To improve the diversity of such sequential forecasts we propose a novel inference-time refinement technique that combines Gated Recurrent Unit-Normalizing Flows (GRU-NF) with sampling methods.<n>We show that our inference framework, Gated Recurrent Unit- Normalizing Flows (GRU-SNF) outperforms GRU-NF in generating diverse outputs without sacrificing accuracy, even under longer prediction horizons.
- Score: 0.7161783472741748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time video motion transfer applications such as immersive gaming and vision-based anomaly detection require accurate yet diverse future predictions to support realistic synthesis and robust downstream decision making under uncertainty. To improve the diversity of such sequential forecasts we propose a novel inference-time refinement technique that combines Gated Recurrent Unit-Normalizing Flows (GRU-NF) with stochastic sampling methods. While GRU-NF can capture multimodal distributions through its integration of normalizing flows within a temporal forecasting framework, its deterministic transformation structure can limit expressivity. To address this, inspired by Stochastic Normalizing Flows (SNF), we introduce Markov Chain Monte Carlo (MCMC) steps during GRU-NF inference, enabling the model to explore a richer output space and better approximate the true data distribution without retraining. We validate our approach in a keypoint-based video motion transfer pipeline, where capturing temporally coherent and perceptually diverse future trajectories is essential for realistic samples and low bandwidth communication. Experiments show that our inference framework, Gated Recurrent Unit- Stochastic Normalizing Flows (GRU-SNF) outperforms GRU-NF in generating diverse outputs without sacrificing accuracy, even under longer prediction horizons. By injecting stochasticity during inference, our approach captures multimodal behavior more effectively. These results highlight the potential of integrating stochastic dynamics with flow-based sequence models for generative time series forecasting.
Related papers
- Accelerated Sequential Flow Matching: A Bayesian Filtering Perspective [16.29333060724397]
We introduce Sequential Flow Matching, a principled framework grounded in Bayesian filtering.<n>By treating streaming inference as learning a probability flow that transports the predictive distribution from one time step to the next, our approach naturally aligns with the structure of Bayesian belief updates.<n>Our method achieves performance competitive with full-step diffusion while requiring only one or very few sampling steps, therefore with faster sampling.
arXiv Detail & Related papers (2026-02-05T05:37:14Z) - TimeGMM: Single-Pass Probabilistic Forecasting via Adaptive Gaussian Mixture Models with Reversible Normalization [23.69545884322676]
TimeGMM is a novel framework that captures complex future distributions in a single forward pass.<n>It consistently outperforms state-of-the-art methods, achieving maximum improvements of 22.48% in CRPS and 21.23% in NMAE.
arXiv Detail & Related papers (2026-01-18T07:02:13Z) - SimDiff: Simpler Yet Better Diffusion Model for Time Series Point Forecasting [8.141505251306622]
Diffusion models have recently shown promise in time series forecasting.<n>They often fail to achieve state-of-the-art point estimation performance.<n>We propose SimDiff, a single-stage, end-to-end framework for point estimation.
arXiv Detail & Related papers (2025-11-24T16:09:55Z) - Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models.<n>Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement.<n>We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z) - Consistent World Models via Foresight Diffusion [56.45012929930605]
We argue that a key bottleneck in learning consistent diffusion-based world models lies in the suboptimal predictive ability.<n>We propose Foresight Diffusion (ForeDiff), a diffusion-based world modeling framework that enhances consistency by decoupling condition understanding from target denoising.
arXiv Detail & Related papers (2025-05-22T10:01:59Z) - Probabilistic Forecasting via Autoregressive Flow Matching [1.5467259918426441]
FlowTime is a generative model for probabilistic forecasting of timeseries data.<n>We decompose the joint distribution of future observations into a sequence of conditional densities, each modeled via a shared flow.<n>We demonstrate the effectiveness of FlowTime on multiple dynamical systems and real-world forecasting tasks.
arXiv Detail & Related papers (2025-03-13T13:54:24Z) - Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a nuanced-temporal modeling that estimates the throughput of transportation services like buses, taxis and ride-driven models.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML)
We develop a atized physics-guided network (PN), and propose a data-aware framework Physics-guided Active Sample Reweighting (P-GASR)
arXiv Detail & Related papers (2024-07-18T15:44:23Z) - GDTS: Goal-Guided Diffusion Model with Tree Sampling for Multi-Modal Pedestrian Trajectory Prediction [15.731398013255179]
We propose a novel Goal-Guided Diffusion Model with Tree Sampling for multi-modal trajectory prediction.<n>A two-stage tree sampling algorithm is presented, which leverages common features to reduce the inference time and improve accuracy for multi-modal prediction.<n> Experimental results demonstrate that our proposed framework achieves comparable state-of-the-art performance with real-time inference speed in public datasets.
arXiv Detail & Related papers (2023-11-25T03:55:06Z) - Generative Time Series Forecasting with Diffusion, Denoise, and
Disentanglement [51.55157852647306]
Time series forecasting has been a widely explored task of great importance in many applications.
It is common that real-world time series data are recorded in a short time period, which results in a big gap between the deep model and the limited and noisy time series.
We propose to address the time series forecasting problem with generative modeling and propose a bidirectional variational auto-encoder equipped with diffusion, denoise, and disentanglement.
arXiv Detail & Related papers (2023-01-08T12:20:46Z) - Flow-based Spatio-Temporal Structured Prediction of Motion Dynamics [21.24885597341643]
Conditional Flows (CNFs) are flexible generative models capable of representing complicated distributions with high dimensionality and interdimensional correlations.
We propose MotionFlow as a novel approach that autoregressively normalizes the output on the temporal input features.
We apply our method to different tasks, including prediction, motion prediction time series forecasting, and binary segmentation.
arXiv Detail & Related papers (2021-04-09T14:30:35Z) - Stochastically forced ensemble dynamic mode decomposition for
forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z) - Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi supervised learning with normalizing flows.
We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.