Generating time-consistent dynamics with discriminator-guided image diffusion models
- URL: http://arxiv.org/abs/2505.09089v2
- Date: Thu, 15 May 2025 00:55:20 GMT
- Title: Generating time-consistent dynamics with discriminator-guided image diffusion models
- Authors: Philipp Hess, Maximilian Gelbrecht, Christof Schötz, Michael Aich, Yu Huang, Shangshang Yang, Niklas Boers
- Abstract summary: Realistic temporal dynamics are crucial for many video generation, processing and modelling applications. Video diffusion models (VDMs) are the current state-of-the-art method for generating highly realistic dynamics. Here, we propose a time-consistency discriminator that enables pretrained image diffusion models to generate realistic dynamics.
- Score: 2.5592599835023067
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Realistic temporal dynamics are crucial for many video generation, processing and modelling applications, e.g. in computational fluid dynamics, weather prediction, or long-term climate simulations. Video diffusion models (VDMs) are the current state-of-the-art method for generating highly realistic dynamics. However, training VDMs from scratch can be challenging and requires large computational resources, limiting their wider application. Here, we propose a time-consistency discriminator that enables pretrained image diffusion models to generate realistic spatiotemporal dynamics. The discriminator guides the sampling inference process and does not require extensions or finetuning of the image diffusion model. We compare our approach against a VDM trained from scratch on an idealized turbulence simulation and a real-world global precipitation dataset. Our approach performs equally well in terms of temporal consistency, shows improved uncertainty calibration and lower biases compared to the VDM, and achieves stable centennial-scale climate simulations at daily time steps.
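As a loose illustration of the guidance idea in the abstract (not the authors' code), a discriminator gradient can simply be added to the image model's score at every sampling step. The score function, discriminator gradient, and noise schedule below are hand-made stand-ins, assuming a simple Euler-style sampling loop:

```python
import numpy as np

# Toy "pretrained image model" score: pulls samples toward zero.
def image_score(x, sigma):
    return -x / (sigma**2 + 1.0)

# Toy "time-consistency discriminator" gradient: pulls the current frame
# toward the previous frame, encouraging temporal coherence. In the paper
# this comes from a learned discriminator; here it is a hand-made stand-in.
def disc_grad(x, prev):
    return -(x - prev)

def guided_sample(prev_frame, n_steps=50, guidance_scale=0.5, seed=0):
    """Euler-style sampling loop where the discriminator gradient is added
    to the image model's score at every step; no finetuning is needed."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(prev_frame.shape)
    for i in range(n_steps):
        sigma = 1.0 - i / n_steps                      # crude noise schedule
        score = image_score(x, sigma) + guidance_scale * disc_grad(x, prev_frame)
        x = x + 0.1 * score                            # Euler update
    return x
```

With guidance enabled, the sampled frame is drawn toward the previous frame, which is the mechanism the abstract describes for obtaining temporally consistent sequences from a frame-by-frame image model.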
Related papers
- FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation [51.110607281391154]
FlowMo is a training-free guidance method for enhancing motion coherence in text-to-video models. It estimates motion coherence by measuring the patch-wise variance across the temporal dimension and guides the model to reduce this variance dynamically during sampling.
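A minimal sketch of the patch-wise temporal variance measure described in this summary (the patch pooling and the (T, H, W) layout are assumptions, not FlowMo's actual implementation):

```python
import numpy as np

def motion_coherence_penalty(video, patch=4):
    """Patch-wise variance across the temporal dimension, a rough proxy for
    the coherence measure FlowMo guides against. Assumed layout: (T, H, W)."""
    t, h, w = video.shape
    # Pool each frame into non-overlapping patches, then measure how much
    # each patch value fluctuates over time; lower => more coherent motion.
    patches = video.reshape(t, h // patch, patch, w // patch, patch).mean(axis=(2, 4))
    return patches.var(axis=0).mean()
```

A perfectly static clip scores zero, while temporally uncorrelated noise scores high; guidance would push sampling toward lower values of this quantity.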
arXiv Detail & Related papers (2025-06-01T19:55:33Z)
- Generative Pre-trained Autoregressive Diffusion Transformer [54.476056835275415]
GPDiT is a Generative Pre-trained Autoregressive Diffusion Transformer. It unifies the strengths of diffusion and autoregressive modeling for long-range video synthesis. It autoregressively predicts future latent frames using a diffusion loss, enabling natural modeling of motion dynamics.
arXiv Detail & Related papers (2025-05-12T08:32:39Z)
- Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models [71.63194926457119]
We introduce Dynamical Diffusion (DyDiff), a theoretically sound framework that incorporates temporally aware forward and reverse processes. Experiments across scientific spatiotemporal forecasting, video prediction, and time series forecasting demonstrate that Dynamical Diffusion consistently improves performance in temporal predictive tasks.
arXiv Detail & Related papers (2025-03-02T16:10:32Z)
- TAUDiff: Highly efficient kilometer-scale downscaling using generative diffusion models [0.0]
It is crucial to achieve rapid turnaround, dynamical consistency, and accurate spatiotemporal recovery for extreme weather events. We propose an efficient diffusion model, TAUDiff, that combines a deterministic spatiotemporal model for mean field downscaling with a smaller generative diffusion model for recovering the fine-scale features. Our approach can ensure quicker simulation of extreme events necessary for estimating associated risks and economic losses.
arXiv Detail & Related papers (2024-12-18T09:05:19Z)
- ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer [95.80384464922147]
ACDiT is a blockwise Conditional Diffusion Transformer. It offers a flexible interpolation between token-wise autoregression and full-sequence diffusion. We show that ACDiT performs best among all autoregressive baselines on image and video generation tasks.
arXiv Detail & Related papers (2024-12-10T18:13:20Z)
- Trajectory Flow Matching with Applications to Clinical Time Series Modeling [77.58277281319253]
Trajectory Flow Matching (TFM) trains a Neural SDE in a simulation-free manner, bypassing backpropagation through the dynamics. We demonstrate improved performance on three clinical time series datasets in terms of absolute performance and uncertainty prediction.
arXiv Detail & Related papers (2024-10-28T15:54:50Z)
- Sequential Posterior Sampling with Diffusion Models [15.028061496012924]
We propose a novel approach that models the transition dynamics to improve the efficiency of sequential diffusion posterior sampling in conditional image synthesis.
We demonstrate the effectiveness of our approach on a real-world dataset of high frame rate cardiac ultrasound images.
Our method opens up new possibilities for real-time applications of diffusion models in imaging and other domains requiring real-time inference.
arXiv Detail & Related papers (2024-09-09T07:55:59Z)
- Unfolding Time: Generative Modeling for Turbulent Flows in 4D [49.843505326598596]
This work introduces a 4D generative diffusion model and a physics-informed guidance technique that enables the generation of realistic sequences of flow states.
Our findings indicate that the proposed method can successfully sample entire subsequences from the turbulent manifold.
This advancement opens doors for the application of generative modeling in analyzing the temporal evolution of turbulent flows.
arXiv Detail & Related papers (2024-06-17T10:21:01Z)
- DYffusion: A Dynamics-informed Diffusion Model for Spatiotemporal Forecasting [18.86526240105348]
We propose an approach for efficiently training diffusion models for probabilistic forecasting.
We train a time-conditioned interpolator and a forecaster network that mimic the forward and reverse processes of standard diffusion models.
Our approach performs competitively on probabilistic forecasting of complex dynamics in sea surface temperatures, Navier-Stokes flows, and spring mesh systems.
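A heavily simplified structural sketch of the interpolator/forecaster pairing described above (the linear maps are placeholders for the learned networks, and the refinement schedule is an assumption, not the authors' algorithm):

```python
import numpy as np

def interpolator(x0, x1, tau):
    # Time-conditioned interpolation between two snapshots; plays the role
    # of the forward (noising) process of a standard diffusion model.
    return (1.0 - tau) * x0 + tau * x1

def forecaster(x_tau):
    # Stand-in for the learned forecasting network that plays the role of
    # the reverse (denoising) process; here just a fixed linear map.
    A = np.array([[0.95, -0.05], [0.05, 0.95]])
    return A @ x_tau

def sample_forecast(x0, n_refine=5):
    """Iteratively refine a horizon forecast by alternating interpolation
    and forecasting, mirroring a diffusion model's sampling loop."""
    x_h = forecaster(x0)                       # initial rough forecast
    for k in range(1, n_refine + 1):
        tau = k / (n_refine + 1)               # move from t toward t+h
        x_tau = interpolator(x0, x_h, tau)
        x_h = forecaster(x_tau)
    return x_h
```

The point of the design, as the summary states, is that both networks operate in the data's own time coordinate, so no artificial Gaussian noising schedule is required.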
arXiv Detail & Related papers (2023-06-03T02:46:31Z)
- DiffESM: Conditional Emulation of Earth System Models with Diffusion Models [2.1989764549743476]
A key application of Earth System Models (ESMs) is studying extreme weather events, such as heat waves or dry spells.
We show that diffusion models can effectively emulate the trends of ESMs under previously unseen climate scenarios.
arXiv Detail & Related papers (2023-04-23T17:12:33Z)
- Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series [18.885471782270375]
NCDSSM employs auxiliary variables to disentangle recognition from dynamics, thus requiring amortized inference only for the auxiliary variables.
We propose three flexible parameterizations of the latent dynamics and an efficient training objective that marginalizes the dynamic states during inference.
Empirical results on multiple benchmark datasets show improved imputation and forecasting performance of NCDSSM over existing models.
arXiv Detail & Related papers (2023-01-26T18:45:04Z)
- Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning [52.72369034247396]
We propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling.
DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models.
arXiv Detail & Related papers (2022-12-20T13:36:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.