FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
- URL: http://arxiv.org/abs/2504.14535v1
- Date: Sun, 20 Apr 2025 08:22:29 GMT
- Title: FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
- Authors: Kuanting Wu, Kei Ota, Asako Kanezaki
- Abstract summary: Video Diffusion Models (VDMs) can generate high-quality videos, but often struggle with producing temporally coherent motion. We propose FlowLoss, which directly compares flow fields extracted from generated and ground-truth videos. Our findings offer practical insights for incorporating motion-based supervision into noise-conditioned generative models.
- Score: 9.469635938429647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video Diffusion Models (VDMs) can generate high-quality videos, but often struggle with producing temporally coherent motion. Optical flow supervision is a promising approach to address this, with prior works commonly employing warping-based strategies that avoid explicit flow matching. In this work, we explore an alternative formulation, FlowLoss, which directly compares flow fields extracted from generated and ground-truth videos. To account for the unreliability of flow estimation under high-noise conditions in diffusion, we propose a noise-aware weighting scheme that modulates the flow loss across denoising steps. Experiments on robotic video datasets suggest that FlowLoss improves motion stability and accelerates convergence in early training stages. Our findings offer practical insights for incorporating motion-based supervision into noise-conditioned generative models.
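The abstract describes two components: a direct distance between optical-flow fields extracted from the generated and ground-truth videos, and a noise-aware weight that suppresses this term at high-noise denoising steps, where flow estimation is unreliable. Below is a minimal PyTorch-style sketch of how such a loss could be wired up; the frozen flow estimator `estimate_flow`, the L1 distance between flow fields, and the weighting by the cumulative signal level `alpha_bar_t` are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch of a flow-conditioned loss with noise-aware weighting.
# `estimate_flow` stands in for a frozen optical-flow network (e.g. RAFT);
# the weighting schedule below is an assumed form, not the paper's.
import torch
import torch.nn.functional as F


def flow_loss(gen_video: torch.Tensor,
              gt_video: torch.Tensor,
              estimate_flow,
              alpha_bar_t: torch.Tensor) -> torch.Tensor:
    """
    gen_video, gt_video: (B, T, C, H, W) videos; gen_video is assumed to be the
        model's current clean-video prediction decoded to pixel space.
    estimate_flow: callable mapping two (B, C, H, W) frames to a (B, 2, H, W) flow field.
    alpha_bar_t: (B,) cumulative signal level of the current diffusion timestep;
        high noise -> weight near 0, low noise -> weight near 1.
    """
    losses = []
    for t in range(gen_video.shape[1] - 1):
        # Flow between consecutive frames of the generated and reference videos.
        flow_gen = estimate_flow(gen_video[:, t], gen_video[:, t + 1])
        flow_gt = estimate_flow(gt_video[:, t], gt_video[:, t + 1])
        losses.append(F.l1_loss(flow_gen, flow_gt))
    per_pair = torch.stack(losses).mean()

    # Noise-aware weighting: down-weight the flow term while the sample is still
    # too noisy for the flow estimator to be reliable (assumed choice: w = alpha_bar_t).
    weight = alpha_bar_t.mean()
    return weight * per_pair
```

In training, this term would typically be added to the standard denoising objective with a small coefficient, so that flow supervision shapes motion without dominating the diffusion loss.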
Related papers
- MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space [40.60429652169086]
Text-conditioned streaming motion generation requires us to predict the next-step human pose based on variable-length historical motions and incoming texts. Existing methods struggle to achieve streaming motion generation, e.g., diffusion models are constrained by pre-defined motion lengths. We propose MotionStreamer, a novel framework that incorporates a continuous causal latent space into a probabilistic autoregressive model.
arXiv Detail & Related papers (2025-03-19T17:32:24Z) - FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems [51.99765487172328]
Posterior sampling for inverse problem solving can be effectively achieved using flows. Flow-Driven Posterior Sampling (FlowDPS) outperforms state-of-the-art alternatives.
arXiv Detail & Related papers (2025-03-11T07:56:14Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step.
To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration.
Our method achieves strong performance on both full-reference and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - Motion-Aware Generative Frame Interpolation [23.380470636851022]
Flow-based frame interpolation methods ensure motion stability through estimated intermediate flow but often introduce severe artifacts in complex motion regions. Recent generative approaches, boosted by large-scale pre-trained video generation models, show promise in handling intricate scenes. We propose Motion-aware Generative frame interpolation (MoG), which synergizes intermediate flow guidance with generative capacities to enhance fidelity.
arXiv Detail & Related papers (2025-01-07T11:03:43Z) - Video Motion Transfer with Diffusion Transformers [82.4796313201512]
We propose DiTFlow, a method for transferring the motion of a reference video to a newly synthesized one. We first process the reference video with a pre-trained DiT to analyze cross-frame attention maps and extract a patch-wise motion signal. We apply our strategy to transformer positional embeddings, granting us a boost in zero-shot motion transfer capabilities.
arXiv Detail & Related papers (2024-12-10T18:59:58Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improves sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, achieving a speedup in computation compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - Removing Structured Noise with Diffusion Models [13.50969999636388]
We show that the powerful paradigm of posterior sampling with diffusion models can be extended to include rich, structured, noise models. We demonstrate strong performance gains across various inverse problems with structured noise, outperforming competitive baselines. This opens up new opportunities and relevant practical applications of diffusion modeling for inverse problems in the context of non-Gaussian measurement models.
arXiv Detail & Related papers (2023-01-20T23:42:25Z) - Learning Task-Oriented Flows to Mutually Guide Feature Alignment in Synthesized and Real Video Denoising [137.5080784570804]
Video denoising aims at removing noise from videos to recover clean ones.
Some existing works show that optical flow can aid denoising by exploiting additional spatio-temporal cues from nearby frames.
We propose a new multi-scale refined optical flow-guided video denoising method, which is more robust to different noise levels.
arXiv Detail & Related papers (2022-08-25T00:09:18Z) - Self-Supervised Learning of Non-Rigid Residual Flow and Ego-Motion [63.18340058854517]
We present an alternative method for end-to-end scene flow learning by joint estimation of non-rigid residual flow and ego-motion flow for dynamic 3D scenes.
We extend the supervised framework with self-supervisory signals based on the temporal consistency property of a point cloud sequence.
arXiv Detail & Related papers (2020-09-22T11:39:19Z)