LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation
- URL: http://arxiv.org/abs/2503.06508v2
- Date: Tue, 11 Mar 2025 02:28:14 GMT
- Title: LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation
- Authors: Quanjian Song, Zhihang Lin, Zhanpeng Zeng, Ziyue Zhang, Liujuan Cao, Rongrong Ji
- Abstract summary: LightMotion is a light and tuning-free method for simulating camera motion in video generation. Operating in the latent space, it eliminates additional fine-tuning, inpainting, and depth estimation.
- Score: 56.64004196498026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing camera motion-controlled video generation methods face computational bottlenecks in fine-tuning and inference. This paper proposes LightMotion, a light and tuning-free method for simulating camera motion in video generation. Operating in the latent space, it eliminates additional fine-tuning, inpainting, and depth estimation, making it more streamlined than existing methods. The contributions of this paper are threefold: (i) A latent-space permutation operation effectively simulates various camera motions such as panning, zooming, and rotation. (ii) A latent-space resampling strategy combines background-aware sampling and cross-frame alignment to accurately fill in new perspectives while maintaining coherence across frames. (iii) An in-depth analysis shows that the permutation and resampling cause an SNR shift in the latent space, leading to poor-quality generation. To address this, we propose latent-space correction, which reintroduces noise during denoising to mitigate the SNR shift and enhance video generation quality. Extensive experiments show that LightMotion outperforms existing methods, both quantitatively and qualitatively.
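To make the latent-space permutation concrete, here is a minimal, hypothetical sketch of how a pan could be simulated by shifting video latents frame by frame. The function name, the (frames, channels, height, width) tensor layout, and the linear shift schedule are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch, assuming a (frames, channels, height, width) latent layout.
# pan_latents, dx/dy, and the linear shift ramp are illustrative assumptions.
import torch

def pan_latents(latents: torch.Tensor, dx: int, dy: int) -> torch.Tensor:
    """Simulate a camera pan by permuting latent positions per frame."""
    frames = latents.shape[0]
    shifted = torch.empty_like(latents)
    for t in range(frames):
        # Ramp the shift linearly across frames so the pan appears smooth.
        sx = round(dx * t / max(frames - 1, 1))
        sy = round(dy * t / max(frames - 1, 1))
        # torch.roll permutes positions along height/width; the wrapped-in
        # border is stale content that a resampling step must refill.
        shifted[t] = torch.roll(latents[t], shifts=(sy, sx), dims=(-2, -1))
    return shifted
```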
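The SNR-shift correction can likewise be pictured as blending fresh Gaussian noise back into the manipulated latent so it matches the noise level the denoising scheduler expects at the current timestep. The sketch below uses the standard forward-diffusion mixing rule with the cumulative signal coefficient alpha_bar_t; this blending rule and the function name are stand-in assumptions, not the paper's exact correction.

```python
import torch

def renoise(latent: torch.Tensor, alpha_bar_t: float) -> torch.Tensor:
    """Reintroduce noise so a permuted/resampled latent regains the
    SNR the diffusion scheduler expects at the current timestep.

    alpha_bar_t: cumulative product of alphas at timestep t, in [0, 1].
    Note: this standard forward-diffusion blend is an illustrative
    stand-in for the paper's latent-space correction, not its formula.
    """
    noise = torch.randn_like(latent)
    # Keep sqrt(alpha_bar) of the signal and add sqrt(1 - alpha_bar) of
    # unit-variance noise, giving SNR = alpha_bar / (1 - alpha_bar).
    return (alpha_bar_t ** 0.5) * latent + ((1.0 - alpha_bar_t) ** 0.5) * noise
```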
Related papers
- MoBGS: Motion Deblurring Dynamic 3D Gaussian Splatting for Blurry Monocular Video [26.468480933928458]
MoBGS reconstructs sharp and high-quality views from blurry monocular videos in an end-to-end manner.
We introduce a Blur-adaptive Latent Camera Estimation (BLCE) method for effective latent camera trajectory estimation.
We also propose a physically-inspired Latent Camera-induced Exposure Estimation (LCEE) method to ensure consistent deblurring of both global camera and local object motion.
arXiv Detail & Related papers (2025-04-21T14:19:19Z) - CamMimic: Zero-Shot Image To Camera Motion Personalized Video Generation Using Diffusion Models [47.65379612084075]
CamMimic is designed to seamlessly transfer the camera motion observed in a given reference video onto any scene of the user's choice.
In the absence of an established metric for assessing camera motion transfer between unrelated scenes, we propose CameraScore.
arXiv Detail & Related papers (2025-04-13T08:04:11Z) - Image Motion Blur Removal in the Temporal Dimension with Video Diffusion Models [3.052019331122618]
We propose a novel single-image deblurring approach that treats motion blur as a temporal averaging phenomenon. Our core innovation lies in leveraging a pre-trained video diffusion transformer model to capture diverse motion dynamics. Empirical results on synthetic and real-world datasets demonstrate that our method outperforms existing techniques in deblurring complex motion blur scenarios.
arXiv Detail & Related papers (2025-01-22T03:01:54Z) - Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement [48.76608212565327]
This paper explores learning-based low-light video enhancement without paired ground truth.
Compared to low-light image enhancement, enhancing low-light videos is more difficult due to the intertwined effects of noise, exposure, and contrast in the spatial domain, jointly with the need for temporal coherence.
We propose the Unrolled Decomposed Unpaired Network (UDU-Net) for enhancing low-light videos by unrolling the optimization functions into a deep network to decompose the signal into spatial and temporal-related factors, which are updated iteratively.
arXiv Detail & Related papers (2024-08-22T11:45:11Z) - Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring [71.60457491155451]
Eliminating image blur produced by various kinds of motion has been a challenging problem.
We propose a novel real-world deblurring filtering model called the Motion-adaptive Separable Collaborative Filter.
Our method provides an effective solution for real-world motion blur removal and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-04-19T19:44:24Z) - Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion [25.54868552979793]
We present a method that adapts to camera motion and allows high-quality scene reconstruction with handheld video data.
Our results with both synthetic and real data demonstrate superior performance in mitigating camera motion over existing methods.
arXiv Detail & Related papers (2024-03-20T06:19:41Z) - SMURF: Continuous Dynamics for Motion-Deblurring Radiance Fields [14.681688453270523]
We propose sequential motion understanding radiance fields (SMURF), a novel approach that employs neural ordinary differential equation (Neural-ODE) to model continuous camera motion.
Our model, rigorously evaluated against benchmark datasets, demonstrates state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2024-03-12T11:32:57Z) - EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer [30.470336098766765]
Video Motion Magnification (VMM) aims to break the resolution limit of human visual perception capability.
This paper proposes a novel dynamic filtering strategy to achieve static-dynamic field adaptive denoising.
Extensive experiments demonstrate that EulerMormer achieves more robust video motion magnification from the Eulerian perspective.
arXiv Detail & Related papers (2023-12-07T09:10:16Z) - DynaMoN: Motion-Aware Fast and Robust Camera Localization for Dynamic Neural Radiance Fields [71.94156412354054]
We propose Motion-Aware Fast and Robust Camera Localization for Dynamic Neural Radiance Fields (DynaMoN).
DynaMoN handles dynamic content for initial camera pose estimation and statics-focused ray sampling for fast and accurate novel-view synthesis.
We extensively evaluate our approach on two real-world dynamic datasets, the TUM RGB-D dataset and the BONN RGB-D Dynamic dataset.
arXiv Detail & Related papers (2023-09-16T08:46:59Z) - LaMD: Latent Motion Diffusion for Image-Conditional Video Generation [63.34574080016687]
The latent motion diffusion (LaMD) framework consists of a motion-decomposed video autoencoder and a diffusion-based motion generator.
LaMD generates high-quality videos on various benchmark datasets, including BAIR, Landscape, NATOPS, MUG and CATER-GEN.
arXiv Detail & Related papers (2023-04-23T10:32:32Z) - Towards Interpretable Video Super-Resolution via Alternating Optimization [115.85296325037565]
We study a practical space-time video super-resolution (STVSR) problem which aims at generating a high-framerate high-resolution sharp video from a low-framerate blurry video.
We propose an interpretable STVSR framework by leveraging both model-based and learning-based methods.
arXiv Detail & Related papers (2022-07-21T21:34:05Z) - Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising [104.59305271099967]
We present a pixel aggregation network and learn the pixel sampling and averaging strategies for image denoising.
We develop a pixel aggregation network for video denoising to sample pixels across the spatial-temporal space.
Our method is able to solve the misalignment issues caused by large motion in dynamic scenes.
arXiv Detail & Related papers (2021-01-26T13:00:46Z) - Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in the Wild [49.672487902268706]
We present a framework that jointly estimates camera temporal alignment and 3D point triangulation.
We reconstruct 3D motion trajectories of human bodies in events captured by multiple uncalibrated and unsynchronized video cameras.
arXiv Detail & Related papers (2020-07-24T23:50:46Z)