LLVD: LSTM-based Explicit Motion Modeling in Latent Space for Blind Video Denoising
- URL: http://arxiv.org/abs/2501.05744v1
- Date: Fri, 10 Jan 2025 06:20:27 GMT
- Title: LLVD: LSTM-based Explicit Motion Modeling in Latent Space for Blind Video Denoising
- Authors: Loay Rashid, Siddharth Roheda, Amit Unde
- Abstract summary: This paper introduces a novel algorithm designed for scenarios where noise is introduced during video capture. We propose the Latent space LSTM Video Denoiser (LLVD), an end-to-end blind denoising model. Experiments reveal that LLVD demonstrates excellent performance for both synthetic and captured noise.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video restoration plays a pivotal role in revitalizing degraded video content by rectifying imperfections caused by various degradations introduced during capturing (sensor noise, motion blur, etc.), saving/sharing (compression, resizing, etc.) and editing. This paper introduces a novel algorithm designed for scenarios where noise is introduced during video capture, aiming to enhance the visual quality of videos by reducing unwanted noise artifacts. We propose the Latent space LSTM Video Denoiser (LLVD), an end-to-end blind denoising model. LLVD uniquely combines spatial and temporal feature extraction, employing Long Short Term Memory (LSTM) within the encoded feature domain. This integration of LSTM layers is crucial for maintaining continuity and minimizing flicker in the restored video. Moreover, processing frames in the encoded feature domain significantly reduces computations, resulting in a very lightweight architecture. LLVD's blind nature makes it versatile for real, in-the-wild denoising scenarios where prior information about noise characteristics is not available. Experiments reveal that LLVD demonstrates excellent performance for both synthetic and captured noise. Specifically, LLVD surpasses the current State-Of-The-Art (SOTA) in RAW denoising by 0.3 dB, while also achieving a 59% reduction in computational complexity.
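The abstract describes the core pipeline: encode each frame into a compact feature space, carry temporal state across frames with an LSTM over those encoded features, then decode. As a rough, hedged illustration of that idea (not LLVD's actual architecture: the encoder/decoder stand-ins, dimensions, and random weights below are all hypothetical), a minimal NumPy sketch might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step: gates computed from input x and hidden state h."""
    z = W @ x + U @ h + b                 # stacked pre-activations, shape (4*H,)
    H = h.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))          # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))       # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))     # output gate
    g = np.tanh(z[3*H:])                  # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def denoise_sequence(frames, latent_dim=16):
    """Toy latent-space LSTM pipeline: encode each frame to a latent
    vector, carry temporal state through an LSTM, decode back."""
    T, D = frames.shape
    # Stand-ins for a learned encoder/decoder: fixed random projections.
    enc = rng.standard_normal((latent_dim, D)) / np.sqrt(D)
    dec = rng.standard_normal((D, latent_dim)) / np.sqrt(latent_dim)
    W = rng.standard_normal((4 * latent_dim, latent_dim)) * 0.1
    U = rng.standard_normal((4 * latent_dim, latent_dim)) * 0.1
    b = np.zeros(4 * latent_dim)
    h = np.zeros(latent_dim)
    c = np.zeros(latent_dim)
    out = np.empty_like(frames)
    for t in range(T):
        z = enc @ frames[t]                  # spatial feature extraction
        h, c = lstm_step(z, h, c, W, U, b)   # temporal aggregation in latent space
        out[t] = dec @ h                     # decode back to frame space
    return out

noisy = rng.standard_normal((8, 64))   # 8 frames of 64 "pixels"
restored = denoise_sequence(noisy)
print(restored.shape)
```

Because the recurrent state runs over small latent vectors rather than full-resolution frames, per-frame compute stays low, which is consistent with the lightweight-architecture claim in the abstract.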
Related papers
- Implicit Neural Representation for Video Restoration [4.960738913876514]
We introduce VR-INR, a novel video restoration approach based on Implicit Neural Representations (INRs). VR-INR generalizes effectively to arbitrary, unseen super-resolution scales at test time. It consistently maintains high-quality reconstructions at scales and noise levels unseen during training.
arXiv Detail & Related papers (2025-06-05T18:09:59Z) - Motion-Aware Concept Alignment for Consistent Video Editing [57.08108545219043]
We introduce MoCA-Video (Motion-Aware Concept Alignment in Video), a training-free framework bridging the gap between image-domain semantic mixing and video. Given a generated video and a user-provided reference image, MoCA-Video injects the semantic features of the reference image into a specific object within the video. We evaluate MoCA-Video's performance using standard SSIM, image-level LPIPS, and temporal LPIPS, and introduce a novel metric, CASS (Conceptual Alignment Shift Score), to evaluate the consistency and effectiveness of the visual shifts between the source prompt and the modified video frames.
arXiv Detail & Related papers (2025-06-01T13:28:04Z) - LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation [56.64004196498026]
LightMotion is a light and tuning-free method for simulating camera motion in video generation.
Operating in the latent space, it eliminates additional fine-tuning, inpainting, and depth estimation.
arXiv Detail & Related papers (2025-03-09T08:28:40Z) - Token-Efficient Long Video Understanding for Multimodal LLMs [101.70681093383365]
STORM is a novel architecture incorporating a dedicated temporal encoder between the image encoder and the Video-LLMs.
We show that STORM achieves state-of-the-art results across various long video understanding benchmarks.
arXiv Detail & Related papers (2025-03-06T06:17:38Z) - Spatial Degradation-Aware and Temporal Consistent Diffusion Model for Compressed Video Super-Resolution [25.615935776826596]
Due to storage and bandwidth limitations, videos transmitted over the Internet often exhibit low quality, characterized by low resolution and compression artifacts. Although video super-resolution (VSR) is an efficient video enhancement technique, existing VSR methods focus less on compressed videos. We propose a novel method that exploits the priors of pre-trained diffusion models for compressed VSR.
arXiv Detail & Related papers (2025-02-11T08:57:45Z) - Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation [36.098738197088124]
This work presents a Diffusion Reuse MOtion network to accelerate latent video generation.
Coarse-grained noise in earlier denoising steps has demonstrated high motion consistency across consecutive video frames. Dr. Mo propagates this coarse-grained noise onto the next frame by incorporating carefully designed, lightweight inter-frame motions.
arXiv Detail & Related papers (2024-09-19T07:50:34Z) - Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns neural modules by optimizing over the corrupted test sequence, leveraging the spatio-temporal coherence and internal statistics of the video itself.
arXiv Detail & Related papers (2023-12-13T01:57:11Z) - VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation [88.49030739715701]
This work presents a decomposed diffusion process via resolving the per-frame noise into a base noise that is shared among all frames and a residual noise that varies along the time axis.
Experiments on various datasets confirm that our approach, termed as VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation.
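One way to read VideoFusion's decomposition is that each frame's noise mixes a base component shared by all frames with a per-frame residual. The mixing weight `lam` below is hypothetical (the paper's actual weighting schedule is not given here), but the sketch shows the key property such a split would have: each frame's noise keeps unit variance while adjacent frames become correlated by `lam`.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, lam = 4, 200_000, 0.7   # frames, samples per frame, hypothetical sharing weight

base = rng.standard_normal(N)                         # base noise shared by all frames
resid = rng.standard_normal((T, N))                   # residual noise varying over time
eps = np.sqrt(lam) * base + np.sqrt(1 - lam) * resid  # combined per-frame noise

var0 = eps[0].var()                            # stays ~1: the mix preserves variance
corr01 = np.corrcoef(eps[0], eps[1])[0, 1]     # inter-frame correlation ~lam
print(var0, corr01)
```

Sharing the base component is what couples the frames: the diffusion model then only needs the residual term to express motion along the time axis.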
arXiv Detail & Related papers (2023-03-15T02:16:39Z) - Learning Task-Oriented Flows to Mutually Guide Feature Alignment in Synthesized and Real Video Denoising [137.5080784570804]
Video denoising aims at removing noise from videos to recover clean ones.
Some existing works show that optical flow can help the denoising by exploiting the additional spatial-temporal clues from nearby frames.
We propose a new multi-scale refined optical flow-guided video denoising method, which is more robust to different noise levels.
arXiv Detail & Related papers (2022-08-25T00:09:18Z) - PVDD: A Practical Video Denoising Dataset with Real-World Dynamic Scenes [56.4361151691284]
"Practical Video Denoising dataset" (PVDD) contains 200 noisy-clean dynamic video pairs in both sRGB and RAW format.
Compared with existing datasets consisting of limited motion information, PVDD covers dynamic scenes with varying natural motion.
arXiv Detail & Related papers (2022-07-04T12:30:22Z) - Neural Compression-Based Feature Learning for Video Restoration [29.021502115116736]
This paper proposes learning noise-robust feature representations to help video restoration.
We design a neural compression module to filter the noise and keep the most useful information in features for video restoration.
arXiv Detail & Related papers (2022-03-17T09:59:26Z) - Noisy-LSTM: Improving Temporal Awareness for Video Semantic Segmentation [29.00635219317848]
This paper presents a new model named Noisy-LSTM, which is trainable in an end-to-end manner.
We also present a simple yet effective training strategy, which replaces a frame in video sequence with noises.
arXiv Detail & Related papers (2020-10-19T13:08:15Z) - First image then video: A two-stage network for spatiotemporal video denoising [19.842488445174524]
Video denoising aims to remove noise from noise-corrupted data and thus recover the true underlying signals.
Existing approaches for video denoising tend to suffer from blur artifacts, that is the boundary of a moving object tends to appear blurry.
This paper introduces a first-image-then-video two-stage denoising neural network, consisting of an image denoising module and a temporal video denoising module.
It yields state-of-the-art performances on the video denoising Vimeo90K dataset in terms of both denoising quality and computation.
arXiv Detail & Related papers (2020-01-02T07:21:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.