Video Demoireing with Relation-Based Temporal Consistency
- URL: http://arxiv.org/abs/2204.02957v1
- Date: Wed, 6 Apr 2022 17:45:38 GMT
- Title: Video Demoireing with Relation-Based Temporal Consistency
- Authors: Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun
Shen, Xiaojuan Qi
- Abstract summary: Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.
We study how to remove such undesirable moire patterns in videos, namely video demoireing.
- Score: 68.20281109859998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moire patterns, appearing as color distortions, severely degrade image and
video qualities when filming a screen with digital cameras. Considering the
increasing demands for capturing videos, we study how to remove such
undesirable moire patterns in videos, namely video demoireing. To this end, we
introduce the first hand-held video demoireing dataset with a dedicated data
collection pipeline to ensure spatial and temporal alignments of captured data.
Further, a baseline video demoireing model with implicit feature space
alignment and selective feature aggregation is developed to leverage
complementary information from nearby frames to improve frame-level video
demoireing. More importantly, we propose a relation-based temporal consistency
loss to encourage the model to learn temporal consistency priors directly from
ground-truth reference videos, which facilitates producing temporally
consistent predictions and effectively maintains frame-level qualities.
Extensive experiments manifest the superiority of our model. Code is available
at https://daipengwa.github.io/VDmoire_ProjectPage/.
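To make the central idea more concrete, below is a minimal, hypothetical PyTorch sketch of a relation-based temporal consistency loss as described in the abstract; it is not the authors' released code. It assumes clips are given as (B, T, C, H, W) tensors, treats frame-to-frame differences as the "relation", and matches predicted relations to ground-truth relations. The function name and the loss weight in the usage note are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def relation_temporal_consistency_loss(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of a relation-based temporal consistency loss.

    pred, gt: (B, T, C, H, W) predicted and ground-truth video clips.
    The "relation" is taken here to be the difference between adjacent
    frames, so the loss matches how the prediction changes over time to
    how the ground truth changes over time, instead of smoothing the
    prediction toward its own previous frame.
    """
    # Temporal relations: frame-to-frame differences for prediction and ground truth.
    pred_rel = pred[:, 1:] - pred[:, :-1]
    gt_rel = gt[:, 1:] - gt[:, :-1]
    # Penalize mismatch between predicted and ground-truth temporal relations.
    return F.l1_loss(pred_rel, gt_rel)


# Example usage (shapes only; the demoireing model itself is out of scope here):
# pred = model(moire_frames)                # (B, T, 3, H, W)
# loss = F.l1_loss(pred, gt) + 0.5 * relation_temporal_consistency_loss(pred, gt)
```

Because the target of the loss is the ground-truth temporal relation rather than a smoothed version of the prediction, it can encourage temporal consistency without trading away frame-level quality, which matches the motivation stated in the abstract.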
Related papers
- ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation [81.90265212988844]
We propose a training-free, plug-and-play method for generative video models.
We transform a video model into a self-cascaded video diffusion model with designed hidden-state correction modules.
Our training-free method is comparable to trained models backed by large compute resources and large-scale datasets.
arXiv Detail & Related papers (2024-06-03T00:31:13Z)
- VidToMe: Video Token Merging for Zero-Shot Video Editing [100.79999871424931]
We propose a novel approach to enhance temporal consistency in generated videos by merging self-attention tokens across frames.
Our method improves temporal coherence and reduces memory consumption in self-attention computations.
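As a rough, generic illustration of merging self-attention tokens across frames (not VidToMe's actual algorithm; the function name, similarity threshold, and tensor layout are assumptions), a PyTorch sketch might look like this:

```python
import torch
import torch.nn.functional as F

def merge_tokens_across_frames(cur: torch.Tensor, ref: torch.Tensor, thresh: float = 0.9) -> torch.Tensor:
    """Generic sketch of cross-frame token merging.

    cur, ref: (N, D) self-attention tokens of the current and a reference frame.
    Current-frame tokens that are highly similar to a reference token are
    replaced by that reference token, so both frames share identical tokens,
    which reduces redundancy and keeps their attention outputs consistent.
    """
    cur_n = F.normalize(cur, dim=-1)
    ref_n = F.normalize(ref, dim=-1)
    sim = cur_n @ ref_n.t()                  # (N, N) cosine similarities
    best_sim, best_idx = sim.max(dim=-1)     # closest reference token per current token
    merged = cur.clone()
    mask = best_sim > thresh                 # merge only sufficiently similar tokens
    merged[mask] = ref[best_idx[mask]]
    return merged
```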
arXiv Detail & Related papers (2023-12-17T09:05:56Z)
- Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation [55.36617538438858]
We propose a novel approach that strengthens the interaction between spatial and temporal perceptions.
We curate a large-scale and open-source video dataset called HD-VG-130M.
arXiv Detail & Related papers (2023-05-18T11:06:15Z)
- Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling [28.530765643908083]
We decouple spatial-temporal modeling and integrate an image- and a video-language model to learn fine-grained visual understanding.
We propose a novel pre-training objective, Temporal Referring Modeling, which requires the model to identify temporal positions of events in video sequences.
Our model outperforms previous work pre-trained on orders of magnitude larger datasets.
arXiv Detail & Related papers (2022-10-08T07:03:31Z)
- Generating Long Videos of Dynamic Scenes [66.56925105992472]
We present a video generation model that reproduces object motion, changes in camera viewpoint, and new content that arises over time.
A common failure case is for content to never change due to over-reliance on inductive biases to provide temporal consistency.
arXiv Detail & Related papers (2022-06-07T16:29:51Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation and performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Temporally stable video segmentation without video annotations [6.184270985214255]
We introduce a method to adapt still image segmentation models to video in an unsupervised manner.
We verify that the consistency measure is well correlated with human judgement via a user study.
We observe improvements in the generated segmented videos with minimal loss of accuracy.
arXiv Detail & Related papers (2021-10-17T18:59:11Z)