Video Demoireing with Relation-Based Temporal Consistency
- URL: http://arxiv.org/abs/2204.02957v1
- Date: Wed, 6 Apr 2022 17:45:38 GMT
- Title: Video Demoireing with Relation-Based Temporal Consistency
- Authors: Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun
Shen, Xiaojuan Qi
- Abstract summary: Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.
We study how to remove such undesirable moire patterns in videos, namely video demoireing.
- Score: 68.20281109859998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moire patterns, appearing as color distortions, severely degrade image and
video qualities when filming a screen with digital cameras. Considering the
increasing demands for capturing videos, we study how to remove such
undesirable moire patterns in videos, namely video demoireing. To this end, we
introduce the first hand-held video demoireing dataset with a dedicated data
collection pipeline to ensure spatial and temporal alignments of captured data.
Further, a baseline video demoireing model with implicit feature space
alignment and selective feature aggregation is developed to leverage
complementary information from nearby frames to improve frame-level video
demoireing. More importantly, we propose a relation-based temporal consistency
loss to encourage the model to learn temporal consistency priors directly from
ground-truth reference videos, which facilitates producing temporally
consistent predictions and effectively maintains frame-level qualities.
Extensive experiments manifest the superiority of our model. Code is available
at https://daipengwa.github.io/VDmoire_ProjectPage/.
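To make the relation-based temporal consistency loss concrete, here is a minimal PyTorch sketch of one plausible reading of it: instead of directly penalizing changes in the prediction, the loss matches the frame-to-frame differences (relations) of the predicted video to those of the ground-truth reference video. The function name, the choice of pixel-wise differences as the relation, and the L1 penalty are illustrative assumptions, not the paper's exact formulation.
```python
import torch
import torch.nn.functional as F


def relation_temporal_consistency_loss(pred, gt):
    """Illustrative relation-based temporal consistency loss (assumed form).

    Matches the temporal relations (differences between adjacent frames) of
    the prediction to those of the ground-truth reference video, so the model
    learns temporal consistency priors from the reference rather than being
    pushed toward over-smoothed, unchanging outputs.

    pred, gt: tensors of shape (B, T, C, H, W).
    """
    # Temporal relations: difference between consecutive frames.
    pred_rel = pred[:, 1:] - pred[:, :-1]   # (B, T-1, C, H, W)
    gt_rel = gt[:, 1:] - gt[:, :-1]
    # Encourage the prediction's temporal behaviour to follow the reference.
    return F.l1_loss(pred_rel, gt_rel)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for the demoired output
    # and the aligned ground-truth reference clip.
    pred = torch.rand(2, 5, 3, 64, 64, requires_grad=True)
    gt = torch.rand(2, 5, 3, 64, 64)
    loss = relation_temporal_consistency_loss(pred, gt)
    loss.backward()
    print(loss.item())
```
In practice such a term would be combined with a frame-level reconstruction loss, which is consistent with the abstract's claim that the method maintains frame-level quality while improving temporal consistency.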
Related papers
- Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors [54.8852848659663]
Buffer Anytime is a framework for estimation of depth and normal maps (which we call geometric buffers) from video.
We demonstrate high-quality video buffer estimation by leveraging single-image priors with temporal consistency constraints.
arXiv Detail & Related papers (2024-11-26T09:28:32Z)
- ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation [81.90265212988844]
We propose a training-free method for generative video models that works in a plug-and-play manner.
We transform a video model into a self-cascaded video diffusion model with the designed hidden state correction modules.
Our training-free method is even comparable to trained models supported by huge compute resources and large-scale datasets.
arXiv Detail & Related papers (2024-06-03T00:31:13Z)
- VidToMe: Video Token Merging for Zero-Shot Video Editing [100.79999871424931]
We propose a novel approach to enhance temporal consistency in generated videos by merging self-attention tokens across frames.
Our method improves temporal coherence and reduces memory consumption in self-attention computations.
arXiv Detail & Related papers (2023-12-17T09:05:56Z)
- Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation [55.36617538438858]
We propose a novel approach that strengthens the interaction between spatial and temporal perceptions.
We curate a large-scale and open-source video dataset called HD-VG-130M.
arXiv Detail & Related papers (2023-05-18T11:06:15Z)
- Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling [28.530765643908083]
We decouple spatial-temporal modeling and integrate an image-language and a video-language model to learn fine-grained visual understanding.
We propose a novel pre-training objective, Temporal Referring Modeling, which requires the model to identify temporal positions of events in video sequences.
Our model outperforms previous work pre-trained on orders of magnitude larger datasets.
arXiv Detail & Related papers (2022-10-08T07:03:31Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Temporally stable video segmentation without video annotations [6.184270985214255]
We introduce a method to adapt still image segmentation models to video in an unsupervised manner.
We verify that the consistency measure is well correlated with human judgement via a user study.
We observe improvements in the generated segmented videos with minimal loss of accuracy.
arXiv Detail & Related papers (2021-10-17T18:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.