FDAN: Flow-guided Deformable Alignment Network for Video
Super-Resolution
- URL: http://arxiv.org/abs/2105.05640v1
- Date: Wed, 12 May 2021 13:18:36 GMT
- Title: FDAN: Flow-guided Deformable Alignment Network for Video
Super-Resolution
- Authors: Jiayi Lin, Yan Huang, Liang Wang
- Abstract summary: Flow-guided Deformable Module (FDM) is proposed to integrate optical flow into deformable convolution.
FDAN reaches the state-of-the-art performance on two benchmark datasets.
- Score: 12.844337773258678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most Video Super-Resolution (VSR) methods enhance a video reference frame by
aligning its neighboring frames and mining information from these frames.
Recently, deformable alignment has drawn extensive attention in the VSR
community for its remarkable performance: it can adaptively align neighboring
frames with the reference one. However, we experimentally find that deformable
alignment methods still suffer under fast motion, due to locally loss-driven
offset prediction and the lack of explicit motion constraints. Hence, we propose
a Matching-based Flow Estimation (MFE) module that conducts global semantic
feature matching and estimates optical flow as a coarse offset for each
location, and a Flow-guided Deformable Module (FDM) that integrates the optical
flow into deformable convolution. FDM first uses the optical flow to warp the
neighboring frames; the warped neighboring frames and the reference frame are
then used to predict a set of fine offsets for each coarse offset. Overall, we
propose an end-to-end deep network called the Flow-guided Deformable Alignment
Network (FDAN), which reaches state-of-the-art performance on two benchmark
datasets while remaining competitive in computation and memory consumption.
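The flow-as-coarse-offset idea behind FDM can be sketched in a few lines. The following is a minimal, illustrative NumPy sketch, not the paper's implementation: for each location p and kernel tap k, the neighboring frame is sampled at p + k + flow(p) + Δk(p), where the optical flow supplies the coarse offset and the per-tap fine offsets Δk refine it. The function names, the tiny bilinear sampler, and the array shapes are all assumptions made for the example.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample a 2-D array at fractional (y, x) with zero padding outside."""
    H, W = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                wgt = (1 - abs(y - yy)) * (1 - abs(x - xx))
                val += wgt * img[yy, xx]
    return val

def flow_guided_deform_sample(neighbor, flow, fine_offsets, kernel=3):
    """Flow-guided deformable sampling of a neighboring frame.

    For each location (i, j) and each kernel tap k, sample the neighbor at
    (i, j) + k + flow(i, j) + fine_offsets(i, j, k): the optical flow is the
    coarse offset, the per-tap fine offsets refine it.

    neighbor:     (H, W) single-channel frame
    flow:         (H, W, 2) coarse (dy, dx) per location
    fine_offsets: (H, W, kernel*kernel, 2) fine (dy, dx) per tap
    returns:      (H, W, kernel*kernel) sampled values, one per tap
    """
    H, W = neighbor.shape
    r = kernel // 2
    taps = [(a, b) for a in range(-r, r + 1) for b in range(-r, r + 1)]
    out = np.zeros((H, W, kernel * kernel))
    for i in range(H):
        for j in range(W):
            for t, (ky, kx) in enumerate(taps):
                dy, dx = fine_offsets[i, j, t]
                y = i + ky + flow[i, j, 0] + dy
                x = j + kx + flow[i, j, 1] + dx
                out[i, j, t] = bilinear_sample(neighbor, y, x)
    return out
```

With an accurate flow and zero fine offsets, the center tap already recovers the aligned neighbor value; the fine offsets give the deformable convolution room to correct residual misalignment around that coarse estimate. In the real network the sampled taps would then be weighted by learned convolution kernels.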
Related papers
- Deformable Feature Alignment and Refinement for Moving Infrared Dim-small Target Detection [17.765101100010224]
We propose a Deformable Feature Alignment and Refinement (DFAR) method based on deformable convolution to explicitly use motion context in both the training and inference stages.
The proposed DFAR method achieves the state-of-the-art performance on two benchmark datasets including DAUB and IRDST.
arXiv Detail & Related papers (2024-07-10T00:42:25Z)
- OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation [55.676358801492114]
We propose OCAI, a method that supports robust frame ambiguities by generating intermediate video frames alongside optical flows in between.
Our evaluations demonstrate superior quality and enhanced optical flow accuracy on established benchmarks such as Sintel and KITTI.
arXiv Detail & Related papers (2024-03-26T20:23:48Z)
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
- IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame Interpolation with Events [14.098949778274733]
Event cameras are ideal for capturing inter-frame dynamics with their extremely high temporal resolution.
We propose an event-and-frame-based video frame interpolation method named IDO-VFI that assigns varying amounts of computation for different sub-regions.
Our proposed method maintains high-quality performance while reducing computation time and computational effort by 10% and 17% respectively on Vimeo90K datasets.
arXiv Detail & Related papers (2023-05-17T13:22:21Z)
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert the given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z)
- DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting [50.17500790309477]
DeMFI-Net is a joint deblurring and multi-frame interpolation framework.
It converts blurry videos of lower-frame-rate to sharp videos at higher-frame-rate.
It achieves state-of-the-art (SOTA) performances for diverse datasets.
arXiv Detail & Related papers (2021-11-19T00:00:15Z)
- Enhanced Correlation Matching based Video Frame Interpolation [5.304928339627251]
We propose a novel framework called the Enhanced Correlation Matching based Video Frame Interpolation Network.
The proposed scheme employs the recurrent pyramid architecture that shares the parameters among each pyramid layer for optical flow estimation.
Experiment results demonstrate that the proposed scheme outperforms previous works on 4K video data and low-resolution benchmark datasets in terms of both objective and subjective quality.
arXiv Detail & Related papers (2021-11-17T02:43:45Z)
- Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate for a given video.
Existing methods typically suffer from difficulties in how to efficiently leverage information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
arXiv Detail & Related papers (2021-10-13T15:21:30Z)
- PDWN: Pyramid Deformable Warping Network for Video Interpolation [11.62213584807003]
We propose a light but effective model called the Pyramid Deformable Warping Network (PDWN).
PDWN uses a pyramid structure to generate DConv offsets of the unknown middle frame with respect to the known frames through coarse-to-fine successive refinements.
Our method achieves better or on-par accuracy compared to state-of-the-art models on multiple datasets.
arXiv Detail & Related papers (2021-04-04T02:08:57Z)
- FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation.
We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.