Towards An End-to-End Framework for Flow-Guided Video Inpainting
- URL: http://arxiv.org/abs/2204.02663v2
- Date: Thu, 7 Apr 2022 13:35:40 GMT
- Title: Towards An End-to-End Framework for Flow-Guided Video Inpainting
- Authors: Zhen Li, Cheng-Ze Lu, Jianhua Qin, Chun-Le Guo, Ming-Ming Cheng
- Abstract summary: We propose an End-to-End framework for Flow-Guided Video Inpainting (E$^2$FGVI).
The proposed method outperforms state-of-the-art methods both qualitatively and quantitatively.
- Score: 68.71844500391023
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Optical flow, which captures motion information across frames, is
exploited in recent video inpainting methods by propagating pixels along its
trajectories. However, the hand-crafted flow-based processes in these methods
are applied separately to form the whole inpainting pipeline, so these
methods are less efficient and rely heavily on the intermediate results of
earlier stages. In this paper, we propose an End-to-End framework for
Flow-Guided Video Inpainting (E$^2$FGVI) built on three elaborately designed
trainable modules: flow completion, feature propagation, and content
hallucination. The three modules correspond to the three stages of
previous flow-based methods but can be jointly optimized, leading to a more
efficient and effective inpainting process. Experimental results demonstrate
that the proposed method outperforms state-of-the-art methods both
qualitatively and quantitatively and shows promising efficiency. The code is
available at https://github.com/MCG-NKU/E2FGVI.
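To make the three-stage layout concrete, below is a minimal PyTorch sketch of how such a pipeline can be wired so that gradients reach every stage. All module internals, channel sizes, and names are illustrative assumptions for this summary, not the authors' implementation (see the linked repository for that).
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flow_warp(x, flow):
    """Backward-warp feature map x (B,C,H,W) along flow (B,2,H,W)."""
    B, _, H, W = x.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys)).float().to(x.device)   # (2,H,W) pixel grid
    coords = base.unsqueeze(0) + flow                   # (B,2,H,W)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * coords[:, 0] / (W - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (H - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                # (B,H,W,2)
    return F.grid_sample(x, grid, align_corners=True)

class FlowCompletion(nn.Module):
    """Stage 1: fill in optical flow inside the masked region."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1))

    def forward(self, flow, mask):                      # mask: 1 = hole
        return self.net(torch.cat([flow * (1 - mask), mask], 1))

class FeaturePropagation(nn.Module):
    """Stage 2: warp neighbor features along completed flow and fuse."""
    def __init__(self, ch=32):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, feat, neighbor_feat, flow):
        warped = flow_warp(neighbor_feat, flow)
        return self.fuse(torch.cat([feat, warped], 1))

class ContentHallucination(nn.Module):
    """Stage 3: synthesize content that propagation could not reach."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, feat):
        return feat + self.net(feat)
```
Because the three stages sit in a single computation graph, one reconstruction loss on the output back-propagates into flow completion as well; this joint optimization is what the abstract contrasts with separately applied, hand-crafted stages.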
Related papers
- OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera-orientation-conditioned framework for efficient and multi-view-consistent 3D generation from textual prompts.
Our strategy centers on an explicit camera-orientation-conditioned feature introduced while pre-training a 2D text-to-image diffusion module.
Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also optimizes significantly faster than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z)
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
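For reference, a minimal sketch of the intermediate-flow idea summarized above (a simplified illustration, not the MA-VFI architecture): a small network maps two consecutive frames directly to the flows from the intermediate time step back to each input, and the inputs are then backward-warped and blended.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyIntermediateFlowNet(nn.Module):
    """Maps two frames directly to flows t->0 and t->1 (4 channels)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 4, 3, padding=1))

    def forward(self, f0, f1):
        flows = self.net(torch.cat([f0, f1], 1))
        return flows[:, :2], flows[:, 2:]       # flow_t0, flow_t1

def warp(img, flow):
    """Backward-warp img (B,C,H,W) along flow (B,2,H,W)."""
    B, _, H, W = img.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys)).float().to(img.device).unsqueeze(0)
    c = base + flow
    grid = torch.stack((2 * c[:, 0] / (W - 1) - 1,
                        2 * c[:, 1] / (H - 1) - 1), dim=-1)
    return F.grid_sample(img, grid, align_corners=True)

def interpolate(net, f0, f1):
    flow_t0, flow_t1 = net(f0, f1)
    # Blend the two backward-warped inputs into the middle frame.
    return 0.5 * warp(f0, flow_t0) + 0.5 * warp(f1, flow_t1)
```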
- Online Lane Graph Extraction from Onboard Video [133.68032636906133]
We use the video stream from an onboard camera to extract the surrounding lane graph online.
Using video, instead of a single image, as input brings both benefits and challenges in combining the information from different timesteps.
A single model of this simple yet effective method can process any number of images, including just one, and produce accurate lane graphs.
arXiv Detail & Related papers (2023-04-03T12:36:39Z)
- Error Compensation Framework for Flow-Guided Video Inpainting [36.626793485786095]
We propose an Error Compensation Framework for Flow-Guided Video Inpainting (ECFVI).
Our approach greatly improves the temporal consistency and the visual quality of the completed videos.
arXiv Detail & Related papers (2022-07-21T10:02:57Z)
- Video Frame Interpolation via Structure-Motion based Iterative Fusion [19.499969588931414]
We propose a structure-motion based iterative fusion method for video frame interpolation.
Inspired by the observation that audiences have different visual preferences for foreground and background objects, we are the first to propose using saliency masks in the evaluation process for video frame interpolation.
arXiv Detail & Related papers (2021-05-11T22:11:17Z)
- FineNet: Frame Interpolation and Enhancement for Face Video Deblurring [18.49184807837449]
The aim of this work is to deblur face videos.
We propose a method that tackles this problem from two directions: (1) enhancing the blurry frames, and (2) treating the blurry frames as missing values and estimating them by interpolation.
Experiments on three real and synthetically generated video datasets show that our method outperforms the previous state-of-the-art methods by a large margin in terms of both quantitative and qualitative results.
arXiv Detail & Related papers (2021-03-01T09:47:16Z)
- FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation.
We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
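In contrast to the flow-based entries, FLAVR's flow-agnostic idea can be sketched as follows (layer sizes are illustrative assumptions, not FLAVR's actual architecture): stack the input frames along a time axis and let 3D space-time convolutions predict the middle frame directly.
```python
import torch
import torch.nn as nn

class TinySpaceTimeNet(nn.Module):
    """Predicts an intermediate frame from T input frames, no optical flow."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Conv3d(3, ch, kernel_size=3, padding=1)  # space-time conv
        self.dec = nn.Conv3d(ch, 3, kernel_size=3, padding=1)

    def forward(self, frames):               # frames: (B, 3, T, H, W)
        feat = torch.relu(self.enc(frames))
        out = self.dec(feat)                 # (B, 3, T, H, W)
        return out.mean(dim=2)               # collapse time -> one frame

net = TinySpaceTimeNet()
clip = torch.rand(1, 3, 4, 64, 64)           # four 64x64 RGB frames
mid = net(clip)                              # (1, 3, 64, 64)
```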
- Flow-edge Guided Video Completion [66.49077223104533]
Previous flow completion methods are often unable to retain the sharpness of motion boundaries.
Our method first extracts and completes motion edges, and then uses them to guide piecewise-smooth flow completion with sharp edges.
arXiv Detail & Related papers (2020-09-03T17:59:42Z)
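A toy NumPy illustration of the edge-guided completion described in the last entry (not the authors' algorithm; the function and masks are hypothetical): diffuse known flow values into the missing region, but never average across a motion edge, so the completed flow stays piecewise smooth with sharp boundaries.
```python
import numpy as np

def complete_flow(flow, hole, edges, iters=300):
    """flow: (H,W,2); hole/edges: boolean (H,W) masks (True = hole / edge)."""
    f = np.where(hole[..., None], 0.0, flow)
    for _ in range(iters):
        acc = np.zeros_like(f)
        cnt = np.zeros(f.shape[:2])
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = np.roll(f, (dy, dx), axis=(0, 1))
            valid = np.roll(~edges, (dy, dx), axis=(0, 1))  # skip edge pixels
            acc += neighbor * valid[..., None]
            cnt += valid
        avg = acc / np.maximum(cnt, 1)[..., None]
        # Update only hole pixels; observed flow stays fixed each iteration.
        f = np.where((hole & ~edges)[..., None], avg, f)
    return f
```
The real method solves for the completed flow far more carefully; the point of this sketch is only that stopping diffusion at motion edges is what keeps them sharp.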