Spatial-Temporal Residual Aggregation for High Resolution Video
Inpainting
- URL: http://arxiv.org/abs/2111.03574v1
- Date: Fri, 5 Nov 2021 15:50:31 GMT
- Title: Spatial-Temporal Residual Aggregation for High Resolution Video
Inpainting
- Authors: Vishnu Sanjay Ramiya Srinivasan, Rui Ma, Qiang Tang, Zili Yi, Zhan Xu
- Abstract summary: Recent learning-based inpainting algorithms have achieved compelling results for completing missing regions after removing undesired objects in videos.
We propose STRA-Net, a novel spatial-temporal residual aggregation framework for high resolution video inpainting.
Both the quantitative and qualitative evaluations show that we can produce more temporally coherent and visually appealing results than state-of-the-art methods on inpainting high resolution videos.
- Score: 14.035620730770528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent learning-based inpainting algorithms have achieved compelling results
for completing missing regions after removing undesired objects in videos. To
maintain the temporal consistency among the frames, 3D spatial and temporal
operations are often heavily used in the deep networks. However, these methods
usually suffer from memory constraints and can only handle low resolution
videos. We propose STRA-Net, a novel spatial-temporal residual aggregation
framework for high resolution video inpainting. The key idea is to first learn
and apply a spatial and temporal inpainting network on the downsampled low
resolution videos. Then, we refine the low resolution results by aggregating
the learned spatial and temporal image residuals (details) to the upsampled
inpainted frames. Both the quantitative and qualitative evaluations show that
we can produce more temporally coherent and visually appealing results than the
state-of-the-art methods on inpainting high resolution videos.
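As a rough illustration of the coarse-to-fine pipeline described in the abstract, the Python sketch below traces the data flow only: downsample the masked frames, inpaint at low resolution, upsample the coarse result, and add back predicted residual details. The function names, the box-filter downsampling, and the placeholder inpainting and residual steps are illustrative assumptions, not the paper's networks; in particular, the paper's residual aggregation also draws temporal residuals from neighbouring frames, which this sketch omits.

import numpy as np

def downsample(img: np.ndarray, factor: int = 4) -> np.ndarray:
    # Box-filter downsampling; assumes H and W are divisible by `factor`.
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upsample(img: np.ndarray, factor: int = 4) -> np.ndarray:
    # Nearest-neighbour upsampling back to the original resolution.
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def lr_inpaint(frames_lr, masks_lr):
    # Placeholder for the learned spatial-temporal inpainting network:
    # here holes are simply filled with the per-frame mean colour.
    return [np.where(m[..., None], f.mean(axis=(0, 1)), f)
            for f, m in zip(frames_lr, masks_lr)]

def predict_residuals(frames_hr, frames_up, masks):
    # Placeholder for residual prediction: copy high-frequency details
    # (HR frame minus upsampled frame) only where pixels are known.
    return [np.where(m[..., None], 0.0, f_hr - f_up)
            for f_hr, f_up, m in zip(frames_hr, frames_up, masks)]

def coarse_to_fine_inpaint(frames, masks, factor: int = 4):
    # frames: list of float HxWx3 arrays; masks: list of boolean HxW arrays
    # where True marks missing pixels. H and W must be divisible by `factor`.
    frames_lr = [downsample(f, factor) for f in frames]
    masks_lr = [downsample(m[..., None].astype(float), factor)[..., 0] > 0.5
                for m in masks]
    coarse = lr_inpaint(frames_lr, masks_lr)                  # low-res completion
    upsampled = [upsample(f, factor) for f in coarse]         # back to full size
    residuals = predict_residuals(frames, upsampled, masks)   # detail layer
    return [u + r for u, r in zip(upsampled, residuals)]      # refined HR frames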
Related papers
- Towards Interpretable Video Super-Resolution via Alternating
Optimization [115.85296325037565]
We study a practical space-time video super-resolution (STVSR) problem which aims at generating a high-framerate, high-resolution sharp video from a low-framerate, low-resolution blurry video.
We propose an interpretable STVSR framework by leveraging both model-based and learning-based methods.
arXiv Detail & Related papers (2022-07-21T21:34:05Z)
- Feature Refinement to Improve High Resolution Image Inpainting [1.4824891788575418]
Inpainting networks are often unable to generate globally coherent structures at resolutions higher than their training set.
We optimize the intermediate feature maps of a network by minimizing a multiscale consistency loss at inference.
This runtime optimization improves the inpainting results and establishes a new state of the art for high resolution inpainting.
arXiv Detail & Related papers (2022-06-27T21:59:12Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous
Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performance with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
- Learning Spatio-Temporal Downsampling for Effective Video Upscaling [20.07194339353278]
In this paper, we aim to solve the space-time aliasing problem by learning spatio-temporal downsampling and upsampling.
Our framework enables a variety of applications, including arbitrary video resampling, blurry frame reconstruction, and efficient video storage.
arXiv Detail & Related papers (2022-03-15T17:59:00Z)
- Video Salient Object Detection via Contrastive Features and Attention
Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation and performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Coarse-Fine Networks for Temporal Activity Detection in Videos [45.03545172714305]
We introduce 'Coarse-Fine Networks', a two-stream architecture which benefits from different abstractions of temporal resolution to learn better video representations for long-term motion.
We show that our method can outperform the state of the art for action detection on public datasets with a significantly reduced compute and memory footprint.
arXiv Detail & Related papers (2021-03-01T20:48:01Z)
- Short-Term and Long-Term Context Aggregation Network for Video
Inpainting [126.06302824297948]
Video inpainting aims to restore missing regions of a video and has many applications such as video editing and object removal.
We present a novel context aggregation network to effectively exploit both short-term and long-term frame information for video inpainting.
Experiments show that it outperforms state-of-the-art methods with better inpainting results and fast inpainting speed.
arXiv Detail & Related papers (2020-09-12T03:50:56Z)
- Neural Sparse Voxel Fields [151.20366604586403]
We introduce Neural Sparse Voxel Fields (NSVF), a new neural scene representation for fast and high-quality free-viewpoint rendering.
NSVF defines a set of voxel-bounded implicit fields organized in a sparse voxel octree to model local properties in each cell.
Our method is typically over 10 times faster than the state-of-the-art (namely, NeRF (Mildenhall et al., 2020)) at inference time while achieving higher quality results.
arXiv Detail & Related papers (2020-07-22T17:51:31Z)
- Learning Joint Spatial-Temporal Transformations for Video Inpainting [58.939131620135235]
We propose to learn a joint Spatial-Temporal Transformer Network (STTN) for video inpainting.
We simultaneously fill missing regions in all input frames by self-attention, and propose to optimize STTN by a spatial-temporal adversarial loss (a rough sketch of this joint filling step appears after this entry).
arXiv Detail & Related papers (2020-07-20T16:35:48Z)
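To make the joint filling idea in the STTN entry above more concrete, here is a minimal, hypothetical sketch: patches from every frame attend only to visible patches across the whole clip, so a hole can borrow content from spatial neighbours and from other frames alike. The single attention head, the absence of learned query/key/value projections, and the function names are simplifying assumptions; the actual STTN is a trained transformer optimized with an additional spatial-temporal adversarial loss.

import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fill_by_self_attention(patch_feats: np.ndarray, hole_mask: np.ndarray) -> np.ndarray:
    # patch_feats: (num_frames * num_patches, dim) features of all patches.
    # hole_mask: boolean vector marking patches inside missing regions.
    # Missing patches are replaced by attention-weighted mixtures of the
    # visible patches gathered from all frames of the clip.
    q = patch_feats                              # queries: every patch
    k = v = patch_feats[~hole_mask]              # keys/values: visible patches only
    attn = softmax(q @ k.T / np.sqrt(patch_feats.shape[1]))
    filled = attn @ v                            # attention-weighted completion
    out = patch_feats.copy()
    out[hole_mask] = filled[hole_mask]           # overwrite only the holes
    return out

# Toy usage: 8 frames x 16 patches with 32-dim features, first 10 patches missing.
feats = np.random.randn(8 * 16, 32)
mask = np.zeros(8 * 16, dtype=bool)
mask[:10] = True
completed = fill_by_self_attention(feats, mask)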