PS-NeRV: Patch-wise Stylized Neural Representations for Videos
- URL: http://arxiv.org/abs/2208.03742v1
- Date: Sun, 7 Aug 2022 14:45:30 GMT
- Title: PS-NeRV: Patch-wise Stylized Neural Representations for Videos
- Authors: Yunpeng Bai, Chao Dong, Cairong Wang
- Abstract summary: PS-NeRV represents videos as a function of patches and the corresponding patch coordinate.
It naturally inherits the advantages of image-wise methods, and achieves excellent reconstruction performance with fast decoding speed.
- Score: 13.14511356472246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study how to represent a video with implicit neural representations
(INRs). Classical INR methods generally utilize MLPs to map input coordinates
to output pixels, while some recent works have tried to reconstruct the whole
image directly with CNNs. We argue, however, that neither the pixel-wise nor
the image-wise strategy is well suited to video data. Instead, we propose a
patch-wise solution, PS-NeRV, which represents videos as a function of patches
and the corresponding patch coordinates. It naturally inherits the advantages
of image-wise methods and achieves excellent reconstruction performance with
fast decoding speed. The whole method includes conventional modules, like
positional embedding, MLPs and CNNs, while also introducing AdaIN to enhance intermediate
features. These simple yet essential changes could help the network easily fit
high-frequency details. Extensive experiments have demonstrated its
effectiveness in several video-related tasks, such as video compression and
video inpainting.
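The two building blocks the abstract singles out, positional embedding of the patch coordinate and AdaIN-modulated intermediate features, can be sketched in simplified NumPy form. The function names, shapes, and frequency schedule below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def positional_embedding(coord, num_freqs=4):
    """Fourier-feature embedding of a coordinate in [0, 1], as commonly
    used by INR methods (a generic sketch, not PS-NeRV's exact embedding)."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    angles = np.asarray(coord).reshape(-1, 1) * freqs   # (N, num_freqs)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def adain(content, style_mean, style_std, eps=1e-5):
    """Adaptive Instance Normalization on a (C, H, W) feature map:
    normalize each channel, then rescale it with externally predicted
    (here: coordinate-conditioned) statistics."""
    mu = content.mean(axis=(1, 2), keepdims=True)
    sigma = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mu) / (sigma + eps)
    return style_std.reshape(-1, 1, 1) * normalized + style_mean.reshape(-1, 1, 1)

# Toy usage: embed a patch coordinate and modulate some fake features.
emb = positional_embedding(0.25)                 # (1, 2 * num_freqs) features
features = np.random.default_rng(0).normal(size=(3, 8, 8))
styled = adain(features, style_mean=np.zeros(3), style_std=np.ones(3))
```

AdaIN replaces each channel's mean and variance with externally supplied ones, which is the mechanism the abstract credits with helping the network fit high-frequency details.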
Related papers
- Progressive Fourier Neural Representation for Sequential Video
Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown significant performance in natural language processing.
In this paper, we design a novel attention mechanism that scales linearly with resolution, derived via a Taylor expansion, and based on this attention a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- HNeRV: A Hybrid Neural Representation for Videos [56.492309149698606]
Implicit neural representations store videos as neural networks.
We propose a Hybrid Neural Representation for Videos (HNeRV).
With content-adaptive embeddings and re-designed architecture, HNeRV outperforms implicit methods in video regression tasks.
arXiv Detail & Related papers (2023-04-05T17:55:04Z)
- Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long videos and/or large numbers of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos [5.958701846880935]
We propose FFNeRV, a novel method for incorporating flow information into frame-wise representations to exploit the temporal redundancy across the frames in videos.
With model compression techniques, FFNeRV outperforms widely-used standard video codecs (H.264 and HEVC) and performs on par with state-of-the-art video compression algorithms.
arXiv Detail & Related papers (2022-12-23T12:51:42Z)
- Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (less than 5 minutes) but also improves encoding quality from 34.07 to 34.57 dB (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z)
- NeRV: Neural Representations for Videos [36.00198388959609]
We propose a novel neural representation for videos (NeRV) which encodes videos in neural networks.
NeRV simply fits a neural network to video frames, and the decoding process is a simple feedforward operation.
With such a representation, we can treat videos as neural networks, simplifying several video-related tasks.
arXiv Detail & Related papers (2021-10-26T17:56:23Z)
- COIN: COmpression with Implicit Neural representations [64.02694714768691]
We propose a new simple approach for image compression.
Instead of storing the RGB values for each pixel of an image, we store the weights of a neural network overfitted to the image.
arXiv Detail & Related papers (2021-03-03T10:58:39Z)
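The COIN idea above (store the weights of a network overfitted to the image rather than the pixels themselves) can be illustrated in miniature. As a hedged sketch, the trained MLP is replaced here by a fixed random sine-feature layer with a closed-form linear readout, so the "overfitting" step is solved exactly rather than by gradient descent; all sizes and the random 4x4 target image are illustrative:

```python
import numpy as np

# COIN in miniature: store network weights instead of pixel values,
# then reconstruct the image by evaluating the network at every
# (y, x) coordinate. The random 4x4 "image" is a stand-in target.
rng = np.random.default_rng(0)
H = W = 4
image = rng.random((H, W))

# Normalized pixel-coordinate grid, one (y, x) row per pixel.
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
coords = np.stack([ys.ravel() / (H - 1), xs.ravel() / (W - 1)], axis=1)

# Fixed random sine-feature layer (loosely SIREN-flavored), plus a
# linear readout solved in closed form so the fit is exact.
W1 = rng.normal(0.0, 3.0, (2, 64))
b1 = rng.uniform(0.0, 2 * np.pi, 64)
features = np.sin(coords @ W1 + b1)              # (16, 64) activations
w2, *_ = np.linalg.lstsq(features, image.ravel(), rcond=None)

# "Decompression": the image is recovered from (W1, b1, w2) alone.
decoded = (features @ w2).reshape(H, W)
mse = float(np.mean((decoded - image) ** 2))     # near zero after overfitting
```

With 64 features and only 16 pixels, the least-squares readout interpolates the image exactly, mirroring the overfitting regime COIN operates in; the real method additionally quantizes the weights to obtain compression.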
This list is automatically generated from the titles and abstracts of the papers in this site.