Overfitting the Data: Compact Neural Video Delivery via Content-aware
Feature Modulation
- URL: http://arxiv.org/abs/2108.08202v1
- Date: Wed, 18 Aug 2021 15:34:11 GMT
- Title: Overfitting the Data: Compact Neural Video Delivery via Content-aware
Feature Modulation
- Authors: Jiaming Liu, Ming Lu, Kaixin Chen, Xiaoqi Li, Shizun Wang, Zhaoqing
Wang, Enhua Wu, Yurong Chen, Chuang Zhang, Ming Wu
- Abstract summary: Existing methods divide a video into chunks and stream low-resolution (LR) video chunks and corresponding content-aware models to the client.
With our method, each video chunk requires less than $1\%$ of the original parameters to be streamed, while achieving even better SR performance.
- Score: 38.889823516049056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Internet video delivery has undergone a tremendous explosion of growth over
the past few years. However, the quality of a video delivery system depends
heavily on the available Internet bandwidth. Recently, Deep Neural Networks
(DNNs) have been utilized to improve the quality of video delivery. These
methods divide a video into chunks and stream low-resolution (LR) video chunks
together with corresponding content-aware models to the client. The client
then runs model inference to super-resolve the LR chunks. Consequently, a
large number of models must be streamed to deliver a single video. In this
paper, we first carefully study the relation between the models of different
chunks, and then design a joint training framework, together with the
Content-aware Feature Modulation (CaFM) layer, to compress these models for
neural video delivery. With our method, each video chunk requires less than
$1\%$ of the original parameters to be streamed, while achieving even better
SR performance. We conduct extensive experiments across various SR backbones,
video durations, and scaling factors to demonstrate the advantages of our
method. Moreover, our method can also be viewed as a new approach to video
coding. Our preliminary experiments achieve better video quality than the
commercial H.264 and H.265 standards under the same storage cost, showing the
great potential of the proposed method. Code is available at:
https://github.com/Neural-video-delivery/CaFM-Pytorch-ICCV2021
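To make the compression scheme concrete, here is a minimal PyTorch sketch of
per-chunk feature modulation in the spirit of CaFM: a shared SR backbone is
trained jointly over all chunks, and each chunk owns only a tiny channel-wise
scale and shift that modulate the shared features. The affine form and all
names are simplifying assumptions for illustration, not the official
CaFM-PyTorch implementation.

```python
# Hedged sketch: per-chunk channel-wise modulation of a shared backbone.
import torch
import torch.nn as nn

class ChunkModulation(nn.Module):
    """Per-chunk channel-wise affine modulation: y = x * scale + shift."""
    def __init__(self, channels: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.shift = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale + self.shift

class ModulatedSRBlock(nn.Module):
    """A shared conv block followed by one tiny modulation branch per chunk."""
    def __init__(self, channels: int, num_chunks: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)  # shared weights
        self.mods = nn.ModuleList(
            ChunkModulation(channels) for _ in range(num_chunks)
        )

    def forward(self, x: torch.Tensor, chunk_id: int) -> torch.Tensor:
        # Shared feature extraction, then chunk-specific modulation.
        return self.mods[chunk_id](torch.relu(self.conv(x)))
```

Under a scheme like this, the shared backbone is delivered once, and each new
chunk streams only its scale/shift values (2·C parameters per modulated
block), which is how the per-chunk payload can fall below $1\%$ of the full
model.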
Related papers
- Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design [18.57172631588624]
We propose a dynamic deep neural network assisted by a content-aware data processing pipeline to reduce the number of models down to one.
Our method achieves better PSNR and real-time performance (33 FPS) on an off-the-shelf mobile phone.
arXiv Detail & Related papers (2024-07-03T05:17:26Z)
- SF-V: Single Forward Video Generation Model [57.292575082410785]
We propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune pre-trained models.
Experiments demonstrate that our method achieves competitive generation quality of synthesized videos with significantly reduced computational overhead.
arXiv Detail & Related papers (2024-06-06T17:58:27Z)
- Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition [124.41196697408627]
We propose content-motion latent diffusion model (CMD), a novel efficient extension of pretrained image diffusion models for video generation.
CMD encodes a video as a combination of a content frame (like an image) and a low-dimensional motion latent representation.
We generate the content frame by fine-tuning a pretrained image diffusion model, and we generate the motion latent representation by training a new lightweight diffusion model.
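The decomposition can be pictured with a toy example. The sketch below is a
rough illustration and not CMD's architecture: it summarizes a clip as a
single content frame (a learned weighted average over time) plus a
low-dimensional motion latent from a small encoder. All module names and
shapes here are assumptions.

```python
# Toy content/motion decomposition: one content frame + one motion latent.
import torch
import torch.nn as nn

class ToyDecomposer(nn.Module):
    def __init__(self, frames: int, latent_dim: int = 64):
        super().__init__()
        # Learnable temporal weights for forming the content frame.
        self.time_weights = nn.Parameter(torch.ones(frames) / frames)
        self.motion_enc = nn.Sequential(nn.Flatten(), nn.LazyLinear(latent_dim))

    def forward(self, video: torch.Tensor):
        # video: (B, T, C, H, W)
        w = torch.softmax(self.time_weights, dim=0).view(1, -1, 1, 1, 1)
        content = (video * w).sum(dim=1)   # (B, C, H, W) content frame
        motion = self.motion_enc(video)    # (B, latent_dim) motion latent
        return content, motion
```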
arXiv Detail & Related papers (2024-03-21T05:48:48Z)
- Reinforcement Learning-based Adaptation and Scheduling Methods for Multi-source DASH [1.1971219484941955]
Dynamic adaptive streaming over HTTP (DASH) has recently been widely used in video streaming.
In multi-source streaming, video chunks may arrive out of order due to different conditions of the network paths.
This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS).
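To see why multi-source chunks arrive out of order, consider a toy greedy
scheduler: whenever a path becomes free it grabs the earliest unscheduled
chunk, so a fast path can deliver later chunks before a slow path finishes an
earlier one. This is only an illustration of the scheduling problem, not the
RLAGS/RLAS algorithms; the function and variable names are made up.

```python
# Toy greedy chunk-to-path scheduler for multi-source streaming.
import heapq

def greedy_schedule(chunk_bits, path_rates_bps):
    """Return {chunk_index: path_index} assignments."""
    # Priority queue of (time the path becomes free, path index).
    free_at = [(0.0, p) for p in range(len(path_rates_bps))]
    heapq.heapify(free_at)
    assignment = {}
    for i, bits in enumerate(chunk_bits):
        t, p = heapq.heappop(free_at)          # earliest-free path
        finish = t + bits / path_rates_bps[p]  # estimated download time
        assignment[i] = p
        heapq.heappush(free_at, (finish, p))
    return assignment

# Two paths with different rates: chunks 1 and 2 finish on the fast path
# before chunk 0 finishes on the slow one -> out-of-order arrival.
print(greedy_schedule([4e6, 4e6, 4e6], [1e6, 4e6]))  # {0: 0, 1: 1, 2: 1}
```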
arXiv Detail & Related papers (2023-07-25T06:47:12Z)
- HNeRV: A Hybrid Neural Representation for Videos [56.492309149698606]
Implicit neural representations store videos as neural networks.
We propose a Hybrid Neural Representation for Videos (HNeRV).
With content-adaptive embeddings and re-designed architecture, HNeRV outperforms implicit methods in video regression tasks.
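The "video as network" idea is easy to sketch. Below is a toy PyTorch
illustration, not the HNeRV architecture: a small decoder maps a learnable
per-frame embedding to pixels, so storing the video amounts to storing the
weights (plus, in the hybrid setting, a tiny content-adaptive embedding per
frame). Sizes and layer choices are arbitrary assumptions.

```python
# Toy implicit/hybrid neural representation of a video.
import torch
import torch.nn as nn

class TinyVideoINR(nn.Module):
    def __init__(self, num_frames: int, embed_dim: int = 16,
                 h: int = 32, w: int = 32):
        super().__init__()
        # One small learnable embedding per frame (content-adaptive in
        # HNeRV; a plain lookup table here for simplicity).
        self.embed = nn.Embedding(num_frames, embed_dim)
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.GELU(),
            nn.Linear(256, 3 * h * w),
        )
        self.h, self.w = h, w

    def forward(self, frame_idx: torch.Tensor) -> torch.Tensor:
        out = self.decoder(self.embed(frame_idx))
        return out.view(-1, 3, self.h, self.w).sigmoid()

# Training regresses decoded frames against the real video, e.g.
# loss = F.mse_loss(model(idx), frames[idx]).
```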
arXiv Detail & Related papers (2023-04-05T17:55:04Z)
- Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long videos and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting [27.302681897961588]
Deep convolutional neural networks (DNNs) are widely used in various fields of computer vision.
We propose a novel method for high-quality and efficient video resolution upscaling tasks.
We deploy our models on an off-the-shelf mobile phone, and experimental results show that our method achieves real-time video super-resolution with high video quality.
arXiv Detail & Related papers (2023-03-15T02:40:02Z)
- MagicVideo: Efficient Video Generation With Latent Diffusion Models [76.95903791630624]
We present an efficient text-to-video generation framework based on latent diffusion models, termed MagicVideo.
Due to a novel and efficient 3D U-Net design and modeling video distributions in a low-dimensional space, MagicVideo can synthesize video clips with 256x256 spatial resolution on a single GPU card.
We conduct extensive experiments and demonstrate that MagicVideo can generate high-quality video clips with either realistic or imaginary content.
arXiv Detail & Related papers (2022-11-20T16:40:31Z)
- Efficient Meta-Tuning for Content-aware Neural Video Delivery [40.3731358963689]
We present Efficient Meta-Tuning (EMT) to reduce the computational cost.
EMT adapts a meta-learned model to the first chunk of the input video.
We propose a novel sampling strategy to extract the most challenging patches from video frames.
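The hardest-patch idea can be sketched directly. The helper below is an
assumption-laden illustration rather than EMT's actual strategy: it scores
non-overlapping patches by how poorly a cheap bicubic upscale reconstructs
them and keeps the top-k for fine-tuning. The function name, scoring rule,
and patch size are all made up.

```python
# Hedged sketch: select the k patches a cheap upscaler handles worst.
import torch
import torch.nn.functional as F

def hardest_patches(lr, hr, scale=2, patch=32, k=64):
    """lr: (1,3,h,w), hr: (1,3,h*scale,w*scale) -> (k,3,patch,patch)."""
    up = F.interpolate(lr, scale_factor=scale, mode="bicubic",
                       align_corners=False)
    err = (up - hr).pow(2).mean(dim=1, keepdim=True)      # per-pixel MSE
    # Average error over non-overlapping patch x patch windows.
    score = F.avg_pool2d(err, kernel_size=patch, stride=patch)
    flat = score.flatten()
    top = flat.topk(min(k, flat.numel())).indices
    n_cols = score.shape[-1]
    rows, cols = top // n_cols, top % n_cols
    return torch.stack([
        hr[0, :, r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
        for r, c in zip(rows.tolist(), cols.tolist())
    ])
```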
arXiv Detail & Related papers (2022-07-20T06:47:10Z)