Multi-Density Attention Network for Loop Filtering in Video Compression
- URL: http://arxiv.org/abs/2104.12865v1
- Date: Thu, 8 Apr 2021 05:46:38 GMT
- Title: Multi-Density Attention Network for Loop Filtering in Video Compression
- Authors: Zhao Wang, Changyue Ma, Yan Ye
- Abstract summary: We propose an online scaling based multi-density attention network for loop filtering in video compression.
Experimental results show that 10.18% bit-rate reduction at the same video quality can be achieved over the latest Versatile Video Coding (VVC) standard.
- Score: 9.322800480045336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video compression is a basic requirement for consumer and professional video
applications alike. Video coding standards such as H.264/AVC and H.265/HEVC are
widely deployed in the market to enable efficient use of bandwidth and storage
for many video applications. To reduce the coding artifacts and improve the
compression efficiency, neural network based loop filtering of the
reconstructed video has been developed in the literature. However, loop
filtering is a challenging task due to the variation in video content and
sampling densities. In this paper, we propose an online scaling based
multi-density attention network for loop filtering in video compression. The
core of our approach lies in several aspects: (a) parallel multi-resolution
convolution streams for extracting multi-density features, (b) a single
attention branch to learn the sample correlations and generate mask maps, (c) a
channel-mutual attention procedure to fuse the data from multiple branches, and
(d) an online scaling technique to further optimize the network output
according to the actual signal. The proposed multi-density attention network
learns rich features from multiple sampling densities and performs robustly on
video content of different resolutions. Moreover, the online scaling process
enhances the signal adaptability of the off-line pre-trained model.
Experimental results show that 10.18% bit-rate reduction at the same video
quality can be achieved over the latest Versatile Video Coding (VVC) standard.
The proposed algorithm outperforms state-of-the-art methods in objective
performance, and the subjective quality improvement is evident in terms of
detail preservation and artifact alleviation.
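The paper itself provides no code, but the online scaling idea in (d) can be illustrated concretely: at the encoder, where the original signal is available, a scalar is fit to the network's correction by least squares and then signaled to the decoder. The sketch below uses placeholder branch outputs and a hypothetical sigmoid mask map (the actual model uses trained multi-resolution convolution streams and an attention branch); only the closed-form scale derivation reflects the general technique.

```python
import numpy as np

np.random.seed(0)

# Toy stand-ins for a single 8x8 block (hypothetical shapes/values):
orig = np.random.rand(8, 8)                  # original (uncompressed) block
recon = orig + 0.1 * np.random.randn(8, 8)   # reconstruction with coding noise

# --- Attention-style fusion (placeholder sketch) -----------------------
# Two "density" branches produce candidate restorations; a mask map from
# the attention branch blends them per sample. Real branches are learned.
branch_hi = recon                                   # full-density branch stand-in
branch_lo = recon.mean() * np.ones_like(recon)      # low-density branch stand-in
mask = 1.0 / (1.0 + np.exp(-(recon - recon.mean())))  # hypothetical mask map
filtered = mask * branch_hi + (1.0 - mask) * branch_lo
residual = filtered - recon                          # network's predicted correction

# --- Online scaling ----------------------------------------------------
# Choose s minimizing ||orig - (recon + s * residual)||^2; the closed-form
# least-squares solution is <target, residual> / <residual, residual>.
target = orig - recon
s = float(np.sum(target * residual) / (np.sum(residual * residual) + 1e-12))
scaled_output = recon + s * residual

mse_plain = float(np.mean((orig - filtered) ** 2))
mse_scaled = float(np.mean((orig - scaled_output) ** 2))
print(f"scale={s:.3f}  mse(filtered)={mse_plain:.5f}  mse(scaled)={mse_scaled:.5f}")
```

Because s is chosen by least squares, the scaled output can never be worse (in MSE against the original) than either the unscaled filter output (s = 1) or the plain reconstruction (s = 0), which is what makes the step a safe signal-adaptive refinement of the off-line trained model.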
Related papers
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [112.44822009714461]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z) - Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z) - Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z) - Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.
The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z) - Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z) - Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on standard databases demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z) - Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [55.088635195893325]
We propose the first quantized representation learning method for cross-view video retrieval, namely Hybrid Contrastive Quantization (HCQ).
HCQ learns both coarse-grained and fine-grained quantizations with transformers, which provide complementary understandings for texts and videos.
Experiments on three Web video benchmark datasets demonstrate that HCQ achieves competitive performance with state-of-the-art non-compressed retrieval methods.
arXiv Detail & Related papers (2022-02-07T18:04:10Z) - Multitask Learning for VVC Quality Enhancement and Super-Resolution [11.446576112498596]
We propose a learning-based solution as a post-processing step to enhance the decoded VVC video quality.
Our method relies on multitask learning to perform both quality enhancement and super-resolution using a single shared network optimized for multiple levels.
arXiv Detail & Related papers (2021-04-16T19:05:26Z) - Super-Resolving Compressed Video in Coding Chain [27.994055823226848]
We present a mixed-resolution coding framework, which cooperates with a reference-based DCNN.
In this novel coding chain, the reference-based DCNN learns the direct mapping from low-resolution (LR) compressed video to their high-resolution (HR) clean version at the decoder side.
arXiv Detail & Related papers (2021-03-26T03:39:54Z) - Efficient Adaptation of Neural Network Filter for Video Compression [10.769305738505071]
We present an efficient finetuning methodology for neural-network filters.
The fine-tuning is performed at encoder side to adapt the neural network to the specific content that is being encoded.
The proposed method achieves much faster adaptation than conventional finetuning approaches.
arXiv Detail & Related papers (2020-07-28T14:24:28Z) - An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal [99.49099501559652]
Video Coding for Machine (VCM) aims to bridge the gap between visual feature compression and classical video coding.
We employ a conditional deep generation network to reconstruct video frames with the guidance of learned motion pattern.
By learning to extract sparse motion pattern via a predictive model, the network elegantly leverages the feature representation to generate the appearance of to-be-coded frames.
arXiv Detail & Related papers (2020-01-09T14:18:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.