Multi-Density Attention Network for Loop Filtering in Video Compression
- URL: http://arxiv.org/abs/2104.12865v1
- Date: Thu, 8 Apr 2021 05:46:38 GMT
- Title: Multi-Density Attention Network for Loop Filtering in Video Compression
- Authors: Zhao Wang, Changyue Ma, Yan Ye
- Abstract summary: We propose an online-scaling-based multi-density attention network for loop filtering in video compression.
Experimental results show that 10.18% bit-rate reduction at the same video quality can be achieved over the latest Versatile Video Coding (VVC) standard.
- Score: 9.322800480045336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video compression is a basic requirement for consumer and professional video
applications alike. Video coding standards such as H.264/AVC and H.265/HEVC are
widely deployed in the market to enable efficient use of bandwidth and storage
for many video applications. To reduce the coding artifacts and improve the
compression efficiency, neural network based loop filtering of the
reconstructed video has been developed in the literature. However, loop
filtering is a challenging task due to the variation in video content and
sampling densities. In this paper, we propose an online-scaling-based
multi-density attention network for loop filtering in video compression. The
core of our approach lies in several aspects: (a) parallel multi-resolution
convolution streams for extracting multi-density features, (b) a single
attention branch to learn sample correlations and generate mask maps, (c) a
channel-mutual attention procedure to fuse the data from multiple branches, and
(d) an online scaling technique to further optimize the network output
according to the actual signal. The proposed multi-density attention network
learns rich features from multiple sampling densities and performs robustly on
video content of different resolutions. Moreover, the online scaling process
enhances the signal adaptability of the off-line pre-trained model.
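As a rough illustration of items (a)-(c), the sketch below builds two feature streams at different sampling densities, derives a mask from a single attention branch, and blends the streams with it. All layer shapes, weights, and the fusion rule here are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def conv3x3(x, w):
    """Naive 'same'-padded 3x3 convolution on a (C_in, H, W) tensor.
    w has shape (C_out, C_in, 3, 3)."""
    c_in, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for co in range(w.shape[0]):
        for ci in range(c_in):
            for i in range(3):
                for j in range(3):
                    out[co] += w[co, ci, i, j] * xp[ci, i:i + h, j:j + wd]
    return out

def multi_density_filter(x, rng):
    """Two parallel streams (full and half resolution), one attention mask,
    mask-weighted fusion. Shapes and operations are illustrative assumptions."""
    c, h, w = x.shape
    w_full = rng.standard_normal((c, c, 3, 3)) * 0.1
    w_half = rng.standard_normal((c, c, 3, 3)) * 0.1
    w_attn = rng.standard_normal((1, c, 3, 3)) * 0.1

    full = conv3x3(x, w_full)                        # full-density stream
    half = conv3x3(x[:, ::2, ::2], w_half)           # half-density stream
    half_up = np.repeat(np.repeat(half, 2, axis=1), 2, axis=2)[:, :h, :w]

    # Single attention branch producing a sigmoid mask map, as in (b).
    mask = 1.0 / (1.0 + np.exp(-conv3x3(x, w_attn)))
    # Simple mask-weighted blend standing in for the channel-mutual fusion in (c).
    return mask * full + (1.0 - mask) * half_up

rng = np.random.default_rng(1)
frame = rng.standard_normal((4, 16, 16))   # hypothetical (channels, H, W) input
out = multi_density_filter(frame, rng)
```

The point of the sketch is only the data flow: two densities are processed in parallel, and a learned mask decides per sample how to combine them, which is what lets the filter adapt to content of different resolutions.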
Experimental results show that 10.18% bit-rate reduction at the same video
quality can be achieved over the latest Versatile Video Coding (VVC) standard.
In objective performance, the proposed algorithm outperforms the
state-of-the-art methods, and the subjective quality improvement is evident in
terms of detail preservation and artifact alleviation.
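The online scaling step described in (d) can be sketched as a least-squares fit of a scalar applied to the network's residual output at the encoder side, which the decoder then reuses. The closed-form fit below is an illustrative assumption about the formulation, not the paper's exact procedure.

```python
import numpy as np

def online_scale(original, reconstructed, residual):
    """Fit a scalar s minimizing ||original - (reconstructed + s * residual)||^2.

    Closed form: s = <original - reconstructed, residual> / <residual, residual>.
    The encoder has access to the original signal, fits s, and signals it to
    the decoder, which applies it to the pre-trained network's output.
    Illustrative sketch only.
    """
    target = (original - reconstructed).ravel()
    r = residual.ravel()
    denom = float(r @ r)
    if denom == 0.0:
        return 0.0
    return float((target @ r) / denom)

# Toy example: the network over-shoots the true residual by 2x,
# so the fitted scale compensates by halving it.
rng = np.random.default_rng(0)
original = rng.standard_normal((8, 8))
reconstructed = original - 0.1 * rng.standard_normal((8, 8))
true_residual = original - reconstructed
network_residual = 2.0 * true_residual   # over-confident network output
s = online_scale(original, reconstructed, network_residual)
filtered = reconstructed + s * network_residual
```

This kind of per-signal scalar is cheap to transmit, which is how an off-line pre-trained model can be adapted to the actual content being encoded.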
Related papers
- CANeRV: Content Adaptive Neural Representation for Video Compression [89.35616046528624]
We propose Content Adaptive Neural Representation for Video Compression (CANeRV)
CANeRV is an innovative INR-based video compression network that adaptively conducts structure optimisation based on the specific content of each video sequence.
We show that CANeRV can outperform both H.266/VVC and state-of-the-art INR-based video compression techniques across diverse video datasets.
arXiv Detail & Related papers (2025-02-10T06:21:16Z)
- RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression [68.31184784672227]
In modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems performing tasks.
It is therefore useful to optimize the encoder for a downstream task instead of for image quality.
Here, we address this challenge by controlling the Quantization Parameters (QPs) at the macro-block level to optimize the downstream task.
arXiv Detail & Related papers (2025-01-21T15:36:08Z)
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [118.72266141321647]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z)
- Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.
The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z)
- Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on standard database demonstrate the superiority of proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
- Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [55.088635195893325]
We propose the first quantized representation learning method for cross-view video retrieval, namely Hybrid Contrastive Quantization (HCQ).
HCQ learns both coarse-grained and fine-grained quantizations with transformers, which provide complementary understandings for texts and videos.
Experiments on three Web video benchmark datasets demonstrate that HCQ achieves competitive performance with state-of-the-art non-compressed retrieval methods.
arXiv Detail & Related papers (2022-02-07T18:04:10Z)
- Multitask Learning for VVC Quality Enhancement and Super-Resolution [11.446576112498596]
We propose a learning-based solution as a post-processing step to enhance the decoded VVC video quality.
Our method relies on multitask learning to perform both quality enhancement and super-resolution using a single shared network optimized for multiple levels.
arXiv Detail & Related papers (2021-04-16T19:05:26Z)
- Super-Resolving Compressed Video in Coding Chain [27.994055823226848]
We present a mixed-resolution coding framework, which cooperates with a reference-based DCNN.
In this novel coding chain, the reference-based DCNN learns the direct mapping from low-resolution (LR) compressed video to their high-resolution (HR) clean version at the decoder side.
arXiv Detail & Related papers (2021-03-26T03:39:54Z)
- Efficient Adaptation of Neural Network Filter for Video Compression [10.769305738505071]
We present an efficient finetuning methodology for neural-network filters.
The fine-tuning is performed at encoder side to adapt the neural network to the specific content that is being encoded.
The proposed method is much faster than conventional finetuning approaches.
arXiv Detail & Related papers (2020-07-28T14:24:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.