Learned Video Compression via Heterogeneous Deformable Compensation Network
- URL: http://arxiv.org/abs/2207.04589v3
- Date: Thu, 29 Jun 2023 07:05:06 GMT
- Title: Learned Video Compression via Heterogeneous Deformable Compensation Network
- Authors: Huairui Wang, Zhenzhong Chen, Chang Wen Chen
- Abstract summary: We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-adaptive heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
- Score: 78.72508633457392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned video compression has recently emerged as an essential research topic
in developing advanced video compression technologies, where motion
compensation is considered one of the most challenging issues. In this paper,
we propose a learned video compression framework via heterogeneous deformable
compensation strategy (HDCVC) to tackle the problem of unstable compression
performance caused by single-size deformable kernels in the downsampled feature
domain. More specifically, instead of utilizing optical flow warping or
single-size-kernel deformable alignment, the proposed algorithm extracts
features from the two adjacent frames to estimate content-adaptive
heterogeneous deformable (HetDeform) kernel offsets. Then we transform the
reference features with the HetDeform convolution to accomplish motion
compensation. Moreover, we design a Spatial-Neighborhood-Conditioned Divisive
Normalization (SNCDN) to achieve more effective data Gaussianization combined
with the Generalized Divisive Normalization. Furthermore, we propose a
multi-frame enhanced reconstruction module for exploiting context and temporal
information for final quality enhancement. Experimental results indicate that
HDCVC outperforms recent state-of-the-art learned video compression
approaches.
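The core of the compensation step described above, estimating a per-location offset and resampling the reference features at the shifted positions, can be illustrated with a minimal NumPy sketch. Note this is a hypothetical, stripped-down per-pixel (single-sample) variant for illustration only: the paper's HetDeform convolution learns heterogeneous multi-sample kernel offsets and applies a full deformable convolution, and the name `warp_with_offsets` and the tensor shapes here are assumptions, not the authors' implementation.

```python
import numpy as np

def warp_with_offsets(feat, offsets):
    """Bilinearly resample a 2-D feature map at offset positions.

    feat:    (H, W) reference feature map
    offsets: (H, W, 2) learned sampling offsets, (dy, dx) per location
    returns: (H, W) motion-compensated feature map
    """
    h, w = feat.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Offset sampling coordinates, clipped to the image border.
    sy = np.clip(ys + offsets[..., 0], 0, h - 1)
    sx = np.clip(xs + offsets[..., 1], 0, w - 1)
    # Integer corners and bilinear interpolation weights.
    y0 = np.floor(sy).astype(int)
    x0 = np.floor(sx).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = sy - y0
    wx = sx - x0
    # Blend the four neighboring samples.
    return ((1 - wy) * (1 - wx) * feat[y0, x0]
            + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0]
            + wy * wx * feat[y1, x1])
```

With all-zero offsets the function reduces to the identity, and a constant offset field reduces to a global translation; in the actual framework the offsets are predicted per location by a network from the two adjacent frames, which is what makes the compensation content-adaptive.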
Related papers
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- VCISR: Blind Single Image Super-Resolution with Video Compression Synthetic Data [18.877077302923713]
We present a video compression-based degradation model to synthesize low-resolution image data in the blind SISR task.
Our proposed image synthesizing method is widely applicable to existing image datasets.
By introducing video coding artifacts to SISR degradation models, neural networks can super-resolve images with the ability to restore video compression degradations.
arXiv Detail & Related papers (2023-11-02T05:24:19Z)
- IBVC: Interpolation-driven B-frame Video Compression [68.18440522300536]
B-frame video compression adopts bi-directional motion estimation and motion compensation (MEMC) for middle-frame reconstruction.
Previous learned approaches often directly extend neural P-frame codecs to B-frame coding, relying on bi-directional optical-flow estimation.
We propose a simple yet effective structure called Interpolation-B-frame Video Compression (IBVC) to address these issues.
arXiv Detail & Related papers (2023-09-25T02:45:51Z)
- Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z)
- Multi-Scale Deformable Alignment and Content-Adaptive Inference for Flexible-Rate Bi-Directional Video Compression [8.80688035831646]
This paper proposes an adaptive motion-compensation model for end-to-end rate-distortion optimized hierarchical bi-directional video compression.
We employ a gain unit, which enables a single model to operate at multiple rate-distortion operating points.
Experimental results demonstrate state-of-the-art rate-distortion performance exceeding that of all prior art in learned video coding.
arXiv Detail & Related papers (2023-06-28T20:32:16Z)
- Cross Modal Compression: Towards Human-comprehensible Semantic Compression [73.89616626853913]
Cross modal compression is a semantic compression framework for visual data.
We show that our proposed CMC can achieve encouraging reconstructed results with an ultrahigh compression ratio.
arXiv Detail & Related papers (2022-09-06T15:31:11Z)
- Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z)
- COMISR: Compression-Informed Video Super-Resolution [76.94152284740858]
Most videos on the web or mobile devices are compressed, and the compression can be severe when the bandwidth is limited.
We propose a new compression-informed video super-resolution model to restore high-resolution content without introducing artifacts caused by compression.
arXiv Detail & Related papers (2021-05-04T01:24:44Z)
- Decomposition, Compression, and Synthesis (DCS)-based Video Coding: A Neural Exploration via Resolution-Adaptive Learning [30.54722074562783]
We decompose the input video into respective spatial texture frames (STF) at its native spatial resolution.
Then, we compress them together using any popular video coder.
Finally, we synthesize decoded STFs and TMFs for high-quality video reconstruction at the same resolution as its native input.
arXiv Detail & Related papers (2020-12-01T17:23:53Z)
- Feedback Recurrent Autoencoder for Video Compression [14.072596106425072]
We propose a new network architecture for learned video compression operating in low latency mode.
Our method yields state of the art MS-SSIM/rate performance on the high-resolution UVG dataset.
arXiv Detail & Related papers (2020-04-09T02:58:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.