Content Adaptive based Motion Alignment Framework for Learned Video Compression
- URL: http://arxiv.org/abs/2512.12936v1
- Date: Mon, 15 Dec 2025 02:51:47 GMT
- Title: Content Adaptive based Motion Alignment Framework for Learned Video Compression
- Authors: Tiange Zhang, Xiandong Meng, Siwei Ma
- Abstract summary: This paper proposes a content-adaptive motion alignment framework. We first introduce a two-stage flow-guided deformable warping mechanism that refines motion compensation with coarse-to-fine offset prediction and mask modulation. Second, we propose a multi-reference quality-aware strategy that adjusts distortion weights based on reference quality and applies it to hierarchical training to reduce error propagation. Third, we integrate a training-free module that downsamples frames by motion magnitude and resolution to obtain smooth motion estimation.
- Score: 72.13599533975413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in end-to-end video compression have shown promising results owing to unified end-to-end learning optimization. However, such generalized frameworks often lack content-specific adaptation, leading to suboptimal compression performance. To address this, this paper proposes a content-adaptive motion alignment framework that improves performance by adapting encoding strategies to diverse content characteristics. Specifically, we first introduce a two-stage flow-guided deformable warping mechanism that refines motion compensation with coarse-to-fine offset prediction and mask modulation, enabling precise feature alignment. Second, we propose a multi-reference quality-aware strategy that adjusts distortion weights based on reference quality and applies it to hierarchical training to reduce error propagation. Third, we integrate a training-free module that downsamples frames by motion magnitude and resolution to obtain smooth motion estimation. Experimental results on standard test datasets demonstrate that our framework, CAMA, achieves significant improvements over state-of-the-art neural video compression models: a 24.95% BD-rate (PSNR) saving over our baseline DCVC-TCM, while also outperforming the reproduced DCVC-DC and the traditional codec HM-16.25.
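The 24.95% figure refers to the standard Bjøntegaard delta-rate (BD-rate) metric: fit log-rate as a cubic polynomial of PSNR for both codecs, integrate over the overlapping PSNR interval, and report the average rate difference in percent. A minimal sketch of that computation (function name and sample points are illustrative, not from the paper):

```python
import numpy as np

def bd_rate_psnr(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of the test codec vs. the anchor,
    computed over the overlapping PSNR interval. Negative = savings."""
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    # Cubic fit of log-rate as a function of PSNR for each R-D curve.
    p_a = np.polyfit(psnr_anchor, log_ra, 3)
    p_t = np.polyfit(psnr_test, log_rt, 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    # Average log-rate over the common PSNR interval via integration.
    int_a, int_t = np.polyint(p_a), np.polyint(p_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0
```

As a sanity check, a test curve that reaches every PSNR at exactly half the anchor's bitrate yields a BD-rate of -50%.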
Related papers
- Error-Propagation-Free Learned Video Compression With Dual-Domain Progressive Temporal Alignment [92.57576987521107]
We propose a novel unified transform framework with dual-domain progressive temporal alignment and a quality-conditioned mixture-of-experts (QCMoE). QCMoE allows continuous and consistent rate control with appealing R-D performance. Experimental results show that the proposed method achieves competitive R-D performance compared with the state of the art.
arXiv Detail & Related papers (2025-12-11T09:14:51Z) - Residual Learning and Filtering Networks for End-to-End Lossless Video Compression [3.0770091134672586]
Existing learning-based video compression methods face challenges related to inaccurate motion estimates and inadequate motion compensation structures. This work presents an end-to-end video compression method that incorporates several key operations. The proposed approach tackles the challenges of accurate motion estimation and motion compensation in video compression.
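Motion compensation in these learned codecs ultimately reduces to warping a reference frame (or its features) by an estimated flow field. A minimal numpy sketch of backward bilinear warping, the basic operation such compensation modules build on (single-channel setup and function name are illustrative):

```python
import numpy as np

def warp_bilinear(frame, flow):
    """Backward-warp `frame` (H, W) by optical flow (H, W, 2) in (dx, dy):
    each output pixel samples frame[y + dy, x + dx] bilinearly."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Source sampling coordinates, clamped to the image borders.
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    x0 = np.floor(sx).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = sx - x0, sy - y0
    # Bilinear blend of the four neighboring reference pixels.
    top = frame[y0, x0] * (1 - wx) + frame[y0, x1] * wx
    bot = frame[y1, x0] * (1 - wx) + frame[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

Deformable warping generalizes this by predicting per-pixel (or per-kernel-tap) offsets and modulation masks instead of a single flow vector.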
arXiv Detail & Related papers (2025-03-11T18:51:36Z) - Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
Temporal prediction is one of the most important technologies for video compression. Traditional video codecs adaptively decide the optimal coding mode according to the prediction quality and reference quality. We propose a confidence-based prediction quality adaptation (PQA) module and a reference quality adaptation (RQA) module.
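Reference-quality adaptation of this kind (and CAMA's multi-reference quality-aware strategy above) amounts to scaling each frame's distortion term by the quality of its reference before summing the rate-distortion loss. A toy sketch under assumed inputs; the weighting rule, names, and `base_quality` value are hypothetical, not taken from either paper:

```python
def weighted_rd_loss(distortions, rates, ref_qualities,
                     base_quality=35.0, lam=0.01):
    """Rate-distortion loss over a group of frames where each frame's
    distortion is down-weighted when its reference quality (e.g. PSNR)
    is low, so errors from poor references are not amplified."""
    loss = 0.0
    for d, r, q in zip(distortions, rates, ref_qualities):
        w = min(1.0, q / base_quality)  # hypothetical weighting rule
        loss += w * d + lam * r
    return loss
```

In hierarchical training the same rule applies per temporal layer, reducing propagation of errors from low-quality references.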
arXiv Detail & Related papers (2024-06-20T09:03:26Z) - Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z) - Multi-Scale Deformable Alignment and Content-Adaptive Inference for Flexible-Rate Bi-Directional Video Compression [8.80688035831646]
This paper proposes an adaptive motion-compensation model for end-to-end rate-distortion optimized hierarchical bi-directional video compression.
We employ a gain unit, which enables a single model to operate at multiple rate-distortion operating points.
Experimental results demonstrate state-of-the-art rate-distortion performance exceeding those of all prior art in learned video coding.
arXiv Detail & Related papers (2023-06-28T20:32:16Z) - Boost Video Frame Interpolation via Motion Adaptation [73.42573856943923]
Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
Existing learning-based VFI methods have achieved great success, but they still suffer from limited generalization ability.
We propose a novel optimization-based VFI method that can adapt to unseen motions at test time.
arXiv Detail & Related papers (2023-06-24T10:44:02Z) - Content-Adaptive Motion Rate Adaption for Learned Video Compression [11.574465203875342]
This paper introduces an online motion rate adaptation scheme for learned video compression.
It aims to achieve content-adaptive coding on individual test sequences to mitigate the domain gap between training and test data.
It features a patch-level bit allocation map, termed the α-map, to trade off the bit rates for motion and inter-frame coding.
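Patch-level allocation of this kind can be framed as picking, per patch, the candidate split that minimizes a Lagrangian rate-distortion cost. A toy enumeration sketch; the candidate structure, names, and λ value are illustrative, not the paper's actual α-map mechanism:

```python
def choose_alpha(distortion, motion_bits, inter_bits, lam=0.01):
    """Per patch, pick the candidate index minimizing
    D + lam * (R_motion + R_inter).
    Inputs are lists of per-candidate lists, one inner list per patch."""
    best = []
    for ds, ms, is_ in zip(distortion, motion_bits, inter_bits):
        costs = [d + lam * (m + i) for d, m, i in zip(ds, ms, is_)]
        best.append(costs.index(min(costs)))
    return best
```

The per-patch choice is what makes the scheme content-adaptive: smooth patches favor fewer motion bits, complex motion favors more.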
arXiv Detail & Related papers (2023-02-13T11:51:23Z) - Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework with a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-adaptive heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Flexible-Rate Learned Hierarchical Bi-Directional Video Compression With Motion Refinement and Frame-Level Bit Allocation [8.80688035831646]
We combine motion estimation and prediction modules and compress refined residual motion vectors for improved rate-distortion performance.
We exploit the gain unit to control bit allocation among intra-coded vs. bi-directionally coded frames.
arXiv Detail & Related papers (2022-06-27T20:18:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.