High-Frequency Enhanced Hybrid Neural Representation for Video Compression
- URL: http://arxiv.org/abs/2411.06685v1
- Date: Mon, 11 Nov 2024 03:04:46 GMT
- Title: High-Frequency Enhanced Hybrid Neural Representation for Video Compression
- Authors: Li Yu, Zhihui Li, Jimin Xiao, Moncef Gabbouj
- Abstract summary: This paper introduces a High-Frequency Enhanced Hybrid Neural Representation Network.
Our method focuses on leveraging high-frequency information to improve the synthesis of fine details by the network.
Experiments on the Bunny and UVG datasets demonstrate that our method outperforms other methods.
- Score: 32.38933743785333
- Abstract: Neural Representations for Videos (NeRV) have simplified the video codec process and achieved swift decoding speeds by encoding video content into a neural network, presenting a promising solution for video compression. However, existing work overlooks the crucial issue that videos reconstructed by these methods lack high-frequency details. To address this problem, this paper introduces a High-Frequency Enhanced Hybrid Neural Representation Network. Our method focuses on leveraging high-frequency information to improve the synthesis of fine details by the network. Specifically, we design a wavelet high-frequency encoder that incorporates Wavelet Frequency Decomposer (WFD) blocks to generate high-frequency feature embeddings. Next, we design the High-Frequency Feature Modulation (HFM) block, which leverages the extracted high-frequency embeddings to enhance the fitting process of the decoder. Finally, with the refined Harmonic decoder block and a Dynamic Weighted Frequency Loss, we further reduce the potential loss of high-frequency information. Experiments on the Bunny and UVG datasets demonstrate that our method outperforms other methods, showing notable improvements in detail preservation and compression performance.
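The abstract does not spell out how the Wavelet Frequency Decomposer separates high-frequency content, so the following is only a minimal sketch of the general idea, assuming a one-level 2D Haar transform on a single-channel frame. The function name `haar_dwt2` and the sub-band names are hypothetical and are not taken from the paper; the actual WFD block may use a different wavelet, several levels, or learned layers.

```python
def haar_dwt2(frame):
    """One-level orthonormal 2D Haar transform (illustrative, not the paper's WFD).

    `frame` is a list of rows of numbers with even height and width.
    Returns four sub-bands, each half the input size in both directions:
    (low, top_minus_bottom, left_minus_right, diagonal).
    """
    height, width = len(frame), len(frame[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")

    low, top_minus_bottom, left_minus_right, diagonal = [], [], [], []
    for i in range(0, height, 2):
        r_low, r_tb, r_lr, r_diag = [], [], [], []
        for j in range(0, width, 2):
            # The 2x2 block of pixels handled in this step.
            a, b = frame[i][j], frame[i][j + 1]
            c, d = frame[i + 1][j], frame[i + 1][j + 1]
            r_low.append((a + b + c + d) / 2)   # local average: low-frequency content
            r_tb.append((a + b - c - d) / 2)    # responds to horizontal edges
            r_lr.append((a - b + c - d) / 2)    # responds to vertical edges
            r_diag.append((a - b - c + d) / 2)  # responds to diagonal detail
        low.append(r_low)
        top_minus_bottom.append(r_tb)
        left_minus_right.append(r_lr)
        diagonal.append(r_diag)
    return low, top_minus_bottom, left_minus_right, diagonal


# A 2x2 frame with a vertical edge: only the left-minus-right band is non-zero.
low, tb, lr, diag = haar_dwt2([[0, 4], [0, 4]])
print(low, tb, lr, diag)
```

The three detail bands are the kind of signal a high-frequency encoder could turn into embeddings; a flat region produces zeros in all three, which is why they isolate edges and texture rather than overall brightness.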
Related papers
- Neural Video Representation for Redundancy Reduction and Consistency Preservation [0.0]
Implicit neural representations (INRs) embed various signals into neural networks.
We propose a video representation method that generates both the high-frequency and low-frequency components of the frame.
Experimental results demonstrate that our method outperforms the existing HNeRV method, achieving superior results in 96 percent of the videos.
arXiv Detail & Related papers (2024-09-27T07:30:12Z) - Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression [0.0]
We propose a novel Transformer-based image compression method that enhances the transformation stage by considering frequency components within the feature map.
Our method integrates a novel Hybrid Spatial-Channel Attention Transformer Block (HSCATB), where a spatial-based branch independently handles high and low frequencies.
We also introduce a Mixed Local-Global Feed Forward Network (MLGFFN) within the Transformer block to enhance the extraction of diverse and rich information.
arXiv Detail & Related papers (2024-08-07T15:35:25Z) - Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement [7.891750065129094]
We propose Wave-Mamba, a novel approach based on two pivotal insights derived from the wavelet domain.
Our method has demonstrated superior performance, significantly outshining current leading techniques.
arXiv Detail & Related papers (2024-08-02T14:01:34Z) - Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z) - HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression [51.04820313355164]
HybridFlow combines the continuous-feature-based and codebook-based streams to achieve both high perceptual quality and high fidelity under extremely low bitrates.
Experimental results demonstrate superior performance across several datasets under extremely low bitrates.
arXiv Detail & Related papers (2024-04-20T13:19:08Z) - Denoising Diffusion Error Correction Codes [92.10654749898927]
Recently, neural decoders have demonstrated their advantage over classical decoding techniques.
Recent state-of-the-art neural decoders suffer from high complexity and lack the important iterative scheme characteristic of many legacy decoders.
We propose to employ denoising diffusion models for the soft decoding of linear codes at arbitrary block lengths.
arXiv Detail & Related papers (2022-09-16T11:00:50Z) - Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-Neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves better performance than recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z) - Ultra-low bitrate video conferencing using deep image animation [7.263312285502382]
We propose a novel deep learning approach for ultra-low bitrate video compression for video conferencing applications.
We employ deep neural networks to encode motion information as keypoint displacement and reconstruct the video signal at the decoder side.
arXiv Detail & Related papers (2020-12-01T09:06:34Z) - Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed Video [51.631731922593225]
Existing methods mainly focus on enhancing the objective quality of compressed video while ignoring its perceptual quality.
We propose a novel generative adversarial network (GAN) based on multi-level wavelet packet transform (WPT) to enhance the perceptual quality of compressed video.
arXiv Detail & Related papers (2020-08-02T15:01:38Z)