High-Frequency Enhanced Hybrid Neural Representation for Video Compression
- URL: http://arxiv.org/abs/2411.06685v1
- Date: Mon, 11 Nov 2024 03:04:46 GMT
- Title: High-Frequency Enhanced Hybrid Neural Representation for Video Compression
- Authors: Li Yu, Zhihui Li, Jimin Xiao, Moncef Gabbouj,
- Abstract summary: This paper introduces a High-Frequency Enhanced Hybrid Neural Representation Network.
Our method focuses on leveraging high-frequency information to improve the synthesis of fine details by the network.
Experiments on the Bunny and UVG datasets demonstrate that our method outperforms other methods.
- Score: 32.38933743785333
- License:
- Abstract: Neural Representations for Videos (NeRV) have simplified the video codec process and achieved swift decoding speeds by encoding video content into a neural network, presenting a promising solution for video compression. However, existing work overlooks the crucial issue that videos reconstructed by these methods lack high-frequency details. To address this problem, this paper introduces a High-Frequency Enhanced Hybrid Neural Representation Network. Our method focuses on leveraging high-frequency information to improve the synthesis of fine details by the network. Specifically, we design a wavelet high-frequency encoder that incorporates Wavelet Frequency Decomposer (WFD) blocks to generate high-frequency feature embeddings. Next, we design the High-Frequency Feature Modulation (HFM) block, which leverages the extracted high-frequency embeddings to enhance the fitting process of the decoder. Finally, with the refined Harmonic decoder block and a Dynamic Weighted Frequency Loss, we further reduce the potential loss of high-frequency information. Experiments on the Bunny and UVG datasets demonstrate that our method outperforms other methods, showing notable improvements in detail preservation and compression performance.
Related papers
- Improving the Diffusability of Autoencoders [54.920783089085035]
Latent diffusion models have emerged as the leading approach for generating high-quality images and videos.
We perform a spectral analysis of modern autoencoders and identify inordinate high-frequency components in their latent spaces.
We hypothesize that this high-frequency component interferes with the coarse-to-fine nature of the diffusion synthesis process and hinders the generation quality.
arXiv Detail & Related papers (2025-02-20T18:45:44Z) - SNeRV: Spectra-preserving Neural Representation for Video [8.978061470104532]
We propose spectra-preserving NeRV (SNeRV) as a novel approach to enhance implicit video representations.
In this paper, we use 2D discrete wavelet transform (DWT) to decompose video into low-frequency (LF) and high-frequency (HF) features.
We demonstrate that SNeRV outperforms existing NeRV models in capturing fine details and achieves enhanced reconstruction.
arXiv Detail & Related papers (2025-01-03T07:57:38Z) - Neural Video Representation for Redundancy Reduction and Consistency Preservation [0.0]
Implicit neural representation (INR) embed various signals into neural networks.
We propose a video representation method that generates both the high-frequency and low-frequency components of the frame.
Experimental results demonstrate that our method outperforms the existing HNeRV method, achieving superior results in 96 percent of the videos.
arXiv Detail & Related papers (2024-09-27T07:30:12Z) - Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement [7.891750065129094]
We propose Wave-Mamba, a novel approach based on two pivotal insights derived from the wavelet domain.
Our method has demonstrated superior performance, significantly outshining current leading techniques.
arXiv Detail & Related papers (2024-08-02T14:01:34Z) - Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more conscious'' process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z) - Denoising Diffusion Error Correction Codes [92.10654749898927]
Recently, neural decoders have demonstrated their advantage over classical decoding techniques.
Recent state-of-the-art neural decoders suffer from high complexity and lack the important iterative scheme characteristic of many legacy decoders.
We propose to employ denoising diffusion models for the soft decoding of linear codes at arbitrary block lengths.
arXiv Detail & Related papers (2022-09-16T11:00:50Z) - Learned Video Compression via Heterogeneous Deformable Compensation
Network [78.72508633457392]
We propose a learned video compression framework via heterogeneous deformable compensation strategy (HDCVC) to tackle the problems of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-Neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance than the recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG
Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z) - Ultra-low bitrate video conferencing using deep image animation [7.263312285502382]
We propose a novel deep learning approach for ultra-low video compression for video conferencing applications.
We employ deep neural networks to encode motion information as keypoint displacement and reconstruct the video signal at the decoder side.
arXiv Detail & Related papers (2020-12-01T09:06:34Z) - Multi-level Wavelet-based Generative Adversarial Network for Perceptual
Quality Enhancement of Compressed Video [51.631731922593225]
Existing methods mainly focus on enhancing the objective quality of compressed video while ignoring its perceptual quality.
We propose a novel generative adversarial network (GAN) based on multi-level wavelet packet transform (WPT) to enhance the perceptual quality of compressed video.
arXiv Detail & Related papers (2020-08-02T15:01:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.