Diff-PCC: Diffusion-based Neural Compression for 3D Point Clouds
- URL: http://arxiv.org/abs/2408.10543v1
- Date: Tue, 20 Aug 2024 04:55:29 GMT
- Title: Diff-PCC: Diffusion-based Neural Compression for 3D Point Clouds
- Authors: Kai Liu, Kang You, Pan Gao
- Abstract summary: We introduce the first diffusion-based point cloud compression method, dubbed Diff-PCC, to leverage the expressive power of the diffusion model for generative and aesthetically superior decoding.
Experiments demonstrate that the proposed Diff-PCC achieves state-of-the-art compression performance.
- Score: 12.45444994957525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stable diffusion networks have emerged as a groundbreaking development for their ability to produce realistic and detailed visual content. This characteristic renders them ideal decoders, capable of producing high-quality and aesthetically pleasing reconstructions. In this paper, we introduce the first diffusion-based point cloud compression method, dubbed Diff-PCC, to leverage the expressive power of the diffusion model for generative and aesthetically superior decoding. Different from the conventional autoencoder fashion, a dual-space latent representation is devised in this paper, in which a compressor composed of two independent encoding backbones is considered to extract expressive shape latents from distinct latent spaces. At the decoding side, a diffusion-based generator is devised to produce high-quality reconstructions by considering the shape latents as guidance to stochastically denoise the noisy point clouds. Experiments demonstrate that the proposed Diff-PCC achieves state-of-the-art compression performance (e.g., 7.711 dB BD-PSNR gains against the latest G-PCC standard at ultra-low bitrate) while attaining superior subjective quality. Source code will be made publicly available.
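The decoding idea in the abstract, denoising a noisy point cloud under the guidance of shape latents, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the standard DDPM ancestral sampling update, and `toy_noise_predictor` is a hypothetical stand-in for a trained network that simply treats the latent as a coarse copy of the target points. All function names, the linear noise schedule, and the step count are assumptions made for illustration.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice); returns betas, alphas, cumulative alphas."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, alpha_bar_t, latent):
    """Hypothetical stand-in for a learned denoiser.

    It assumes the latent already is the clean point cloud and solves
    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps for eps.
    """
    s = math.sqrt(alpha_bar_t)
    n = math.sqrt(1.0 - alpha_bar_t)
    return [[(xi - s * li) / n for xi, li in zip(p, q)]
            for p, q in zip(x_t, latent)]


def guided_denoise(latent, num_steps=50, seed=0):
    """Run DDPM ancestral sampling from Gaussian noise, conditioned on `latent`."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    # Start from pure noise: one 3D point per latent point.
    x = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in latent]
    for t in reversed(range(num_steps)):
        eps_hat = toy_noise_predictor(x, alpha_bars[t], latent)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        inv = 1.0 / math.sqrt(alphas[t])
        # No fresh noise is injected at the final step.
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0
        x = [[inv * (xi - coef * ei) + sigma * rng.gauss(0.0, 1.0)
              for xi, ei in zip(p, e)]
             for p, e in zip(x, eps_hat)]
    return x


if __name__ == "__main__":
    guidance = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [-0.5, 2.0, 1.0]]
    for point in guided_denoise(guidance, num_steps=20, seed=1):
        print([round(c, 4) for c in point])
```

Because the toy predictor is an oracle, the sampler ends at the guidance points; with a trained, latent-conditioned network the result would instead be a stochastic reconstruction whose quality depends on the model.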
Related papers
- GIViC: Generative Implicit Video Compression [11.908506692749743]
Generative Implicit Video Compression (GIViC) is inspired by the characteristics that INRs share with large language and diffusion models in exploiting long-term dependencies.
A novel Hierarchical Gated Linear Attention-based transformer (HGLA) is also integrated into the framework, which dual-factorizes global dependency modeling.
As far as we are aware, GIViC is the first INR-based video codec that outperforms VTM coding configuration.
arXiv Detail & Related papers (2025-03-25T12:39:45Z) - PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling [0.030448596365296413]
PerCoV2 is a novel ultra-low bit-rate perceptual image compression system.
PerCoV2 is designed for bandwidth- and storage-constrained applications.
arXiv Detail & Related papers (2025-03-12T13:14:51Z) - REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder [52.698595889988766]
We present a novel perspective on learning video embedders for generative modeling.
Rather than requiring an exact reproduction of an input video, an effective embedder should focus on visually plausible reconstructions.
We propose replacing the conventional encoder-decoder video embedder with an encoder-generator framework.
arXiv Detail & Related papers (2025-03-11T17:51:07Z) - Hierarchical Semantic Compression for Consistent Image Semantic Restoration [62.97519327310638]
We propose a novel hierarchical semantic compression (HSC) framework that purely operates within intrinsic semantic spaces from generative models.
Experimental results demonstrate that the proposed HSC framework achieves the state-of-the-art performance on subjective quality and consistency for human vision.
arXiv Detail & Related papers (2025-02-24T03:20:44Z) - Implicit Neural Compression of Point Clouds [58.45774938982386]
NeRC$^{\textbf{3}}$ is a novel point cloud compression framework leveraging implicit neural representations to handle both geometry and attributes.
For dynamic point clouds, 4D-NeRC$^{\textbf{3}}$ demonstrates superior geometry compression compared to state-of-the-art G-PCC and V-PCC standards.
arXiv Detail & Related papers (2024-12-11T03:22:00Z) - Rendering-Oriented 3D Point Cloud Attribute Compression using Sparse Tensor-based Transformer [52.40992954884257]
3D visualization techniques have fundamentally transformed how we interact with digital content.
Massive data size of point clouds presents significant challenges in data compression.
We propose an end-to-end deep learning framework that seamlessly integrates PCAC with differentiable rendering.
arXiv Detail & Related papers (2024-11-12T16:12:51Z) - Diffusion-based Extreme Image Compression with Compressed Feature Initialization [29.277211609920155]
We present Relay Residual Diffusion Extreme Image Compression (RDEIC).
We first use the compressed latent features of the image with added noise, instead of pure noise, as the starting point to eliminate the unnecessary initial stages of the denoising process.
We show that the proposed RDEIC achieves state-of-the-art visual quality and outperforms existing diffusion-based extreme image compression methods in both fidelity and efficiency.
arXiv Detail & Related papers (2024-10-03T16:24:20Z) - Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption [57.056311855630916]
We propose a Controllable Generative Image Compression framework, Control-GIC.
It is capable of fine-grained adaption across a broad spectrum while ensuring high-fidelity and generality compression.
We develop a conditional decoder that can trace back to historic encoded multi-granularity representations.
arXiv Detail & Related papers (2024-06-02T14:22:09Z) - Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z) - Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder [49.01721042973929]
This paper presents a diffusion-based image compression method that employs a privileged end-to-end decoder model as correction.
Experiments demonstrate the superiority of our method in both distortion and perception compared with previous perceptual compression methods.
arXiv Detail & Related papers (2024-04-07T10:57:54Z) - Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer [35.500720262253054]
This paper introduces a novel Unified Image Generation-Compression (UIGC) paradigm, merging the processes of generation and compression.
A key feature of the UIGC framework is the adoption of vector-quantized (VQ) image models for tokenization.
Experiments demonstrate the superiority of the proposed UIGC framework over existing codecs in perceptual quality and human perception.
arXiv Detail & Related papers (2024-03-06T14:27:02Z) - CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction [81.98244738773766]
We present CCD-3DR, which exploits a novel centered diffusion probabilistic model for consistent local feature conditioning.
CCD-3DR outperforms all competitors by a large margin, with over 40% improvement.
arXiv Detail & Related papers (2023-08-15T15:27:42Z) - Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via heterogeneous deformable compensation strategy (HDCVC) to tackle the problems of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Generalized Octave Convolutions for Learned Multi-Frequency Image Compression [20.504561050200365]
We propose the first learned multi-frequency image compression and entropy coding approach.
It is based on the recently developed octave convolutions to factorize the latents into high and low frequency (resolution) components.
We show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based computer vision tasks.
arXiv Detail & Related papers (2020-02-24T01:35:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.