Inter-Frame Compression for Dynamic Point Cloud Geometry Coding
- URL: http://arxiv.org/abs/2207.12554v2
- Date: Mon, 2 Sep 2024 22:49:21 GMT
- Title: Inter-Frame Compression for Dynamic Point Cloud Geometry Coding
- Authors: Anique Akhtar, Zhu Li, Geert Van der Auwera
- Abstract summary: We propose a lossy compression scheme that predicts the latent representation of the current frame using the previous frame.
The proposed network utilizes sparse convolutions with hierarchical multiscale 3D feature learning to encode the current frame.
The proposed method achieves more than 88% BD-Rate (Bjontegaard Delta Rate) reduction against G-PCCv20 Octree.
- Score: 14.79613731546357
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Efficient point cloud compression is essential for applications like virtual and mixed reality, autonomous driving, and cultural heritage. This paper proposes a deep learning-based inter-frame encoding scheme for dynamic point cloud geometry compression. We propose a lossy geometry compression scheme that predicts the latent representation of the current frame using the previous frame by employing a novel feature space inter-prediction network. The proposed network utilizes sparse convolutions with hierarchical multiscale 3D feature learning to encode the current frame using the previous frame. The proposed method introduces a novel predictor network for motion compensation in the feature domain to map the latent representation of the previous frame to the coordinates of the current frame to predict the current frame's feature embedding. The framework transmits the residual of the predicted features and the actual features by compressing them using a learned probabilistic factorized entropy model. At the receiver, the decoder hierarchically reconstructs the current frame by progressively rescaling the feature embedding. The proposed framework is compared to the state-of-the-art Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC) schemes standardized by the Moving Picture Experts Group (MPEG). The proposed method achieves more than 88% BD-Rate (Bjontegaard Delta Rate) reduction against G-PCCv20 Octree, more than 56% BD-Rate savings against G-PCCv20 Trisoup, more than 62% BD-Rate reduction against V-PCC intra-frame encoding mode, and more than 52% BD-Rate savings against V-PCC P-frame-based inter-frame encoding mode using HEVC. These significant performance gains are cross-checked and verified in the MPEG working group.
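For a concrete picture of the pipeline described in the abstract, the sketch below is a minimal, illustrative reading of it, not the authors' implementation: dense torch Conv3d/ConvTranspose3d layers stand in for the sparse convolutions used in the paper, the toy occupancy grid and layer widths are arbitrary, and the learned factorized entropy model is approximated by a unit-scale logistic rate proxy. All names (Encoder, Predictor, Decoder, code_inter_frame, rate_bits) are assumptions made here for brevity.

```python
# Illustrative sketch of feature-space inter-prediction for point cloud geometry
# coding. Assumptions (not from the paper): dense convolutions stand in for
# sparse convolutions, and a unit-scale logistic likelihood replaces the learned
# factorized entropy model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Hierarchical downscaling encoder: occupancy volume -> latent features."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)


class Predictor(nn.Module):
    """Feature-domain inter-prediction: previous latent -> predicted current latent."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, padding=1),
        )

    def forward(self, prev_latent):
        return self.net(prev_latent)


class Decoder(nn.Module):
    """Hierarchical upscaling decoder: latent features -> occupancy logits."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose3d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(ch, 1, 4, stride=2, padding=1),
        )

    def forward(self, z):
        return self.net(z)


def rate_bits(residual):
    """Rough bitrate proxy: bits of each element's integer bin under a
    unit-scale logistic, fully factorized across elements."""
    upper = torch.sigmoid(residual + 0.5)
    lower = torch.sigmoid(residual - 0.5)
    return -torch.log2((upper - lower).clamp(min=1e-9)).sum()


def code_inter_frame(prev_frame, curr_frame, enc, pred, dec, training=True):
    """Encode the current frame by predicting its latent from the previous frame
    and transmitting only the (quantized) residual in feature space."""
    prev_latent = enc(prev_frame)
    curr_latent = enc(curr_frame)
    predicted = pred(prev_latent)
    residual = curr_latent - predicted
    # Additive-noise quantization surrogate during training, rounding at test time.
    q_residual = residual + torch.rand_like(residual) - 0.5 if training else residual.round()
    bits = rate_bits(q_residual)
    recon_logits = dec(predicted + q_residual)   # receiver reconstructs hierarchically
    return recon_logits, bits


if __name__ == "__main__":
    enc, pred, dec = Encoder(), Predictor(), Decoder()
    prev = (torch.rand(1, 1, 32, 32, 32) > 0.9).float()   # toy occupancy grids
    curr = (torch.rand(1, 1, 32, 32, 32) > 0.9).float()
    logits, bits = code_inter_frame(prev, curr, enc, pred, dec)
    loss = F.binary_cross_entropy_with_logits(logits, curr) + 1e-4 * bits
    print(logits.shape, float(bits), float(loss))
```

In the actual method, prediction and motion compensation operate on sparse voxel coordinates, and the residual features are entropy-coded with a learned probabilistic factorized model rather than this logistic proxy.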
Related papers
- High-Efficiency Neural Video Compression via Hierarchical Predictive Learning [27.41398149573729]
Enhanced Deep Hierarchical Video Compression (DHVC 2.0) delivers superior compression performance and impressive complexity efficiency.
Uses hierarchical predictive coding to transform each video frame into multiscale representations.
Supports transmission-friendly progressive decoding, making it particularly advantageous for networked video applications in the presence of packet loss.
arXiv Detail & Related papers (2024-10-03T15:40:58Z)
- SPAC: Sampling-based Progressive Attribute Compression for Dense Point Clouds [51.313922535437726]
We propose an end-to-end attribute compression method for dense point clouds.
The proposed method combines a frequency sampling module, an adaptive scale feature extraction module with geometry assistance, and a global hyperprior entropy model.
arXiv Detail & Related papers (2024-09-16T13:59:43Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- IBVC: Interpolation-driven B-frame Video Compression [68.18440522300536]
B-frame video compression aims to adopt bi-directional motion estimation and motion compensation (MEMC) coding for middle frame reconstruction.
Previous learned approaches often directly extend neural P-frame codecs to B-frame coding, relying on bi-directional optical-flow estimation.
We propose a simple yet effective structure called Interpolation-B-frame Video Compression (IBVC) to address these issues.
arXiv Detail & Related papers (2023-09-25T02:45:51Z)
- VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision [59.632286735304156]
It is more efficient to enhance/analyze the coded representations directly without decoding them into pixels.
We propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis.
arXiv Detail & Related papers (2023-06-19T03:04:57Z)
- Dynamic Point Cloud Geometry Compression Using Multiscale Inter Conditional Coding [27.013814232906817]
This work extends the Multiscale Sparse Representation (MSR) framework developed for Point Cloud Geometry Compression (PCGC) to support dynamic PCGC.
The reconstruction of the preceding Point Cloud Geometry (PCG) frame is progressively downscaled to generate multiscale temporal priors; a toy sketch of this downscaling step follows.
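The sketch below is a hedged illustration of that idea only, assuming a dense occupancy volume and using max-pooling as a stand-in for sparse downscaling; the function name and number of scales are arbitrary.

```python
# Toy illustration (not the authors' code) of building multiscale temporal
# priors by progressively downscaling the previous frame's reconstruction.
import torch
import torch.nn.functional as F

def multiscale_priors(prev_recon, num_scales=3):
    """Return [full-res, 1/2, 1/4, ...] occupancy volumes from the previous frame."""
    priors = [prev_recon]
    for _ in range(num_scales - 1):
        priors.append(F.max_pool3d(priors[-1], kernel_size=2))
    return priors

if __name__ == "__main__":
    prev = (torch.rand(1, 1, 64, 64, 64) > 0.95).float()
    for p in multiscale_priors(prev):
        print(tuple(p.shape))   # (1,1,64,64,64), (1,1,32,32,32), (1,1,16,16,16)
```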
arXiv Detail & Related papers (2023-01-28T11:34:06Z)
- FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos [5.958701846880935]
We propose FFNeRV, a novel method for incorporating flow information into frame-wise representations to exploit the temporal redundancy across the frames in videos.
With model compression techniques, FFNeRV outperforms widely-used standard video codecs (H.264 and HEVC) and performs on par with state-of-the-art video compression algorithms.
arXiv Detail & Related papers (2022-12-23T12:51:42Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-Neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Multiscale Point Cloud Geometry Compression [29.605320327889142]
We propose a multiscale end-to-end learning framework which hierarchically reconstructs the 3D Point Cloud Geometry.
The framework is developed on top of a sparse convolution based autoencoder for point cloud compression and reconstruction.
arXiv Detail & Related papers (2020-11-07T16:11:16Z)
- Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model [164.7489982837475]
This paper proposes a Recurrent Learned Video Compression (RLVC) approach with the Recurrent Auto-Encoder (RAE) and Recurrent Probability Model (RPM).
The RAE employs recurrent cells in both the encoder and decoder to exploit the temporal correlation among video frames.
Our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM.
arXiv Detail & Related papers (2020-06-24T08:46:33Z)
- End-to-End Learning for Video Frame Compression with Self-Attention [25.23586503813838]
We propose an end-to-end learned system for compressing video frames.
Our system learns deep embeddings of frames and encodes their difference in latent space.
In our experiments, we show that the proposed system achieves high compression rates and high objective visual quality.
arXiv Detail & Related papers (2020-04-20T12:11:08Z)