D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction
- URL: http://arxiv.org/abs/2205.01135v1
- Date: Mon, 2 May 2022 18:10:45 GMT
- Title: D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction
- Authors: Tingyu Fan, Linyao Gao, Yiling Xu, Zhu Li and Dong Wang
- Abstract summary: This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point Cloud Compression network.
It compensates and compresses the DPC geometry with 3D motion estimation and motion compensation in the feature space.
Experimental results show that the proposed D-DPCC framework achieves an average 76% BD-Rate (Bjontegaard Delta Rate) gain against the state-of-the-art Video-based Point Cloud Compression (V-PCC) v13 in inter mode.
- Score: 18.897023700334458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The non-uniformly distributed nature of the 3D dynamic point cloud (DPC)
brings significant challenges to its efficient inter-frame compression.
This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point
Cloud Compression (D-DPCC) network to compensate and compress the DPC geometry
with 3D motion estimation and motion compensation in the feature space. In the
proposed D-DPCC network, we design a Multi-scale Motion Fusion (MMF)
module to accurately estimate the 3D optical flow between the feature
representations of adjacent point cloud frames. Specifically, we utilize a 3D
sparse convolution-based encoder to obtain the latent representation for motion
estimation in the feature space and introduce the proposed MMF module for fused
3D motion embedding. Besides, for motion compensation, we propose a 3D
Adaptively Weighted Interpolation (3DAWI) algorithm with a penalty coefficient
to adaptively decrease the impact of distant neighbors. We compress the motion
embedding and the residual with a lossy autoencoder-based network. To our
knowledge, this paper is the first work proposing an end-to-end deep dynamic
point cloud compression framework. Experimental results show that the
proposed D-DPCC framework achieves an average 76% BD-Rate (Bjontegaard Delta
Rate) gain against the state-of-the-art Video-based Point Cloud Compression
(V-PCC) v13 in inter mode.
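For intuition, below is a minimal Python sketch of how an adaptively weighted interpolation in the spirit of 3DAWI can warp previous-frame features onto current-frame coordinates. The inverse-distance weights, the brute-force kNN, and the alpha-clamped normalizer are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: adaptively weighted feature interpolation (3DAWI-style).
# The weighting scheme and penalty handling are assumptions for illustration.
import numpy as np

def adaptively_weighted_interpolation(targets, sources, feats, k=3, alpha=2.0, eps=1e-8):
    """targets: (M, 3) query coordinates (current frame, after motion shift)
    sources: (N, 3) support coordinates (previous frame)
    feats:   (N, C) features attached to the source points
    alpha:   penalty coefficient that attenuates queries whose neighbors
             are all distant (isolated points)"""
    # Brute-force kNN: (M, N) pairwise distances, then the k smallest per row.
    d = np.linalg.norm(targets[:, None, :] - sources[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]                # (M, k) neighbor indices
    knn_d = np.take_along_axis(d, idx, axis=1)        # (M, k) neighbor distances

    w = 1.0 / (knn_d + eps)                           # inverse-distance weights
    # Clamping the normalizer by alpha adaptively decreases the impact of
    # distant neighbors: a small total weight attenuates the warped features.
    norm = np.maximum(w.sum(axis=1, keepdims=True), alpha)
    return (w[..., None] * feats[idx]).sum(axis=1) / norm   # (M, C)
```

The same idea carries over to sparse-tensor features; alpha trades tolerance to isolated points against attenuation of their features.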
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where pixel values correspond to the 3D coordinates of points.
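To make the representation concrete, here is a toy Python sketch that packs one frame into an H x W coordinate image; the naive lexicographic ordering is a placeholder, since SPCVs learn a spatially smooth, temporally consistent mapping instead.

```python
# Toy illustration of the SPCV idea: an H x W "image" whose three channels
# store xyz coordinates. The ordering below is a naive placeholder.
import numpy as np

def points_to_coordinate_image(points, H, W):
    """points: (H*W, 3) xyz array; returns an (H, W, 3) coordinate image."""
    # Placeholder ordering: lexicographic sort so nearby pixels tend to
    # hold nearby points; SPCVs learn this mapping instead.
    order = np.lexsort((points[:, 2], points[:, 1], points[:, 0]))
    return points[order].reshape(H, W, 3)

frame = np.random.rand(64 * 64, 3).astype(np.float32)  # toy frame, 4096 points
img = points_to_coordinate_image(frame, 64, 64)        # usable by 2D video tools
```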
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- 4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks [29.615723135027096]
We propose a new solution for upscaling and restoration of time-varying 3D video point clouds after they have been compressed.
Our model consists of a specifically designed Graph Convolutional Network (GCN) that combines Dynamic Edge Convolution and Graph Attention Networks.
arXiv Detail & Related papers (2023-06-01T18:43:16Z)
- Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame Block Matching [35.80653765524654]
3D dynamic point cloud (DPC) compression relies on mining its temporal context.
This paper proposes a learning-based DPC compression framework via a hierarchical block-matching-based inter-prediction module.
arXiv Detail & Related papers (2023-05-09T11:44:13Z)
- DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets [95.84755169585492]
We present Dynamic Sparse Voxel Transformer (DSVT), a single-stride window-based voxel Transformer backbone for outdoor 3D perception.
Our model achieves state-of-the-art performance across a broad range of 3D perception tasks.
arXiv Detail & Related papers (2023-01-15T09:31:58Z)
- 4DAC: Learning Attribute Compression for Dynamic Point Clouds [37.447460254690135]
We study the attribute (e.g., color) compression of dynamic point clouds and present a learning-based framework, termed 4DAC.
To reduce temporal redundancy within the data, we first build the 3D motion estimation and motion compensation modules with deep neural networks.
In addition, we propose a deep conditional entropy model to estimate the probability distribution of the transformed coefficients.
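As a rough illustration of what a conditional entropy model can look like, here is a hedged PyTorch sketch: a small network predicts the mean and scale of a discretized Gaussian for each quantized coefficient, conditioned on context features. The layer sizes, the Gaussian form, and the name ConditionalEntropyModel are expository assumptions, not 4DAC's actual design.

```python
# Hedged sketch of a conditional entropy model: context features condition a
# discretized Gaussian over quantized transform coefficients. All sizes and
# names here are illustrative assumptions.
import torch
import torch.nn as nn

class ConditionalEntropyModel(nn.Module):
    def __init__(self, ctx_dim=64, coef_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ctx_dim, 64), nn.ReLU(),
            nn.Linear(64, 2 * coef_dim),  # per-coefficient mean and raw scale
        )

    def forward(self, coeffs, context):
        """coeffs: (N, C) quantized coefficients; context: (N, ctx_dim)."""
        mean, raw_scale = self.net(context).chunk(2, dim=-1)
        scale = nn.functional.softplus(raw_scale) + 1e-6
        gauss = torch.distributions.Normal(mean, scale)
        # Probability mass of each integer bin [c - 0.5, c + 0.5].
        pmf = gauss.cdf(coeffs + 0.5) - gauss.cdf(coeffs - 0.5)
        return -torch.log2(pmf.clamp_min(1e-9)).sum()  # estimated bits
```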
arXiv Detail & Related papers (2022-04-25T15:30:06Z)
- A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion [69.32451612060214]
Real-scanned 3D point clouds are often incomplete, and it is important to recover complete point clouds for downstream applications.
Most existing point cloud completion methods use Chamfer Distance (CD) loss for training.
We propose a novel Point Diffusion-Refinement (PDR) paradigm for point cloud completion.
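For reference, a common (squared) form of the Chamfer Distance mentioned above is the symmetric average nearest-neighbor distance between the two clouds; individual papers may use slight variants.

```python
# Common squared-Chamfer-Distance variant between two point clouds.
import numpy as np

def chamfer_distance(p, q):
    """p: (N, 3), q: (M, 3); symmetric squared Chamfer Distance."""
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)  # (N, M) pairwise
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```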
arXiv Detail & Related papers (2021-12-07T06:59:06Z)
- DeepCompress: Efficient Point Cloud Geometry Compression [1.808877001896346]
We propose a more efficient deep learning-based encoder architecture for point cloud compression.
We show that incorporating the learned activation function from Computationally Efficient Neural Image Compression (CENIC) yields dramatic gains in efficiency and performance.
Our proposed modifications outperform the baseline approaches by a small margin in terms of Bjontegaard delta rate and PSNR values.
arXiv Detail & Related papers (2021-06-02T23:18:11Z)
- Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds [24.85151376535356]
A spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator.
The proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.
arXiv Detail & Related papers (2020-11-27T15:35:12Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where a cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
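To illustrate the cylindrical partition idea, here is a minimal Python sketch that bins points by radius, azimuth, and height rather than on a uniform Cartesian grid; the bin counts and ranges are illustrative assumptions.

```python
# Minimal sketch of a cylindrical partition: voxel cells grow with distance,
# matching how LiDAR point density falls off. Bin counts/ranges are assumed.
import numpy as np

def cylindrical_voxel_indices(points, r_bins=480, a_bins=360, z_bins=32,
                              r_max=50.0, z_min=-3.0, z_max=2.0):
    """points: (N, 3) xyz; returns (N, 3) integer (radius, angle, z) indices."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2)
    angle = np.arctan2(y, x)  # azimuth in [-pi, pi)
    ri = np.clip((r / r_max * r_bins).astype(int), 0, r_bins - 1)
    ai = np.clip(((angle + np.pi) / (2 * np.pi) * a_bins).astype(int), 0, a_bins - 1)
    zi = np.clip(((z - z_min) / (z_max - z_min) * z_bins).astype(int), 0, z_bins - 1)
    return np.stack([ri, ai, zi], axis=1)
```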
arXiv Detail & Related papers (2020-11-19T18:53:11Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and 3D cylinder convolution-based framework, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision [68.35777836993212]
We propose a Pseudo-LiDAR point cloud network to generate temporally and spatially high-quality point cloud sequences.
By exploiting the scene flow between point clouds, the proposed network is able to learn a more accurate representation of the 3D spatial motion relationship.
arXiv Detail & Related papers (2020-06-20T03:11:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.