D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction
- URL: http://arxiv.org/abs/2205.01135v1
- Date: Mon, 2 May 2022 18:10:45 GMT
- Title: D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction
- Authors: Tingyu Fan, Linyao Gao, Yiling Xu, Zhu Li and Dong Wang
- Abstract summary: This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point Cloud Compression network.
It compensates and compresses the DPC geometry with 3D motion estimation and motion compensation in the feature space.
Experimental results show that the proposed D-DPCC framework achieves an average 76% BD-Rate (Bjontegaard Delta Rate) gain against the state-of-the-art Video-based Point Cloud Compression (V-PCC) v13 in inter mode.
- Score: 18.897023700334458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The non-uniformly distributed nature of the 3D dynamic point cloud (DPC)
brings significant challenges to its efficient inter-frame compression.
This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point
Cloud Compression (D-DPCC) network to compensate and compress the DPC geometry
with 3D motion estimation and motion compensation in the feature space. In the
proposed D-DPCC network, we design a Multi-scale Motion Fusion (MMF)
module to accurately estimate the 3D optical flow between the feature
representations of adjacent point cloud frames. Specifically, we utilize a 3D
sparse convolution-based encoder to obtain the latent representation for motion
estimation in the feature space and introduce the proposed MMF module for fused
3D motion embedding. Besides, for motion compensation, we propose a 3D
Adaptively Weighted Interpolation (3DAWI) algorithm with a penalty coefficient
to adaptively decrease the impact of distant neighbors. We compress the motion
embedding and the residual with a lossy autoencoder-based network. To our
knowledge, this paper is the first work proposing an end-to-end deep dynamic
point cloud compression framework. Experimental results show that the
proposed D-DPCC framework achieves an average 76% BD-Rate (Bjontegaard Delta
Rate) gain against the state-of-the-art Video-based Point Cloud Compression
(V-PCC) v13 in inter mode.
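For intuition, below is a minimal Python sketch of how an adaptively weighted interpolation in the spirit of 3DAWI can warp previous-frame features onto current-frame coordinates. The inverse-distance weights, the brute-force kNN, and the alpha-clamped normalizer are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: adaptively weighted feature interpolation (3DAWI-style).
# The weighting scheme and penalty handling are assumptions for illustration.
import numpy as np

def adaptively_weighted_interpolation(targets, sources, feats, k=3, alpha=2.0, eps=1e-8):
    """targets: (M, 3) query coordinates (current frame, after motion shift)
    sources: (N, 3) support coordinates (previous frame)
    feats:   (N, C) features attached to the source points
    alpha:   penalty coefficient that attenuates queries whose neighbors
             are all distant (isolated points)"""
    # Brute-force kNN: (M, N) pairwise distances, then the k smallest per row.
    d = np.linalg.norm(targets[:, None, :] - sources[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]                # (M, k) neighbor indices
    knn_d = np.take_along_axis(d, idx, axis=1)        # (M, k) neighbor distances

    w = 1.0 / (knn_d + eps)                           # inverse-distance weights
    # Clamping the normalizer by alpha adaptively decreases the impact of
    # distant neighbors: a small total weight attenuates the warped features.
    norm = np.maximum(w.sum(axis=1, keepdims=True), alpha)
    return (w[..., None] * feats[idx]).sum(axis=1) / norm   # (M, C)
```

The same idea carries over to sparse-tensor features; alpha trades tolerance to isolated points against attenuation of their features.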
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where pixel values correspond to the 3D coordinates of points.
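To make the representation concrete, here is a toy Python sketch that packs one frame into an H x W coordinate image; the naive lexicographic ordering is a placeholder, since SPCVs learn a spatially smooth, temporally consistent mapping instead.

```python
# Toy illustration of the SPCV idea: an H x W "image" whose three channels
# store xyz coordinates. The ordering below is a naive placeholder.
import numpy as np

def points_to_coordinate_image(points, H, W):
    """points: (H*W, 3) xyz array; returns an (H, W, 3) coordinate image."""
    # Placeholder ordering: lexicographic sort so nearby pixels tend to
    # hold nearby points; SPCVs learn this mapping instead.
    order = np.lexsort((points[:, 2], points[:, 1], points[:, 0]))
    return points[order].reshape(H, W, 3)

frame = np.random.rand(64 * 64, 3).astype(np.float32)  # toy frame, 4096 points
img = points_to_coordinate_image(frame, 64, 64)        # usable by 2D video tools
```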
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- 4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks [29.615723135027096]
We propose a new solution for upscaling and restoration of time-varying 3D video point clouds after they have been compressed.
Our model consists of a specifically designed Graph Convolutional Network (GCN) that combines Dynamic Edge Convolution and Graph Attention Networks.
arXiv Detail & Related papers (2023-06-01T18:43:16Z)
- Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame Block Matching [35.80653765524654]
3D dynamic point cloud (DPC) compression relies on mining its temporal context.
This paper proposes a learning-based DPC compression framework via a hierarchical block-matching-based inter-prediction module.
arXiv Detail & Related papers (2023-05-09T11:44:13Z)
- DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets [95.84755169585492]
We present Dynamic Sparse Voxel Transformer (DSVT), a single-stride window-based voxel Transformer backbone for outdoor 3D perception.
Our model achieves state-of-the-art performance across a broad range of 3D perception tasks.
arXiv Detail & Related papers (2023-01-15T09:31:58Z)
- 4DAC: Learning Attribute Compression for Dynamic Point Clouds [37.447460254690135]
We study the attribute (e.g., color) compression of dynamic point clouds and present a learning-based framework, termed 4DAC.
To reduce temporal redundancy within the data, we first build the 3D motion estimation and motion compensation modules with deep neural networks.
In addition, we propose a deep conditional entropy model to estimate the probability distribution of the transformed coefficients.
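As a rough illustration of what a conditional entropy model can look like, here is a hedged PyTorch sketch: a small network predicts the mean and scale of a discretized Gaussian for each quantized coefficient, conditioned on context features. The layer sizes, the Gaussian form, and the name ConditionalEntropyModel are expository assumptions, not 4DAC's actual design.

```python
# Hedged sketch of a conditional entropy model: context features condition a
# discretized Gaussian over quantized transform coefficients. All sizes and
# names here are illustrative assumptions.
import torch
import torch.nn as nn

class ConditionalEntropyModel(nn.Module):
    def __init__(self, ctx_dim=64, coef_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ctx_dim, 64), nn.ReLU(),
            nn.Linear(64, 2 * coef_dim),  # per-coefficient mean and raw scale
        )

    def forward(self, coeffs, context):
        """coeffs: (N, C) quantized coefficients; context: (N, ctx_dim)."""
        mean, raw_scale = self.net(context).chunk(2, dim=-1)
        scale = nn.functional.softplus(raw_scale) + 1e-6
        gauss = torch.distributions.Normal(mean, scale)
        # Probability mass of each integer bin [c - 0.5, c + 0.5].
        pmf = gauss.cdf(coeffs + 0.5) - gauss.cdf(coeffs - 0.5)
        return -torch.log2(pmf.clamp_min(1e-9)).sum()  # estimated bits
```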
arXiv Detail & Related papers (2022-04-25T15:30:06Z)
- A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion [69.32451612060214]
Real-scanned 3D point clouds are often incomplete, and it is important to recover complete point clouds for downstream applications.
Most existing point cloud completion methods use Chamfer Distance (CD) loss for training.
We propose a novel Point Diffusion-Refinement (PDR) paradigm for point cloud completion.
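For reference, a common (squared) form of the Chamfer Distance mentioned above is the symmetric average nearest-neighbor distance between the two clouds; individual papers may use slight variants.

```python
# Common squared-Chamfer-Distance variant between two point clouds.
import numpy as np

def chamfer_distance(p, q):
    """p: (N, 3), q: (M, 3); symmetric squared Chamfer Distance."""
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)  # (N, M) pairwise
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```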
arXiv Detail & Related papers (2021-12-07T06:59:06Z)
- DeepCompress: Efficient Point Cloud Geometry Compression [1.808877001896346]
We propose a more efficient deep learning-based encoder architecture for point cloud compression.
We show that incorporating the learned activation function from Computationally Efficient Neural Image Compression (CENIC) yields dramatic gains in efficiency and performance.
Our proposed modifications outperform the baseline approaches by a small margin in terms of Bjontegaard delta rate and PSNR values.
arXiv Detail & Related papers (2021-06-02T23:18:11Z)
- Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds [24.85151376535356]
A spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator.
The proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.
arXiv Detail & Related papers (2020-11-27T15:35:12Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where a cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
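To illustrate the cylindrical partition idea, here is a minimal Python sketch that bins points by radius, azimuth, and height rather than on a uniform Cartesian grid; the bin counts and ranges are illustrative assumptions.

```python
# Minimal sketch of a cylindrical partition: voxel cells grow with distance,
# matching how LiDAR point density falls off. Bin counts/ranges are assumed.
import numpy as np

def cylindrical_voxel_indices(points, r_bins=480, a_bins=360, z_bins=32,
                              r_max=50.0, z_min=-3.0, z_max=2.0):
    """points: (N, 3) xyz; returns (N, 3) integer (radius, angle, z) indices."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2)
    angle = np.arctan2(y, x)  # azimuth in [-pi, pi)
    ri = np.clip((r / r_max * r_bins).astype(int), 0, r_bins - 1)
    ai = np.clip(((angle + np.pi) / (2 * np.pi) * a_bins).astype(int), 0, a_bins - 1)
    zi = np.clip(((z - z_min) / (z_max - z_min) * z_bins).astype(int), 0, z_bins - 1)
    return np.stack([ri, ai, zi], axis=1)
```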
arXiv Detail & Related papers (2020-11-19T18:53:11Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and 3D cylinder convolution-based framework, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision [68.35777836993212]
We propose a Pseudo-LiDAR point cloud network to generate temporally and spatially high-quality point cloud sequences.
By exploiting the scene flow between point clouds, the proposed network is able to learn a more accurate representation of the 3D spatial motion relationship.
arXiv Detail & Related papers (2020-06-20T03:11:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.