Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame
Block Matching
- URL: http://arxiv.org/abs/2305.05356v2
- Date: Tue, 16 May 2023 05:25:00 GMT
- Title: Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame
Block Matching
- Authors: Shuting Xia, Tingyu Fan, Yiling Xu, Jenq-Neng Hwang, Zhu Li
- Abstract summary: 3D dynamic point cloud (DPC) compression relies on mining its temporal context.
This paper proposes a learning-based DPC compression framework built on a hierarchical block-matching-based inter-prediction module.
- Score: 35.80653765524654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D dynamic point cloud (DPC) compression relies on mining its temporal
context, which faces significant challenges due to DPC's sparsity and
non-uniform structure. Existing methods are limited in capturing sufficient
temporal dependencies. Therefore, this paper proposes a learning-based DPC
compression framework with a hierarchical block-matching-based inter-prediction
module that compensates and compresses the DPC geometry in latent space.
Specifically, we propose a hierarchical motion estimation and motion
compensation (Hie-ME/MC) framework for flexible inter-prediction, which
dynamically selects the granularity of optical flow to encapsulate the motion
information accurately. To improve the motion estimation efficiency of the
proposed inter-prediction module, we further design a KNN-attention block
matching (KABM) network that determines the impact of potential corresponding
points based on the geometry and feature correlation. Finally, we compress the
residual and the multi-scale optical flow with a fully-factorized deep entropy
model. Experimental results on the MPEG-specified Owlii Dynamic Human Dynamic
Point Cloud (Owlii) dataset show that our framework outperforms the previous
state-of-the-art methods and the MPEG standard V-PCC v18 in inter-frame
low-delay mode.
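As a rough illustration of the KNN-attention idea described in the abstract, the following Python sketch soft-matches each point of the current frame to its k nearest points in a reference frame, weighting candidates by both geometric distance and feature correlation. It is not the authors' KABM network: the function name `knn_attention_match`, the scoring rule (scaled dot product minus squared distance), and the toy inputs are all assumptions made for illustration.

```python
import math


def knn_attention_match(cur_xyz, cur_feat, ref_xyz, ref_feat, k=3):
    """Hypothetical sketch: soft-match current points to k nearest reference points.

    Returns per-point attention weights over the k neighbours and a
    weighted-average displacement (a crude stand-in for a motion vector).
    """
    weights, flows = [], []
    for p, f in zip(cur_xyz, cur_feat):
        # Squared Euclidean distance from this point to every reference point.
        d2 = [sum((a - b) ** 2 for a, b in zip(p, q)) for q in ref_xyz]
        # Indices of the k nearest reference points (nearest first).
        nn = sorted(range(len(ref_xyz)), key=d2.__getitem__)[:k]
        # Assumed score: feature correlation (scaled dot product) minus distance.
        scale = math.sqrt(len(f))
        logits = [
            sum(a * b for a, b in zip(f, ref_feat[j])) / scale - d2[j] for j in nn
        ]
        # Numerically stable softmax over the k candidates.
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        w = [e / total for e in exps]
        # Attention-weighted displacement towards the matched neighbours.
        flow = [
            sum(wi * (ref_xyz[j][c] - p[c]) for wi, j in zip(w, nn))
            for c in range(len(p))
        ]
        weights.append(w)
        flows.append(flow)
    return weights, flows


if __name__ == "__main__":
    # Toy data (made up): two current points, three reference points.
    cur_xyz = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    cur_feat = [[1.0, 0.0], [0.0, 1.0]]
    ref_xyz = [[0.1, 0.0, 0.0], [1.1, 0.0, 0.0], [5.0, 5.0, 5.0]]
    ref_feat = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    w, flow = knn_attention_match(cur_xyz, cur_feat, ref_xyz, ref_feat, k=2)
    print(w)
    print(flow)
```

In this sketch, a candidate that is both close in space and similar in feature receives most of the weight. A learned network would presumably replace the hand-written score with trainable layers.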
Related papers
- U-Motion: Learned Point Cloud Video Compression with U-Structured Motion Estimation [9.528405963599997]
Point cloud video (PCV) is a versatile 3D representation of dynamic scenes with many emerging applications.
This paper introduces U-Motion, a learning-based compression scheme for both PCV geometry and attributes.
arXiv Detail & Related papers (2024-11-21T07:17:01Z)
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs the frame attention mechanism to identify the most significant frames and the skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z)
- On Exploring PDE Modeling for Point Cloud Video Representation Learning [48.02197741709501]
We introduce Motion PointNet composed of a PointNet-like encoder and a PDE-solving module.
Our Motion PointNet achieves an impressive accuracy of 97.52% on the MSRAction-3D dataset.
arXiv Detail & Related papers (2024-04-06T19:50:48Z)
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
- Spatial-Temporal Transformer based Video Compression Framework [44.723459144708286]
We propose a novel Spatial-Temporal Transformer based Video Compression (STT-VC) framework.
It contains a Relaxed Deformable Transformer (RDT) with Uformer based offsets estimation for motion estimation and compensation, a Multi-Granularity Prediction (MGP) module based on multi-reference frames for prediction refinement, and a Spatial Feature Distribution prior based Transformer (SFD-T) for efficient temporal-spatial joint residual compression.
Experimental results demonstrate that our method achieves the best result with 13.5% BD-Rate saving over VTM.
arXiv Detail & Related papers (2023-09-21T09:23:13Z)
- Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image Compression [63.56922682378755]
We focus on extending spatial aggregation capability and propose a dynamic kernel-based transform coding.
The proposed adaptive aggregation generates kernel offsets to capture valid information in the content-conditioned range to help transform.
Experimental results demonstrate that our method achieves superior rate-distortion performance on three benchmarks compared to the state-of-the-art learning-based methods.
arXiv Detail & Related papers (2023-08-17T01:34:51Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction [18.897023700334458]
This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point Cloud Compression network.
It compensates and compresses the DPC geometry with 3D motion estimation and motion compensation in the feature space.
Experimental results show that the proposed D-DPCC framework achieves an average 76% BD-Rate (Bjontegaard Delta Rate) gain over state-of-the-art Video-based Point Cloud Compression (V-PCC) v13 in inter mode.
arXiv Detail & Related papers (2022-05-02T18:10:45Z)
- 4DAC: Learning Attribute Compression for Dynamic Point Clouds [37.447460254690135]
We study the attribute (e.g., color) compression of dynamic point clouds and present a learning-based framework, termed 4DAC.
To reduce temporal redundancy within data, we first build the 3D motion estimation and motion compensation modules with deep neural networks.
In addition, we also propose a deep conditional entropy model to estimate the probability distribution of the transformed coefficients.
arXiv Detail & Related papers (2022-04-25T15:30:06Z)
- Spatiotemporal Entropy Model is All You Need for Learned Video Compression [9.227865598115024]
We propose a framework to compress raw-pixel frames (rather than residual images).
An entropy model is used to estimate the spatiotemporal redundancy in a latent space rather than at the pixel level.
Experiments showed that the proposed method outperforms state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2021-04-13T10:38:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.