Enhancing context models for point cloud geometry compression with context feature residuals and multi-loss
- URL: http://arxiv.org/abs/2407.08520v1
- Date: Thu, 11 Jul 2024 14:08:37 GMT
- Title: Enhancing context models for point cloud geometry compression with context feature residuals and multi-loss
- Authors: Chang Sun, Hui Yuan, Shuai Li, Xin Lu, Raouf Hamzaoui,
- Abstract summary: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label.
We introduce the context feature residuals into the context model to amplify the differences between contexts.
We also add a multi-layer perception branch, that uses the mean squared error between its output and node occupancy as a loss function.
- Score: 17.88391386335647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficult for the context model to accurately predict the probability distribution of node occupancy. Second, as the one-hot encoding is not the actual probability distribution of node occupancy, the cross-entropy loss function is inaccurate. To address these problems, we propose a general structure that can enhance existing context models. We introduce the context feature residuals into the context model to amplify the differences between contexts. We also add a multi-layer perception branch, that uses the mean squared error between its output and node occupancy as a loss function to provide accurate gradients in backpropagation. We validate our method by showing that it can improve the performance of an octree-based model (OctAttention) and a voxel-based model (VoxelDNN) on the object point cloud datasets MPEG 8i and MVUB, as well as the LiDAR point cloud dataset SemanticKITTI.
Related papers
- SIGMA:Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modelling ( SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z) - Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction [12.074555015414886]
In point cloud geometry compression, most octreebased context models use the cross-entropy between the onehot encoding of node occupancy and the probability distribution predicted by the context model as the loss.
We first analyze why the cross-entropy loss function fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution.
We propose an attention-based child node number prediction (ACNP) module to enhance the context models.
arXiv Detail & Related papers (2024-07-11T14:16:41Z) - Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation [56.79064699832383]
We establish a Cloud-Edge Elastic Model Adaptation (CEMA) paradigm in which the edge models only need to perform forward propagation.
In our CEMA, to reduce the communication burden, we devise two criteria to exclude unnecessary samples from uploading to the cloud.
arXiv Detail & Related papers (2024-02-27T08:47:19Z) - Unsupervised Deep Probabilistic Approach for Partial Point Cloud
Registration [74.53755415380171]
Deep point cloud registration methods face challenges to partial overlaps and rely on labeled data.
We propose UDPReg, an unsupervised deep probabilistic registration framework for point clouds with partial overlaps.
Our UDPReg achieves competitive performance on the 3DMatch/3DLoMatch and ModelNet/ModelLoNet benchmarks.
arXiv Detail & Related papers (2023-03-23T14:18:06Z) - Interpretations Steered Network Pruning via Amortized Inferred Saliency
Maps [85.49020931411825]
Convolutional Neural Networks (CNNs) compression is crucial to deploying these models in edge devices with limited resources.
We propose to address the channel pruning problem from a novel perspective by leveraging the interpretations of a model to steer the pruning process.
We tackle this challenge by introducing a selector model that predicts real-time smooth saliency masks for pruned models.
arXiv Detail & Related papers (2022-09-07T01:12:11Z) - OctAttention: Octree-based Large-scale Contexts Model for Point Cloud
Compression [36.77271904751208]
OctAttention employs the octree structure, a memory-efficient representation for point clouds.
Our approach saves 95% coding time compared to the voxel-based baseline.
Compared to the previous state-of-the-art works, our approach obtains a 10%-35% BD-Rate gain on the LiDAR benchmark.
arXiv Detail & Related papers (2022-02-12T10:06:12Z) - A Deep Latent Space Model for Graph Representation Learning [10.914558012458425]
We propose a Deep Latent Space Model (DLSM) for directed graphs to incorporate the traditional latent variable based generative model into deep learning frameworks.
Our proposed model consists of a graph convolutional network (GCN) encoder and a decoder, which are layer-wise connected by a hierarchical variational auto-encoder architecture.
Experiments on real-world datasets show that the proposed model achieves the state-of-the-art performances on both link prediction and community detection tasks.
arXiv Detail & Related papers (2021-06-22T12:41:19Z) - Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z) - Learning Context-Based Non-local Entropy Modeling for Image Compression [140.64888994506313]
In this paper, we propose a non-local operation for context modeling by employing the global similarity within the context.
The entropy model is further adopted as the rate loss in a joint rate-distortion optimization.
Considering that the width of the transforms is essential in training low distortion models, we finally produce a U-Net block in the transforms to increase the width with manageable memory consumption and time complexity.
arXiv Detail & Related papers (2020-05-10T13:28:18Z) - Residual Correlation in Graph Neural Network Regression [39.54530450932135]
We show that conditional independence assumption severely limits predictive power.
We address this problem with an interpretable and efficient framework.
Our framework achieves substantially higher accuracy than competing baselines.
arXiv Detail & Related papers (2020-02-19T16:32:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.