SEDD-PCC: A Single Encoder-Dual Decoder Framework For End-To-End Learned Point Cloud Compression
- URL: http://arxiv.org/abs/2505.16709v1
- Date: Thu, 22 May 2025 14:11:24 GMT
- Title: SEDD-PCC: A Single Encoder-Dual Decoder Framework For End-To-End Learned Point Cloud Compression
- Authors: Kai Hsiang Hsieh, Monyneath Yim, Jui Chiu Chiang
- Abstract summary: We propose SEDD-PCC, an end-to-end learning-based framework for lossy point cloud compression. We employ a single encoder to extract shared geometric and attribute features into a unified latent space, followed by dual specialized decoders that sequentially reconstruct geometry and attributes. With its simple yet effective design, SEDD-PCC provides an efficient and practical solution for point cloud compression.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To encode point clouds containing both geometry and attributes, most learning-based compression schemes treat geometry and attribute coding separately, employing distinct encoders and decoders. This not only increases computational complexity but also fails to fully exploit shared features between geometry and attributes. To address this limitation, we propose SEDD-PCC, an end-to-end learning-based framework for lossy point cloud compression that jointly compresses geometry and attributes. SEDD-PCC employs a single encoder to extract shared geometric and attribute features into a unified latent space, followed by dual specialized decoders that sequentially reconstruct geometry and attributes. Additionally, we incorporate knowledge distillation to enhance feature representation learning from a teacher model, further improving coding efficiency. With its simple yet effective design, SEDD-PCC provides an efficient and practical solution for point cloud compression. Comparative evaluations against both rule-based and learning-based methods demonstrate its competitive performance, highlighting SEDD-PCC as a promising AI-driven compression approach.
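The core architectural idea of the abstract can be sketched in a few lines: one shared encoder maps each colored point (geometry plus attributes) into a unified latent, and two specialized decoder heads reconstruct geometry and attributes from that same latent. This is an illustrative sketch only, not the authors' implementation; the layer sizes, the plain linear layers, and the variable names are all assumptions made for the example.

```python
# Toy single-encoder, dual-decoder sketch (hypothetical sizes, random weights).
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Return a random linear layer as a (weight, bias) pair."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)

def forward(x, layer):
    W, b = layer
    return x @ W + b

# Per-point input: xyz geometry (3 dims) concatenated with rgb attributes (3 dims).
points = rng.standard_normal((1024, 6))

encoder = linear(6, 16)        # single shared encoder -> unified latent space
geom_decoder = linear(16, 3)   # decoder 1: reconstructs geometry
attr_decoder = linear(16, 3)   # decoder 2: reconstructs attributes

latent = forward(points, encoder)               # one latent feeds both decoders
geometry_hat = forward(latent, geom_decoder)    # in the paper, geometry is decoded first,
attributes_hat = forward(latent, attr_decoder)  # then attributes

print(latent.shape, geometry_hat.shape, attributes_hat.shape)
```

In the actual framework the encoder and decoders would be learned sparse-convolutional networks and the latent would pass through an entropy model for rate control; the sketch only shows how a single latent serves both reconstruction tasks, which is what saves the duplicated encoder of geometry/attribute-separate schemes.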
Related papers
- LaCo: Efficient Layer-wise Compression of Visual Tokens for Multimodal Large Language Models [62.240460476785934]
We propose LaCo (Layer-wise Visual Token Compression), a novel framework that enables effective token compression within the intermediate layers of the vision encoder. LaCo introduces two core components: 1) a layer-wise pixel-shuffle mechanism that systematically merges adjacent tokens through space-to-channel transformations, and 2) a residual learning architecture with non-parametric shortcuts.
arXiv Detail & Related papers (2025-07-03T03:42:54Z)
- Semi-supervised Semantic Segmentation with Multi-Constraint Consistency Learning [81.02648336552421]
We propose a Multi-Constraint Consistency Learning approach to facilitate the staged enhancement of the encoder and decoder. Self-adaptive feature masking and noise injection are designed in an instance-specific manner to perturb the features for robust learning of the decoder. Experimental results on Pascal VOC2012 and Cityscapes datasets demonstrate that our proposed MCCL achieves new state-of-the-art performance.
arXiv Detail & Related papers (2025-03-23T03:21:33Z)
- Deep-JGAC: End-to-End Deep Joint Geometry and Attribute Compression for Dense Colored Point Clouds [32.891169081810574]
We propose an end-to-end Deep Joint Geometry and Attribute point cloud Compression framework. It exploits the correlation between the geometry and attribute for high compression efficiency. The proposed Deep-JGAC achieves an average of 82.96%, 36.46%, 41.72%, and 31.16% bit-rate reductions.
arXiv Detail & Related papers (2025-02-25T08:01:57Z)
- Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds [18.244200436103156]
We propose an efficient attention-based method for lossy compression of point cloud attributes, leveraging an autoencoder architecture.
Experiments show that our method achieves an average improvement of 1.15 dB and 2.13 dB in BD-PSNR on the Y channel and YUV channels, respectively.
arXiv Detail & Related papers (2024-10-23T12:32:21Z)
- Point Cloud Compression with Bits-back Coding [32.9521748764196]
This paper focuses on using a deep learning-based probabilistic model to estimate the Shannon entropy of the point cloud information.
Once the entropy of the point cloud dataset is estimated, we use the learned CVAE model to compress the geometric attributes of the point clouds.
The novelty of our method lies in utilizing the learned latent-variable model of the CVAE, via bits-back coding, to compress the point cloud data.
arXiv Detail & Related papers (2024-10-09T06:34:48Z)
- The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine [49.16996486119006]
Deep learning has emerged as a powerful tool in point cloud coding. JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding standard. This paper provides a complete technical description of the JPEG PCC standard.
arXiv Detail & Related papers (2024-09-12T15:20:23Z)
- Learned Compression of Point Cloud Geometry and Attributes in a Single Model through Multimodal Rate-Control [2.7077560296908416]
We learn joint compression of geometry and attributes using a single, adaptive autoencoder model.
Our evaluation shows comparable performance to state-of-the-art compression methods for geometry and attributes.
arXiv Detail & Related papers (2024-08-01T14:31:06Z)
- Geometric Prior Based Deep Human Point Cloud Geometry Compression [67.49785946369055]
We leverage the human geometric prior to remove geometric redundancy from point clouds.
We can envisage high-resolution human point clouds as a combination of geometric priors and structural deviations.
The proposed framework can operate in a plug-and-play fashion with existing learning-based point cloud compression methods.
arXiv Detail & Related papers (2023-05-02T10:35:20Z)
- Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders [89.29256833403169]
We introduce Kullback Leibler Alignment of Embeddings (KALE), an efficient and accurate method for increasing the inference efficiency of dense retrieval methods.
KALE extends traditional Knowledge Distillation after bi-encoder training, allowing for effective query encoder compression without full retraining or index generation.
Using KALE and asymmetric training, we can generate models which exceed the performance of DistilBERT while offering 3x faster inference.
arXiv Detail & Related papers (2023-03-31T15:44:13Z)
- SoftPool++: An Encoder-Decoder Network for Point Cloud Completion [93.54286830844134]
We propose a novel convolutional operator for the task of point cloud completion.
The proposed operator does not require any max-pooling or voxelization operation.
We show that our approach achieves state-of-the-art performance in shape completion at low and high resolutions.
arXiv Detail & Related papers (2022-05-08T15:31:36Z)
- Multiscale Point Cloud Geometry Compression [29.605320327889142]
We propose a multiscale end-to-end learning framework which hierarchically reconstructs the 3D point cloud geometry.
The framework is developed on top of a sparse-convolution-based autoencoder for point cloud compression and reconstruction.
arXiv Detail & Related papers (2020-11-07T16:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.