ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision
- URL: http://arxiv.org/abs/2504.14240v1
- Date: Sat, 19 Apr 2025 09:31:37 GMT
- Title: ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision
- Authors: Xie Liang, Gao Wei, Zhenghui Ming, Li Ge,
- Abstract summary: We introduce a novel Region of Interest-guided Point Cloud Geometry Compression (RPCGC) method for human and machine vision.<n>Our framework employs a dual-branch parallel structure, where the base layer encodes and decodes a simplified version of the point cloud.<n>We intricately apply these mask details in the Rate-Distortion (RD) optimization process, with each point weighted in the distortion calculation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Point cloud data is pivotal in applications like autonomous driving, virtual reality, and robotics. However, its substantial volume poses significant challenges in storage and transmission. In order to obtain a high compression ratio, crucial semantic details usually confront severe damage, leading to difficulties in guaranteeing the accuracy of downstream tasks. To tackle this problem, we are the first to introduce a novel Region of Interest (ROI)-guided Point Cloud Geometry Compression (RPCGC) method for human and machine vision. Our framework employs a dual-branch parallel structure, where the base layer encodes and decodes a simplified version of the point cloud, and the enhancement layer refines this by focusing on geometry details. Furthermore, the residual information of the enhancement layer undergoes refinement through an ROI prediction network. This network generates mask information, which is then incorporated into the residuals, serving as a strong supervision signal. Additionally, we intricately apply these mask details in the Rate-Distortion (RD) optimization process, with each point weighted in the distortion calculation. Our loss function includes RD loss and detection loss to better guide point cloud encoding for the machine. Experiment results demonstrate that RPCGC achieves exceptional compression performance and better detection accuracy (10% gain) than some learning-based compression methods at high bitrates in ScanNet and SUN RGB-D datasets.
Related papers
- PCE-GAN: A Generative Adversarial Network for Point Cloud Attribute Quality Enhancement based on Optimal Transport [56.56430888985025]
We propose a generative adversarial network for point cloud quality enhancement (PCE-GAN)
The generator consists of a local feature extraction (LFE) unit, a global spatial correlation (GSC) unit and a feature squeeze unit.
The discriminator computes the deviation between the probability distributions of the enhanced point cloud and the original point cloud, guiding the generator to achieve high quality reconstruction.
arXiv Detail & Related papers (2025-02-26T07:34:33Z) - Rendering-Oriented 3D Point Cloud Attribute Compression using Sparse Tensor-based Transformer [52.40992954884257]
3D visualization techniques have fundamentally transformed how we interact with digital content.
Massive data size of point clouds presents significant challenges in data compression.
We propose an end-to-end deep learning framework that seamlessly integrates PCAC with differentiable rendering.
arXiv Detail & Related papers (2024-11-12T16:12:51Z) - Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution [55.9977636042469]
We propose a novel framework, termed geometry-decoupled network (GDNet), for compressed depth map super-resolution.<n>It decouples the high-quality depth map reconstruction process by handling global and detailed geometric features separately.<n>Our solution significantly outperforms current methods in terms of geometric consistency and detail recovery.
arXiv Detail & Related papers (2024-11-05T16:37:30Z) - Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds [18.244200436103156]
We propose an efficient attention-based method for lossy compression of point cloud attributes leveraging on an autoencoder architecture.
Experiments show that our method achieves an average improvement of 1.15 dB and 2.13 dB in BD-PSNR of Y channel and YUV channel, respectively.
arXiv Detail & Related papers (2024-10-23T12:32:21Z) - Convolutional variational autoencoders for secure lossy image compression in remote sensing [47.75904906342974]
This study investigates image compression based on convolutional variational autoencoders (CVAE)
CVAEs have been demonstrated to outperform conventional compression methods such as JPEG2000 by a substantial margin on compression benchmark datasets.
arXiv Detail & Related papers (2024-04-03T15:17:29Z) - 3D Point Cloud Compression with Recurrent Neural Network and Image
Compression Methods [0.0]
Storing and transmitting LiDAR point cloud data is essential for many AV applications.
Due to the sparsity and unordered structure of the data, it is difficult to compress point cloud data to a low volume.
We propose a new 3D-to-2D transformation which allows compression algorithms to efficiently exploit spatial correlations.
arXiv Detail & Related papers (2024-02-18T19:08:19Z) - Compression of Structured Data with Autoencoders: Provable Benefit of
Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input.
We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture.
We validate our findings on image datasets, such as CIFAR-10 and MNIST.
arXiv Detail & Related papers (2024-02-07T16:32:29Z) - Depth Monocular Estimation with Attention-based Encoder-Decoder Network
from Single Image [7.753378095194288]
Vision-based approaches have recently received much attention and can overcome these drawbacks.
In this work, we explore an extreme scenario in vision-based settings: estimate a depth map from one monocular image severely plagued by grid artifacts and blurry edges.
Our novel approach can find the focus of current image with minimal overhead and avoid losses of depth features.
arXiv Detail & Related papers (2022-10-24T23:01:25Z) - Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural networks, which can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different dataset, image compression using the MNIST dataset, and image denoising using fashion MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z) - GRNet: Gridding Residual Network for Dense Point Cloud Completion [54.43648460932248]
Estimating the complete 3D point cloud from an incomplete one is a key problem in many vision and robotics applications.
We propose a novel Gridding Residual Network (GRNet) for point cloud completion.
Experimental results indicate that the proposed GRNet performs favorably against state-of-the-art methods on the ShapeNet, Completion3D, and KITTI benchmarks.
arXiv Detail & Related papers (2020-06-06T02:46:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.