Related papers: Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization

URL: http://arxiv.org/abs/2411.07936v1
Date: Tue, 12 Nov 2024 17:05:18 GMT
Title: Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization
Authors: Ziyu Shan, Yujie Zhang, Yipeng Liu, Yiling Xu
Abstract summary: No-Reference Point Cloud Quality Assessment (NR-PCQA) aims to objectively assess the human perceptual quality of point clouds. We propose DisPA, a novel disentangled representation learning framework for NR-PCQA. We show that DisPA outperforms state-of-the-art methods on multiple PCQA datasets.
Score: 26.224150625323812
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: No-Reference Point Cloud Quality Assessment (NR-PCQA) aims to objectively assess the human perceptual quality of point clouds without relying on pristine-quality point clouds for reference. It is becoming increasingly significant with the rapid advancement of immersive media applications such as virtual reality (VR) and augmented reality (AR). However, current NR-PCQA models attempt to indiscriminately learn point cloud content and distortion representations within a single network, overlooking their distinct contributions to quality information. To address this issue, we propose DisPA, a novel disentangled representation learning framework for NR-PCQA. The framework trains a dual-branch disentanglement network to minimize mutual information (MI) between representations of point cloud content and distortion. Specifically, to fully disentangle representations, the two branches adopt different philosophies: the content-aware encoder is pretrained by a masked auto-encoding strategy, which can allow the encoder to capture semantic information from rendered images of distorted point clouds; the distortion-aware encoder takes a mini-patch map as input, which forces the encoder to focus on low-level distortion patterns. Furthermore, we utilize an MI estimator to estimate the tight upper bound of the actual MI and further minimize it to achieve explicit representation disentanglement. Extensive experimental results demonstrate that DisPA outperforms state-of-the-art methods on multiple PCQA datasets.
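
To make the MI-minimization step concrete, here is a minimal sketch of a sampled upper-bound estimate between two sets of features. The abstract does not say which estimator DisPA uses; this sketch assumes a CLUB-style (contrastive log-ratio upper bound) estimator as a stand-in. The names `gaussian_log_density`, `club_upper_bound`, and `predict` are hypothetical, the features are scalars for readability, and the conditional q(y|x) is a fixed unit-variance Gaussian rather than a trained network.

```python
import math


def gaussian_log_density(y, mean, sigma=1.0):
    """Log-density of N(mean, sigma^2) evaluated at y."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (y - mean) ** 2 / (2.0 * sigma ** 2))


def club_upper_bound(xs, ys, predict=lambda x: x, sigma=1.0):
    """Sampled CLUB-style estimate of an upper bound on I(X; Y).

    xs, ys  : paired scalar samples (stand-ins for content and
              distortion features; an assumption of this sketch).
    predict : hypothetical mean function of the variational
              conditional q(y | x) = N(predict(x), sigma^2).
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")

    # Average log q(y_i | x_i) over matched (joint) pairs.
    positive = sum(
        gaussian_log_density(ys[i], predict(xs[i]), sigma) for i in range(n)
    ) / n

    # Average log q(y_j | x_i) over all pairs (product of marginals).
    negative = sum(
        gaussian_log_density(ys[j], predict(xs[i]), sigma)
        for i in range(n)
        for j in range(n)
    ) / (n * n)

    return positive - negative


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    print(club_upper_bound(xs, xs))          # strongly dependent: about 1.25
    print(club_upper_bound(xs, [1.5] * 4))   # y carries no information: about 0.0
```

In a training loop, an estimate of this kind would be added to the loss so that the two encoders are pushed toward features with low estimated MI. The value is only a valid upper bound to the extent that q(y|x) approximates the true conditional, which is why such estimators are normally fitted alongside the encoders.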

Related papers

RBFIM: Perceptual Quality Assessment for Compressed Point Clouds Using Radial Basis Function Interpolation [58.04300937361664]
One of the main challenges in point cloud compression (PCC) is how to evaluate the perceived distortion so that the RB can be optimized for perceptual quality. We propose a novel assessment method, utilizing radial basis function (RBF) to convert discrete point features into a continuous feature function for the distorted point cloud.
arXiv Detail & Related papers (2025-03-18T11:25:55Z)
Rendering-Oriented 3D Point Cloud Attribute Compression using Sparse Tensor-based Transformer [52.40992954884257]
3D visualization techniques have fundamentally transformed how we interact with digital content. The massive data size of point clouds presents significant challenges for data compression. We propose an end-to-end deep learning framework that seamlessly integrates point cloud attribute compression (PCAC) with differentiable rendering.
arXiv Detail & Related papers (2024-11-12T16:12:51Z)
Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without an available reference. We propose a novel contrastive pre-training framework tailored for PCQA (CoPA). Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment [34.256276774430575]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference. We propose a self-supervised pre-training framework using masked autoencoders (PAME) to help the model learn useful representations without labels. Our method outperforms the state-of-the-art NR-PCQA methods on popular benchmarks in terms of prediction accuracy and generalizability.
arXiv Detail & Related papers (2024-03-15T07:01:33Z)
Simple Baselines for Projection-based Full-reference and No-reference Point Cloud Quality Assessment [60.2709006613171]
We propose simple baselines for projection-based point cloud quality assessment (PCQA). We use multi-projections obtained via a common cube-like projection process from the point clouds for both full-reference (FR) and no-reference (NR) PCQA tasks. Taking part in the ICIP 2023 PCVQA Challenge, we succeeded in achieving the top spot in four out of the five competition tracks.
arXiv Detail & Related papers (2023-10-26T04:42:57Z)
TCDM: Transformational Complexity Based Distortion Metric for Perceptual Point Cloud Quality Assessment [24.936061591860838]
The goal of objective point cloud quality assessment (PCQA) research is to develop metrics that measure point cloud quality in a consistent manner. We evaluate the point cloud quality by measuring the complexity of transforming the distorted point cloud back to its reference. The effectiveness of the proposed transformational complexity based distortion metric (TCDM) is evaluated through extensive experiments conducted on five public point cloud quality assessment databases.
arXiv Detail & Related papers (2022-10-10T13:20:51Z)
MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition [160.49403075559158]
We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework for point cloud action recognition. In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE. MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on the MSR-Action3
arXiv Detail & Related papers (2022-09-01T12:32:40Z)
MM-PCQA: Multi-Modal Learning for No-reference Point Cloud Quality Assessment [32.495387943305204]
We propose a novel no-reference point cloud quality assessment (NR-PCQA) metric in a multi-modal fashion. Specifically, we split the point clouds into sub-models to represent local geometry distortions such as point shift and down-sampling. To achieve these goals, the sub-models and projected images are encoded with point-based and image-based neural networks.
arXiv Detail & Related papers (2022-09-01T06:11:12Z)
Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks. We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation. We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
Reduced Reference Perceptual Quality Model and Application to Rate Control for 3D Point Cloud Compression [61.110938359555895]
In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bit rate. We propose a linear perceptual quality model whose variables are the V-PCC geometry and color quantization parameters. Subjective quality tests with 400 compressed 3D point clouds show that the proposed model correlates well with the mean opinion score. We show that for the same target bit rate, rate-distortion optimization based on the proposed model offers higher perceptual quality than rate-distortion optimization based on exhaustive search with a point-to-point objective quality metric.
arXiv Detail & Related papers (2020-11-25T12:42:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.