PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment
- URL: http://arxiv.org/abs/2403.10061v1
- Date: Fri, 15 Mar 2024 07:01:33 GMT
- Title: PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment
- Authors: Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Shan Liu,
- Abstract summary: No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference.
We propose a self-supervised pre-training framework using masked autoencoders (PAME) to help the model learn useful representations without labels.
Our method outperforms the state-of-the-art NR-PCQA methods on popular benchmarks in terms of prediction accuracy and generalizability.
- Score: 34.256276774430575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference, which has achieved remarkable performance due to the utilization of deep learning-based models. However, these data-driven models suffer from the scarcity of labeled data and perform unsatisfactorily in cross-dataset evaluations. To address this problem, we propose a self-supervised pre-training framework using masked autoencoders (PAME) to help the model learn useful representations without labels. Specifically, after projecting point clouds into images, our PAME employs dual-branch autoencoders, reconstructing masked patches from distorted images into the original patches within reference and distorted images. In this manner, the two branches can separately learn content-aware features and distortion-aware features from the projected images. Furthermore, in the model fine-tuning stage, the learned content-aware features serve as a guide to fuse the point cloud quality features extracted from different perspectives. Extensive experiments show that our method outperforms the state-of-the-art NR-PCQA methods on popular benchmarks in terms of prediction accuracy and generalizability.
Related papers
- From Images to Point Clouds: An Efficient Solution for Cross-media Blind Quality Assessment without Annotated Training [35.45364402708792]
We present a novel quality assessment method which can predict the perceptual quality of point clouds from new scenes without available annotations.
Recognizing the human visual system (HVS) as the decision-maker in quality assessment regardless of media types, we can emulate the evaluation criteria for human perception via neural networks.
We propose the distortion-guided biased feature alignment which integrates existing/estimated distortion distribution into the adversarial DA framework.
arXiv Detail & Related papers (2025-01-23T05:15:10Z) - Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization [26.224150625323812]
No-Reference Point Cloud Quality Assessment (NR-PCQA) aims to objectively assess the human perceptual quality of point clouds.
We propose DisPA, a novel disentangled representation learning framework for NR-PCQA.
We show that DisPA outperforms state-of-the-art methods on multiple PCQA datasets.
arXiv Detail & Related papers (2024-11-12T17:05:18Z) - Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
Contrastive learning (CL) for Vision Transformers (ViTs) in image domains has achieved performance comparable to CL for traditional convolutional backbones.
In 3D point cloud pretraining with ViTs, masked autoencoder (MAE) modeling remains dominant.
arXiv Detail & Related papers (2024-07-08T12:28:56Z) - Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z) - Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose a novel contrastive pre-training framework tailored for PCQA (CoPA)
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z) - Uncertainty-aware No-Reference Point Cloud Quality Assessment [25.543217625958462]
This work presents the first probabilistic architecture for no-reference point cloud quality assessment (PCQA)
The proposed method can model the quality judgingity of subjects through a tailored conditional variational autoencoder (AE)
Experiments indicate that our approach mimics previous cutting-edge methods by a large margin and exhibits cross-dataset experiments.
arXiv Detail & Related papers (2024-01-17T02:25:42Z) - Evaluating Point Cloud from Moving Camera Videos: A No-Reference Metric [58.309735075960745]
This paper explores the way of dealing with point cloud quality assessment (PCQA) tasks via video quality assessment (VQA) methods.
We generate the captured videos by rotating the camera around the point clouds through several circular pathways.
We extract both spatial and temporal quality-aware features from the selected key frames and the video clips through using trainable 2D-CNN and pre-trained 3D-CNN models.
arXiv Detail & Related papers (2022-08-30T08:59:41Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z) - No-Reference Image Quality Assessment via Transformers, Relative
Ranking, and Self-Consistency [38.88541492121366]
The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations.
We propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and self-attention mechanism in Transformers.
arXiv Detail & Related papers (2021-08-16T02:07:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.