PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment
- URL: http://arxiv.org/abs/2403.10061v1
- Date: Fri, 15 Mar 2024 07:01:33 GMT
- Title: PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment
- Authors: Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Shan Liu
- Abstract summary: No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference.
We propose a self-supervised pre-training framework using masked autoencoders (PAME) to help the model learn useful representations without labels.
Our method outperforms the state-of-the-art NR-PCQA methods on popular benchmarks in terms of prediction accuracy and generalizability.
- Score: 34.256276774430575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference, which has achieved remarkable performance due to the utilization of deep learning-based models. However, these data-driven models suffer from the scarcity of labeled data and perform unsatisfactorily in cross-dataset evaluations. To address this problem, we propose a self-supervised pre-training framework using masked autoencoders (PAME) to help the model learn useful representations without labels. Specifically, after projecting point clouds into images, our PAME employs dual-branch autoencoders, reconstructing masked patches from distorted images into the original patches within reference and distorted images. In this manner, the two branches can separately learn content-aware features and distortion-aware features from the projected images. Furthermore, in the model fine-tuning stage, the learned content-aware features serve as a guide to fuse the point cloud quality features extracted from different perspectives. Extensive experiments show that our method outperforms the state-of-the-art NR-PCQA methods on popular benchmarks in terms of prediction accuracy and generalizability.
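To make the dual-branch idea concrete, here is a minimal PyTorch sketch. It is an assumption-level illustration, not the authors' implementation: the module names, toy transformer sizes, and patchify/masking details are all invented for exposition. Each branch is a small masked autoencoder over patches of a distorted 2D projection; the content branch reconstructs the masked patches of the aligned reference image, while the distortion branch reconstructs the distorted image's own patches. The fine-tuning stage, where content-aware features guide multi-view fusion, is omitted here.

```python
import torch
import torch.nn as nn

P, D, N, KEEP = 16, 128, 196, 49      # patch size, dim, (224/16)^2 patches, 25% kept

def patchify(img, p=P):
    """(B, 3, H, W) -> (B, N, 3*p*p) non-overlapping patches."""
    b, c, h, w = img.shape
    x = img.unfold(2, p, p).unfold(3, p, p)        # (B, C, H/p, W/p, p, p)
    return x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)

class MAEBranch(nn.Module):
    """One masked-autoencoder branch: encode visible patches, decode all."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(3 * P * P, D)
        self.pos = nn.Parameter(torch.zeros(1, N, D))
        self.mask_tok = nn.Parameter(torch.zeros(1, 1, D))
        layer = nn.TransformerEncoderLayer(D, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(D, 3 * P * P)

    def forward(self, patches, keep):
        x = self.embed(patches) + self.pos          # (B, N, D)
        latent = self.encoder(x[:, keep])           # encode visible patches only
        full = (self.mask_tok + self.pos).expand(x.shape[0], -1, -1).clone()
        full[:, keep] = latent                      # splice latents back in
        return self.head(self.decoder(full))        # reconstruct every patch

content_branch, distortion_branch = MAEBranch(), MAEBranch()

def pretrain_loss(distorted, reference):
    pd, pr = patchify(distorted), patchify(reference)
    keep = torch.randperm(N)[:KEEP]                 # random 75% masking
    hidden = torch.ones(N, dtype=torch.bool)
    hidden[keep] = False                            # loss only on masked patches
    rec_c = content_branch(pd, keep)                # targets: reference patches
    rec_d = distortion_branch(pd, keep)             # targets: distorted patches
    return ((rec_c - pr)[:, hidden].pow(2).mean()
            + (rec_d - pd)[:, hidden].pow(2).mean())

x, y = torch.rand(2, 3, 224, 224), torch.rand(2, 3, 224, 224)
print(pretrain_loss(x, y).item())                   # distorted vs. reference views
```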
Related papers
- Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization [26.224150625323812]
No-Reference Point Cloud Quality Assessment (NR-PCQA) aims to objectively assess the human perceptual quality of point clouds.
We propose DisPA, a novel disentangled representation learning framework for NR-PCQA.
We show that DisPA outperforms state-of-the-art methods on multiple PCQA datasets.
arXiv Detail & Related papers (2024-11-12T17:05:18Z)
- Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency for Blind Image Quality Assessment [17.04649536069553]
No-reference image quality assessment is a challenging domain that involves estimating image quality without the original reference.
We introduce an improved mechanism to extract local and non-local information from images via different transformer encoders and CNNs.
A self-consistency approach to self-supervision is presented, explicitly addressing the degradation of no-reference image quality assessment (NR-IQA) models under input transformations; a rough sketch of this idea follows below.
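Under the generic reading of this line of work, the self-consistency signal rewards agreement between the quality predicted for an image and for an equivariant transform of it, such as a horizontal flip. The sketch below is a hedged illustration with a toy stand-in network, not the paper's model:

```python
import torch
import torch.nn as nn

# Toy NR-IQA network; any image-to-score model could take its place.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1))

def self_consistency_loss(img):
    q = model(img)                               # score for the image
    q_flip = model(torch.flip(img, dims=[-1]))   # score for its mirror image
    return (q - q_flip).pow(2).mean()            # penalize inconsistency

print(self_consistency_loss(torch.rand(4, 3, 64, 64)).item())
```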
arXiv Detail & Related papers (2024-09-11T09:08:43Z)
- Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
Contrastive learning (CL) for Vision Transformers (ViTs) in image domains has achieved performance comparable to CL for traditional convolutional backbones.
In 3D point cloud pretraining with ViTs, masked autoencoder (MAE) modeling remains dominant.
arXiv Detail & Related papers (2024-07-08T12:28:56Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
However, CLIP's generic pretraining is not fully aligned with the IQA task; recent approaches have attempted to address this mismatch using prompt techniques, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA; the general prompt-scoring idea is sketched below.
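As a rough illustration of prompt-based quality scoring in general (not necessarily this paper's specific design), an image embedding can be compared against a pair of antonym prompts, with the softmax weight on the positive prompt read as a quality score. Toy encoders stand in for CLIP's image and text towers so the snippet runs without downloading weights:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D = 64
img_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, D))  # stand-in tower
txt_emb = {"Good photo.": torch.randn(D), "Bad photo.": torch.randn(D)}

def prompt_quality(img):
    v = F.normalize(img_enc(img), dim=-1)                        # image embedding
    t = F.normalize(torch.stack(list(txt_emb.values())), dim=-1)  # prompt embeddings
    logits = 100.0 * v @ t.T                                     # CLIP-like scale
    return F.softmax(logits, dim=-1)[:, 0]   # P("good") as the quality score

print(prompt_quality(torch.rand(2, 3, 32, 32)))
```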
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose a novel contrastive pre-training framework tailored for PCQA (CoPA).
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
- Uncertainty-aware No-Reference Point Cloud Quality Assessment [25.543217625958462]
This work presents the first probabilistic architecture for no-reference point cloud quality assessment (PCQA).
The proposed method can model the stochasticity of subjects' quality judgments through a tailored conditional variational autoencoder (CVAE); a minimal sketch of this idea appears below.
Experiments indicate that our approach outperforms previous cutting-edge methods by a large margin and shows strong generalizability in cross-dataset experiments.
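A minimal conditional-VAE sketch of that idea follows, assuming the condition is a learned quality feature and the target is a scalar opinion score; the dimensions and layers are placeholders rather than the paper's architecture:

```python
import torch
import torch.nn as nn

F_DIM, Z = 32, 4  # placeholder feature and latent dimensions

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(F_DIM + 1, 2 * Z)   # (feature, score) -> mu, logvar
        self.dec = nn.Linear(F_DIM + Z, 1)       # (feature, z) -> score

    def forward(self, feat, score):
        mu, logvar = self.enc(torch.cat([feat, score], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        recon = self.dec(torch.cat([feat, z], -1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return (recon - score).pow(2).mean() + kl

feat, mos = torch.rand(8, F_DIM), torch.rand(8, 1)   # features, opinion scores
print(CVAE()(feat, mos).item())
```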
arXiv Detail & Related papers (2024-01-17T02:25:42Z)
- Evaluating Point Cloud from Moving Camera Videos: A No-Reference Metric [58.309735075960745]
This paper explores how point cloud quality assessment (PCQA) tasks can be handled via video quality assessment (VQA) methods.
We generate the captured videos by rotating the camera around the point clouds through several circular pathways.
We extract both spatial and temporal quality-aware features from the selected key frames and the video clips using a trainable 2D-CNN and a pre-trained 3D-CNN model.
arXiv Detail & Related papers (2022-08-30T08:59:41Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve an auxiliary prediction task.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets; a generic sketch of such a contrastive objective follows below.
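For reference, a generic contrastive pairwise objective of the NT-Xent/InfoNCE family looks like this; how CONTRIQUE actually constructs positives and negatives (e.g., by distortion class) is not reproduced here:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """z1[i] and z2[i] are embeddings of two views of the same content."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / tau                 # (B, B) similarity matrix
    labels = torch.arange(z1.shape[0])       # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

print(info_nce(torch.randn(16, 128), torch.randn(16, 128)).item())
```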
arXiv Detail & Related papers (2021-10-25T21:01:00Z)
- No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency [38.88541492121366]
The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations.
We propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and self-attention mechanism in Transformers.
arXiv Detail & Related papers (2021-08-16T02:07:08Z)