PIT-QMM: A Large Multimodal Model For No-Reference Point Cloud Quality Assessment
- URL: http://arxiv.org/abs/2510.07636v1
- Date: Thu, 09 Oct 2025 00:13:34 GMT
- Title: PIT-QMM: A Large Multimodal Model For No-Reference Point Cloud Quality Assessment
- Authors: Shashank Gupta, Gregoire Phillips, Alan C. Bovik
- Abstract summary: Large Multimodal Models (LMMs) have recently enabled considerable advances in the realm of image and video quality assessment. We are interested in using these models to conduct No-Reference Point Cloud Quality Assessment (NR-PCQA). The aim is to automatically evaluate the perceptual quality of a point cloud in the absence of a reference. We construct PIT-QMM, a novel LMM for NR-PCQA that is capable of consuming text, images and point clouds end-to-end to predict quality scores.
- Score: 26.896426878221718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Multimodal Models (LMMs) have recently enabled considerable advances in the realm of image and video quality assessment, but this progress has yet to be fully explored in the domain of 3D assets. We are interested in using these models to conduct No-Reference Point Cloud Quality Assessment (NR-PCQA), where the aim is to automatically evaluate the perceptual quality of a point cloud in the absence of a reference. We begin with the observation that different modalities of data (text descriptions, 2D projections, and 3D point cloud views) provide complementary information about point cloud quality. We then construct PIT-QMM, a novel LMM for NR-PCQA that is capable of consuming text, images and point clouds end-to-end to predict quality scores. Extensive experimentation shows that our proposed method outperforms the state-of-the-art by significant margins on popular benchmarks with fewer training iterations. We also demonstrate that our framework enables distortion localization and identification, which paves a new way forward for model explainability and interactivity. Code and datasets are available at https://www.github.com/shngt/pit-qmm.
Related papers
- No-Reference Point Cloud Quality Assessment via Graph Convolutional Network [89.12589881881082]
Three-dimensional (3D) point cloud, as an emerging visual media format, is increasingly favored by consumers.
Point clouds inevitably suffer from quality degradation and information loss through multimedia communication systems.
We propose a novel no-reference PCQA method by using a graph convolutional network (GCN) to characterize the mutual dependencies of multi-view 2D projected image contents.
arXiv Detail & Related papers (2024-11-12T11:39:05Z)
- Q-Ground: Image Quality Grounding with Large Multi-modality Models [61.72022069880346]
We introduce Q-Ground, the first framework aimed at tackling fine-scale visual quality grounding.
Q-Ground combines large multi-modality models with detailed visual quality analysis.
Central to our contribution is the introduction of the QGround-100K dataset.
arXiv Detail & Related papers (2024-07-24T06:42:46Z)
- Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose a novel contrastive pre-training framework tailored for PCQA (CoPA).
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
- PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment [34.256276774430575]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference.
We propose a self-supervised pre-training framework using masked autoencoders (PAME) to help the model learn useful representations without labels.
Our method outperforms the state-of-the-art NR-PCQA methods on popular benchmarks in terms of prediction accuracy and generalizability.
arXiv Detail & Related papers (2024-03-15T07:01:33Z)
- Simple Baselines for Projection-based Full-reference and No-reference Point Cloud Quality Assessment [60.2709006613171]
We propose simple baselines for projection-based point cloud quality assessment (PCQA).
We use multi-projections obtained via a common cube-like projection process from the point clouds for both full-reference (FR) and no-reference (NR) PCQA tasks.
Taking part in the ICIP 2023 PCVQA Challenge, we succeeded in achieving the top spot in four out of the five competition tracks.
arXiv Detail & Related papers (2023-10-26T04:42:57Z)
- Reduced-Reference Quality Assessment of Point Clouds via Content-Oriented Saliency Projection [17.983188216548005]
Many dense 3D point clouds have been exploited to represent visual objects instead of traditional images or videos.
We propose a novel and efficient Reduced-Reference quality metric for point clouds.
arXiv Detail & Related papers (2023-01-18T18:00:29Z)
- MM-PCQA: Multi-Modal Learning for No-reference Point Cloud Quality Assessment [32.495387943305204]
We propose a novel no-reference point cloud quality assessment (NR-PCQA) metric in a multi-modal fashion.
Specifically, we split the point clouds into sub-models to represent local geometry distortions such as point shift and down-sampling.
To achieve the goals, the sub-models and projected images are encoded with point-based and image-based neural networks.
arXiv Detail & Related papers (2022-09-01T06:11:12Z)
- Blind Quality Assessment of 3D Dense Point Clouds with Structure Guided Resampling [71.68672977990403]
We propose an objective point cloud quality index with Structure Guided Resampling (SGR) to automatically evaluate the perceptual visual quality of 3D dense point clouds.
The proposed SGR is a general-purpose blind quality assessment method without the assistance of any reference information.
arXiv Detail & Related papers (2022-08-31T02:42:55Z)
- Evaluating Point Cloud from Moving Camera Videos: A No-Reference Metric [58.309735075960745]
This paper explores the way of dealing with point cloud quality assessment (PCQA) tasks via video quality assessment (VQA) methods.
We generate the captured videos by rotating the camera around the point clouds through several circular pathways.
We extract both spatial and temporal quality-aware features from the selected key frames and the video clips using trainable 2D-CNN and pre-trained 3D-CNN models.
arXiv Detail & Related papers (2022-08-30T08:59:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.