CLIP-PCQA: Exploring Subjective-Aligned Vision-Language Modeling for Point Cloud Quality Assessment
- URL: http://arxiv.org/abs/2501.10071v1
- Date: Fri, 17 Jan 2025 09:43:14 GMT
- Title: CLIP-PCQA: Exploring Subjective-Aligned Vision-Language Modeling for Point Cloud Quality Assessment
- Authors: Yating Liu, Yujie Zhang, Ziyu Shan, Yiling Xu
- Abstract summary: We propose a novel language-driven PCQA method named CLIP-PCQA.
Considering that human beings prefer to describe visual quality using discrete quality descriptions, we adopt a retrieval-based mapping strategy.
We show that our CLIP-PCQA outperforms other State-Of-The-Art (SOTA) approaches.
- Score: 21.9149920194746
- Abstract: In recent years, No-Reference Point Cloud Quality Assessment (NR-PCQA) research has achieved significant progress. However, existing methods mostly seek a direct mapping function from visual data to the Mean Opinion Score (MOS), which is contradictory to the mechanism of practical subjective evaluation. To address this, we propose a novel language-driven PCQA method named CLIP-PCQA. Considering that human beings prefer to describe visual quality using discrete quality descriptions (e.g., "excellent" and "poor") rather than specific scores, we adopt a retrieval-based mapping strategy to simulate the process of subjective assessment. More specifically, based on the philosophy of CLIP, we calculate the cosine similarity between the visual features and multiple textual features corresponding to different quality descriptions, a process in which an effective contrastive loss and learnable prompts are introduced to enhance the feature extraction. Meanwhile, given the personal limitations and bias in subjective experiments, we further convert the feature similarities into probabilities and consider the Opinion Score Distribution (OSD) rather than a single MOS as the final target. Experimental results show that our CLIP-PCQA outperforms other State-Of-The-Art (SOTA) approaches.
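A minimal sketch of the retrieval-based mapping described in the abstract, assuming CLIP-style encoders: visual features are compared against text features for a five-level quality scale, the similarities are softmaxed into a probability distribution, and training matches that distribution to the OSD. All names, the temperature value, and the KL-divergence loss are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of mapping visual features to a distribution over quality
# descriptions, CLIP-style. Names and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

QUALITY_WORDS = ["bad", "poor", "fair", "good", "excellent"]  # 5-level scale

def quality_probabilities(img_feat: torch.Tensor,
                          txt_feats: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """img_feat: (B, D) visual features of rendered point-cloud views.
    txt_feats: (K, D) text features, one per quality description.
    Returns:   (B, K) probabilities over the K quality levels."""
    img = F.normalize(img_feat, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    sims = img @ txt.t()                       # cosine similarities, (B, K)
    return F.softmax(sims / temperature, dim=-1)

def osd_loss(pred_probs: torch.Tensor, osd: torch.Tensor) -> torch.Tensor:
    """KL divergence between the predicted distribution and the
    ground-truth Opinion Score Distribution (OSD)."""
    return F.kl_div(pred_probs.clamp_min(1e-8).log(), osd,
                    reduction="batchmean")

# Usage with random stand-in features (D = 512, batch of 4):
img_feat = torch.randn(4, 512)
txt_feats = torch.randn(len(QUALITY_WORDS), 512)
probs = quality_probabilities(img_feat, txt_feats)
mos = (probs * torch.arange(1., 6.)).sum(dim=-1)  # expected score, 1-5 scale
```

Deriving the MOS as the expectation over the predicted distribution (last line) is one common way to report a scalar score when the training target is a distribution.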
Related papers
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address the mismatch between CLIP's generic pre-training and quality assessment tasks using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is preferred by human annotators over the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
- Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose CoPA, a novel contrastive pre-training framework tailored for PCQA.
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
- Exploring Opinion-unaware Video Quality Assessment with Semantic Affinity Criterion [52.07084862209754]
We introduce an explicit semantic affinity index for opinion-unaware VQA using text prompts in the contrastive language-image pre-training (CLIP) model.
We also aggregate it with different traditional low-level naturalness indexes through Gaussian normalization and sigmoid rescaling strategies (a sketch of this aggregation appears after the list below).
The proposed Blind Unified Opinion-Unaware Video Quality Index via Semantic and Technical Metric Aggregation (BUONA-VISTA) outperforms existing opinion-unaware VQA methods by at least 20%.
arXiv Detail & Related papers (2023-02-26T08:46:07Z)
- Reduced-Reference Quality Assessment of Point Clouds via Content-Oriented Saliency Projection [17.983188216548005]
Dense 3D point clouds are increasingly used to represent visual objects in place of traditional images or videos.
We propose a novel and efficient Reduced-Reference quality metric for point clouds.
arXiv Detail & Related papers (2023-01-18T18:00:29Z)
- Progressive Knowledge Transfer Based on Human Visual Perception Mechanism for Perceptual Quality Assessment of Point Clouds [21.50682830021656]
A progressive knowledge transfer scheme based on the human visual perception mechanism (PKT-PCQA) is proposed for the perceptual quality assessment of point clouds.
Experiments on three large and independent point cloud assessment datasets show that the proposed no-reference PKT-PCQA network achieves better or equivalent performance.
arXiv Detail & Related papers (2022-11-30T00:27:58Z)
- Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
- No-Reference Image Quality Assessment by Hallucinating Pristine Features [24.35220427707458]
We propose a no-reference (NR) image quality assessment (IQA) method via feature level pseudo-reference (PR) hallucination.
The effectiveness of our proposed method is demonstrated on four popular IQA databases.
arXiv Detail & Related papers (2021-08-09T16:48:34Z)
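As referenced in the BUONA-VISTA entry above, multiple quality indexes can be fused via Gaussian normalization followed by sigmoid rescaling. The sketch below illustrates that idea under stated assumptions; the index names, example values, and the equal-weight average are illustrative, not the paper's exact formulation.

```python
# Sketch of aggregating quality indexes via Gaussian normalization and
# sigmoid rescaling. Index names and equal-weight fusion are assumptions.
import numpy as np

def gaussian_normalize(scores: np.ndarray) -> np.ndarray:
    """Z-score normalization of one index across a set of videos."""
    return (scores - scores.mean()) / (scores.std() + 1e-8)

def sigmoid_rescale(z: np.ndarray) -> np.ndarray:
    """Map normalized scores into (0, 1) so indexes are comparable."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical per-video raw indexes: one CLIP-based semantic affinity
# score and two low-level naturalness scores, for 6 videos.
semantic_affinity = np.array([0.31, 0.55, 0.12, 0.78, 0.44, 0.60])
naturalness_a = np.array([2.1, 3.4, 1.0, 4.2, 2.8, 3.9])
naturalness_b = np.array([0.9, 1.6, 0.4, 2.0, 1.2, 1.8])

indexes = [semantic_affinity, naturalness_a, naturalness_b]
rescaled = [sigmoid_rescale(gaussian_normalize(s)) for s in indexes]
unified_quality = np.mean(rescaled, axis=0)  # equal-weight fusion (assumed)
```

Normalizing each index before fusion keeps any single index with a large raw-score range from dominating the aggregate.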
This list is automatically generated from the titles and abstracts of the papers on this site.