Fugu-MT Paper Translation (Abstract): PIT-QMM: A Large Multimodal Model For No-Reference Point Cloud Quality Assessment

Paper summary: PIT-QMM: A Large Multimodal Model For No-Reference Point Cloud Quality Assessment

arxiv url: http://arxiv.org/abs/2510.07636v1
Date: Thu, 09 Oct 2025 00:13:34 GMT
Status: Translation complete
System update date: 2025-10-10 17:54:14.787574
Title: PIT-QMM: A Large Multimodal Model For No-Reference Point Cloud Quality Assessment
Title (reference translation): PIT-QMM: A Large Multimodal Model for No-Reference Point Cloud Quality Assessment
Authors: Shashank Gupta, Gregoire Phillips, Alan C. Bovik
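Abstract summary: Large Multimodal Models (LMMs) have recently enabled considerable advances in image and video quality assessment. We are interested in using these models to conduct No-Reference Point Cloud Quality Assessment (NR-PCQA), where the aim is to automatically evaluate the perceptual quality of a point cloud without a reference. We construct PIT-QMM, a novel LMM for NR-PCQA that consumes text, images, and point clouds end-to-end to predict quality scores.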
Reference score (independently computed attention score): 26.896426878221718
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Multimodal Models (LMMs) have recently enabled considerable advances in the realm of image and video quality assessment, but this progress has yet to be fully explored in the domain of 3D assets. We are interested in using these models to conduct No-Reference Point Cloud Quality Assessment (NR-PCQA), where the aim is to automatically evaluate the perceptual quality of a point cloud in the absence of a reference. We begin with the observation that different modalities of data (text descriptions, 2D projections, and 3D point cloud views) provide complementary information about point cloud quality. We then construct PIT-QMM, a novel LMM for NR-PCQA that is capable of consuming text, images, and point clouds end-to-end to predict quality scores. Extensive experimentation shows that our proposed method outperforms the state of the art by significant margins on popular benchmarks with fewer training iterations. We also demonstrate that our framework enables distortion localization and identification, which paves a new way forward for model explainability and interactivity. Code and datasets are available at https://www.github.com/shngt/pit-qmm.
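Abstract (reference translation): Large Multimodal Models (LMMs) have recently made considerable progress in the realm of image and video quality assessment. We are interested in using these models to conduct No-Reference Point Cloud Quality Assessment (NR-PCQA), which automatically evaluates the perceptual quality of a point cloud without a reference. We begin with the observation that different kinds of data (text descriptions, 2D projections, and 3D point cloud views) provide complementary information about point cloud quality. PIT-QMM is a novel LMM for NR-PCQA that can consume text, images, and point clouds end-to-end to predict quality scores. Extensive experiments show that the proposed method outperforms state-of-the-art methods on popular benchmarks with fewer training iterations. We also demonstrate that the framework enables distortion localization and identification, opening a new path for model explainability and interactivity. Code and datasets are available at https://www.github.com/shngt/pit-qmm.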
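
The abstract names 2D projections of a point cloud as one of the input modalities but does not say how they are produced. As a rough illustration only, the sketch below renders a colored point cloud into a small image by orthographic projection along the z axis, keeping the point nearest the viewer in each pixel. The function name `project_orthographic`, the image size, and the z-buffer rule are all assumptions made for this example; none of it is taken from the PIT-QMM code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def project_orthographic(
    points: Sequence[Point],
    colors: Sequence[Color],
    size: int = 4,
    background: Color = (0, 0, 0),
) -> List[List[Color]]:
    """Hypothetical helper: project a colored point cloud onto the XY plane.

    Each pixel keeps the color of the point with the largest z value that
    falls into it (a simple z-buffer, viewer assumed to look down the z axis).
    """
    image = [[background for _ in range(size)] for _ in range(size)]
    if not points:
        return image

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Use one scale for both axes so the aspect ratio is preserved.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    depth = [[float("-inf")] * size for _ in range(size)]
    for (x, y, z), color in zip(points, colors):
        # Map coordinates to pixel indices, clamping the upper edge.
        col = min(int((x - min_x) / span * size), size - 1)
        row = min(int((y - min_y) / span * size), size - 1)
        if z > depth[row][col]:
            depth[row][col] = z
            image[row][col] = color
    return image


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    rgb = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
    for row in project_orthographic(cloud, rgb, size=2):
        print(row)
```

In practice a pipeline of this kind would likely render several viewpoints with a proper renderer; the single fixed view here is chosen only to keep the example short.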

