Fugu-MT Paper Translation (Abstract): SIQA: Toward Reliable Scientific Image Quality Assessment

Paper overview: SIQA: Toward Reliable Scientific Image Quality Assessment

arxiv url: http://arxiv.org/abs/2603.06700v1
Date: Thu, 05 Mar 2026 06:57:26 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:12.892236
Title: SIQA: Toward Reliable Scientific Image Quality Assessment
Title (reference translation): SIQA: Toward Reliable Scientific Image Quality Assessment
Authors: Wenzhe Li, Liang Chen, Junying Wang, Yijing Guo, Ye Shen, Farong Wen, Chunyi Li, Zicheng Zhang, Guangtao Zhai
Abstract summary: We introduce SIQA (Scientific Image Quality Assessment), a framework that models scientific image quality along two complementary dimensions. We design two evaluation protocols, SIQA-U (Understanding) and SIQA-S (Scoring). Experiments on representative multimodal large language models (MLLMs) reveal a consistent discrepancy between scoring alignment and scientific understanding.
Reference score (independently computed attention score): 72.41803245808924
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Scientific images fundamentally differ from natural and AI-generated images in that they encode structured domain knowledge rather than merely depict visual scenes. Assessing their quality therefore requires evaluating not only perceptual fidelity but also scientific correctness and logical completeness. However, existing image quality assessment (IQA) paradigms primarily focus on perceptual distortions or image-text alignment, implicitly assuming that depicted content is factually valid. This assumption breaks down in scientific contexts, where visually plausible figures may still contain conceptual errors or incomplete reasoning. To address this gap, we introduce Scientific Image Quality Assessment (SIQA), a framework that models scientific image quality along two complementary dimensions: Knowledge (Scientific Validity and Scientific Completeness) and Perception (Cognitive Clarity and Disciplinary Conformity). To operationalize this formulation, we design two evaluation protocols: SIQA-U (Understanding), which measures semantic comprehension of scientific content through multiple-choice tasks, and SIQA-S (Scoring), which evaluates alignment with expert quality judgments. We further construct the SIQA Challenge, consisting of an expert-annotated benchmark and a large-scale training set. Experiments across representative multimodal large language models (MLLMs) reveal a consistent discrepancy between scoring alignment and scientific understanding. While models can achieve strong agreement with expert ratings under SIQA-S, their performance on SIQA-U remains substantially lower. Fine-tuning improves both metrics, yet gains in scoring consistently outpace improvements in understanding. These results suggest that rating consistency alone may not reliably reflect scientific comprehension, underscoring the necessity of multidimensional evaluation for scientific image quality assessment.
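Abstract (reference translation): Scientific images differ fundamentally from natural and AI-generated images in that they encode structured domain knowledge rather than merely depicting visual scenes. Assessing their quality therefore requires evaluating scientific correctness and logical completeness in addition to perceptual fidelity. Existing image quality assessment (IQA) paradigms, however, focus mainly on perceptual distortions or image-text alignment and implicitly assume that the depicted content is factually valid. That assumption breaks down in scientific contexts, where a visually plausible figure can still contain conceptual errors or incomplete reasoning. To address this gap, we introduce Scientific Image Quality Assessment (SIQA), a framework that models scientific image quality along two complementary dimensions: Knowledge (Scientific Validity and Scientific Completeness) and Perception (Cognitive Clarity and Disciplinary Conformity). To operationalize this formulation, we design two evaluation protocols: SIQA-U (Understanding), which measures semantic comprehension of scientific content through multiple-choice tasks, and SIQA-S (Scoring), which evaluates alignment with expert quality judgments. We further construct the SIQA Challenge, which consists of an expert-annotated benchmark and a large-scale training set. Experiments on representative multimodal large language models (MLLMs) reveal a consistent discrepancy between scoring alignment and scientific understanding. Models can reach strong agreement with expert ratings under SIQA-S, yet their performance on SIQA-U remains substantially lower. Fine-tuning improves both metrics, but gains in scoring consistently outpace improvements in understanding. These results suggest that rating consistency alone may not reliably reflect scientific comprehension, indicating the need for multidimensional evaluation in scientific image quality assessment.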
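
The two protocols named in the abstract can be illustrated with a minimal sketch. The abstract does not state which metrics the paper uses, so everything here is an assumption: SIQA-U is scored as plain multiple-choice accuracy, and SIQA-S as the Spearman rank correlation between model scores and expert ratings, a measure commonly used in image quality assessment. The function names and the sample data are hypothetical and do not come from the paper.

```python
from math import sqrt


def accuracy(predicted, answers):
    """Fraction of multiple-choice questions answered correctly (assumed SIQA-U metric)."""
    if len(predicted) != len(answers) or not answers:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, answers)) / len(answers)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(model_scores, expert_scores):
    """Spearman rank correlation (assumed SIQA-S metric): Pearson on the ranks."""
    if len(model_scores) != len(expert_scores) or len(model_scores) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(model_scores), average_ranks(expert_scores))


# Hypothetical data for illustration only.
understanding = accuracy(["A", "B", "C", "D"], ["A", "B", "D", "D"])
scoring = spearman([3.1, 4.0, 2.2, 4.8], [3.0, 4.5, 2.0, 4.9])
print(f"SIQA-U accuracy: {understanding:.2f}, SIQA-S rank correlation: {scoring:.2f}")
```

Reporting the two numbers side by side mirrors the abstract's point: a model can rank images much as experts do while still answering many comprehension questions incorrectly, so neither number alone characterizes it.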

Related papers