Fugu-MT Paper Translation (Abstract): AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment

Paper overview: AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment

arxiv url: http://arxiv.org/abs/2509.26006v2
Date: Wed, 01 Oct 2025 04:01:40 GMT
Status: Translation complete
Last updated in system: 2025-10-02 14:33:21.837827
Title: AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment
Title (reference translation): AgenticIQA: An Agent Framework for Adaptive and Interpretable Image Quality Assessment
Authors: Hanwei Zhu, Yu Tian, Keyan Ding, Baoliang Chen, Bolin Chen, Shiqi Wang, Weisi Lin
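Abstract summary: Image quality assessment (IQA) reflects both the quantification and the interpretation of perceptual quality rooted in the human visual system. AgenticIQA decomposes IQA into four subtasks: distortion detection, distortion analysis, tool selection, and tool execution. The paper introduces AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents.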
Reference score (independently computed attention score): 69.06977852423564
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Image quality assessment (IQA) is inherently complex, as it reflects both the quantification and interpretation of perceptual quality rooted in the human visual system. Conventional approaches typically rely on fixed models to output scalar scores, limiting their adaptability to diverse distortions, user-specific queries, and interpretability needs. Furthermore, scoring and interpretation are often treated as independent processes, despite their interdependence: interpretation identifies perceptual degradations, while scoring abstracts them into a compact metric. To address these limitations, we propose AgenticIQA, a modular agentic framework that integrates vision-language models (VLMs) with traditional IQA tools in a dynamic, query-aware manner. AgenticIQA decomposes IQA into four subtasks -- distortion detection, distortion analysis, tool selection, and tool execution -- coordinated by a planner, executor, and summarizer. The planner formulates task-specific strategies, the executor collects perceptual evidence via tool invocation, and the summarizer integrates this evidence to produce accurate scores with human-aligned explanations. To support training and evaluation, we introduce AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents. Extensive experiments across diverse IQA datasets demonstrate that AgenticIQA consistently surpasses strong baselines in both scoring accuracy and explanatory alignment.
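Abstract (reference translation): Image quality assessment (IQA) is inherently complex because it reflects both the quantification and the interpretation of perceptual quality rooted in the human visual system. Conventional approaches rely on fixed models that output scalar scores, which limits their adaptability to diverse distortions, user-specific queries, and interpretability needs. Furthermore, scoring and interpretation are often treated as independent processes despite their interdependence: interpretation identifies perceptual degradations, while scoring abstracts them into a compact metric. This paper proposes AgenticIQA, a modular agentic framework that integrates vision-language models (VLMs) with traditional IQA tools. AgenticIQA decomposes IQA into four subtasks (distortion detection, distortion analysis, tool selection, and tool execution). The planner formulates task-specific strategies, the executor collects perceptual evidence through tool invocation, and the summarizer integrates this evidence to produce accurate scores with human-aligned explanations. The paper introduces AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents. Extensive experiments across diverse IQA datasets show that AgenticIQA consistently surpasses strong baselines in both scoring accuracy and explanatory alignment.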
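
Illustrative sketch (not from the paper): the planner, executor, and summarizer roles named in the abstract can be pictured as a three-stage pipeline. Everything below is an assumption made for illustration only: the function names, the `Plan` structure, the keyword-based planner (the paper uses a VLM), the two toy tools, and the way the four subtasks are divided between the roles. It is a minimal stand-in for the control flow, not the authors' implementation.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Callable, Dict, List, Tuple

# Toy stand-in for an image: rows of grayscale pixels in [0, 1].
Image = List[List[float]]


def contrast_tool(img: Image) -> float:
    """Hypothetical tool: population std. dev. of pixels, scaled and capped at 1."""
    pixels = [p for row in img for p in row]
    return min(1.0, 2.0 * pstdev(pixels))


def sharpness_tool(img: Image) -> float:
    """Hypothetical tool: mean absolute difference between horizontal neighbours."""
    diffs = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    return min(1.0, sum(diffs) / len(diffs))


# Hypothetical tool registry; a real system would wrap established IQA metrics.
TOOLS: Dict[str, Callable[[Image], float]] = {
    "contrast": contrast_tool,
    "sharpness": sharpness_tool,
}
TOOL_FOR_DISTORTION = {"blur": "sharpness", "low contrast": "contrast"}


@dataclass
class Plan:
    distortions: List[str]  # distortions the query asks about
    tools: List[str]        # tools selected to measure them


def planner(query: str) -> Plan:
    """Pick distortions and tools from the query (keyword matching, assumed)."""
    q = query.lower()
    distortions = []
    if "blur" in q or "sharp" in q:
        distortions.append("blur")
    if "contrast" in q or "washed" in q:
        distortions.append("low contrast")
    if not distortions:  # generic query: check everything we know about
        distortions = ["blur", "low contrast"]
    return Plan(distortions, [TOOL_FOR_DISTORTION[d] for d in distortions])


def executor(plan: Plan, img: Image) -> Dict[str, float]:
    """Run each selected tool and collect its output as evidence."""
    return {name: TOOLS[name](img) for name in plan.tools}


def summarizer(plan: Plan, evidence: Dict[str, float]) -> Tuple[float, str]:
    """Fuse the evidence into one score plus a short textual explanation."""
    score = sum(evidence.values()) / len(evidence)
    details = ", ".join(f"{name}={value:.2f}" for name, value in evidence.items())
    checked = ", ".join(plan.distortions)
    return score, f"Checked for {checked}; tool evidence: {details}."


def assess(query: str, img: Image) -> Tuple[float, str]:
    """Planner -> executor -> summarizer, in that order."""
    plan = planner(query)
    evidence = executor(plan, img)
    return summarizer(plan, evidence)
```

Calling `assess("Is this image blurry?", img)` on a small nested list of floats returns a score and an explanation string. The point of the sketch is only the query-dependent tool selection: a different query leads to a different set of tools being run on the same image.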

Related papers