Fugu-MT Paper Translation (Abstract): ReasonEdit: Towards Interpretable Image Editing Evaluation via Reinforcement Learning

Paper overview: ReasonEdit: Towards Interpretable Image Editing Evaluation via Reinforcement Learning

arxiv url: http://arxiv.org/abs/2605.07477v1
Date: Fri, 08 May 2026 09:23:26 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:38.949927
Title: ReasonEdit: Towards Interpretable Image Editing Evaluation via Reinforcement Learning
Title (reference translation): ReasonEdit: Aiming for Interpretable Image Editing Evaluation via Reinforcement Learning
Authors: Honghua Chen, Zitong Xu, Huiyu Duan, Xinyun Zhang, Xiongkuo Min, Guangtao Zhai
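Abstract summary: This paper introduces ReasonEdit, an evaluation tool for text-guided image editing. It is trained using reward signals derived from RE-Reward and the Group Relative Policy Optimization (GRPO) algorithm. It can generate high-quality interpretable evaluation text, making image editing assessment more transparent and trustworthy.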
Reference score (independently computed attention score): 86.61218827780675
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent text-guided image editing (TIE) models have achieved remarkable progress; however, many edited results still suffer from artifacts, unintended modifications, and suboptimal aesthetics. Although several benchmarks and evaluation methods have been proposed, most existing approaches rely on scalar scores and lack interpretability. This limitation largely stems from the absence of high-quality interpretation datasets for TIE and effective reward models to train interpretable evaluators. To address these challenges, we introduce ReasonEdit-22K, the first dataset that combines 22K edited images with 113K Chain-of-Thought (CoT) samples, along with 1.3M human judgments assessing these interpretations in terms of logicality, accuracy, and usefulness. Building upon this dataset, we propose RE-Reward, a multimodal large language model (MLLM)-based reward model designed to provide human-aligned feedback for evaluating interpretable reasoning in image editing. Furthermore, we develop ReasonEdit, which is trained using reward signals derived from RE-Reward and the Group Relative Policy Optimization (GRPO) algorithm to learn an interpretable evaluation model. Extensive experiments demonstrate that ReasonEdit achieves superior alignment with human preferences and exhibits strong generalization across public benchmarks. In addition, it is capable of generating high-quality interpretable evaluation text, enabling more transparent and trustworthy assessment for image editing. The code is available at https://github.com/IntMeGroup/ReasonEdit.
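Abstract (reference translation): Recent text-guided image editing (TIE) models have made remarkable progress, but many edited results still suffer from artifacts, unintended modifications, and suboptimal aesthetics. Several benchmarks and evaluation methods have been proposed, but most existing approaches rely on scalar scores and lack interpretability. This limitation stems from the lack of high-quality interpretation datasets for TIE and of effective reward models for training interpretable evaluators. To address these challenges, we introduce ReasonEdit-22K, the first dataset to combine 22K edited images with 113K Chain-of-Thought (CoT) samples. Building on this dataset, we propose RE-Reward, a reward model based on a multimodal large language model (MLLM), to provide human-aligned feedback for evaluating interpretable reasoning in image editing. Furthermore, we develop ReasonEdit, which learns an interpretable evaluation model using reward signals derived from RE-Reward and the Group Relative Policy Optimization (GRPO) algorithm. Extensive experiments show that ReasonEdit achieves superior alignment with human preferences and generalizes strongly across public benchmarks. In addition, it can generate high-quality interpretable evaluation text, making image editing assessment more transparent and trustworthy. The code is available at https://github.com/IntMeGroup/ReasonEdit.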
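
The abstract names Group Relative Policy Optimization (GRPO) as the training algorithm. As GRPO is generally described, several responses are sampled for the same input and each response's reward is normalized against the mean and standard deviation of its own group. The following is a minimal sketch of that normalization step only, not the authors' implementation: the function name, the eps constant, and the example reward values are hypothetical, and the rewards merely stand in for scores that a reward model such as RE-Reward might assign.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group (GRPO-style).

    rewards: scalar scores for responses sampled for the same input.
    eps: hypothetical small constant that avoids division by zero
         when every reward in the group is identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical reward scores for four evaluation texts
# sampled for a single edited image.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Responses scoring above the group mean receive a positive advantage and those below it a negative one, so the policy update favors the better evaluation texts within each group without needing a separate value model.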

Related papers