Fugu-MT Paper Translation (Abstract): FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing

Paper overview: FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing

arxiv url: http://arxiv.org/abs/2603.29697v1
Date: Tue, 31 Mar 2026 12:52:17 GMT
Status: Translation complete
Last updated in system: 2026-04-01 15:25:03.668357
Title: FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing
Authors: Fengjian Xue, Xuecheng Wu, Heli Sun, Yunyun Shi, Shi Chen, Liangyu Fu, Jinheng Xie, Dingkang Yang, Hao Wang, Junxiao Xue, Liang He
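Abstract summary: FED-Bench is a comprehensive benchmark featuring rigorous testing and an accurate evaluation suite. The authors benchmark 18 image editing models, revealing that current approaches struggle to simultaneously achieve high fidelity and accurate expression manipulation. The benchmark and related code will be made publicly available soon.
Reference score (independently computed attention score): 29.7144418122336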
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Facial expression image editing requires fine-grained control to strictly preserve human identity and background while precisely manipulating expression. However, existing editing benchmarks primarily focus on general scenarios, lacking high-quality facial images and corresponding editing instructions. Furthermore, current evaluation metrics exhibit systemic biases in this task, often favoring lazy editing or overfit editing. To bridge these gaps, we propose FED-Bench, a comprehensive benchmark featuring rigorous testing and an accurate evaluation suite. First, we carefully construct a benchmark of 747 triplets through a cascaded and scalable pipeline, each comprising an original image, an editing instruction, and a ground-truth image for precise evaluation. Second, we introduce FED-Score, a cross-granularity evaluation protocol that disentangles assessment into three dimensions: Alignment for verifying instruction following, Fidelity for testing image quality and identity preservation, and Relative Expression Gain for quantifying the magnitude of expression changes, effectively mitigating the aforementioned evaluation biases. Third, we benchmark 18 image editing models, revealing that current approaches struggle to simultaneously achieve high fidelity and accurate expression manipulation, with fine-grained instruction following identified as the primary bottleneck. Finally, leveraging the scalability of the introduced benchmark engine, we provide a 20k+ in-the-wild facial training set and demonstrate its effectiveness by fine-tuning a baseline model that achieves significant performance gains. Our benchmark and related code will be made publicly available soon.

Related papers