Fugu-MT Paper Translation (Abstract): Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Paper overview: Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

arxiv url: http://arxiv.org/abs/2509.09172v1
Date: Thu, 11 Sep 2025 06:15:52 GMT
Status: Translation complete
Last updated in system: 2025-09-12 16:52:24.245965
Title: Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios
Authors: Chunxiao Li, Xiaoxiao Wang, Meiling Li, Boming Miao, Peng Sun, Yunjian Zhang, Xiangyang Ji, Yao Zhu
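Abstract summary: This paper introduces the Real-World Robustness Dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions. RRDataset contains high-quality images from seven major scenarios. The authors benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study.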
Reference score (attention score computed independently by the site): 54.07895223545793
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rapid advancement of generative models, highly realistic image synthesis has posed new challenges to digital security and media credibility. Although AI-generated image detection methods have partially addressed these concerns, a substantial research gap remains in evaluating their performance under complex real-world conditions. This paper introduces the Real-World Robustness Dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions: 1) Scenario Generalization: RRDataset encompasses high-quality images from seven major scenarios (War and Conflict, Disasters and Accidents, Political and Social Events, Medical and Public Health, Culture and Religion, Labor and Production, and everyday life), addressing existing dataset gaps from a content perspective. 2) Internet Transmission Robustness: examining detector performance on images that have undergone multiple rounds of sharing across various social media platforms. 3) Re-digitization Robustness: assessing model effectiveness on images altered through four distinct re-digitization methods. We benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study involving 192 participants to investigate human few-shot learning capabilities in detecting AI-generated images. The benchmarking results reveal the limitations of current AI detection methods under real-world conditions and underscore the importance of drawing on human adaptability to develop more robust detection algorithms.
