Fugu-MT paper translation (abstract): SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation

Paper summary: SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation

arxiv url: http://arxiv.org/abs/2606.01481v1
Date: Sun, 31 May 2026 22:46:35 GMT
Status: translation complete
Last updated in system: 2026-06-02 21:34:29.728693
Title: SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation
Authors: Yingzi Ma, Xiaogeng Liu, Yawen Zheng, Chaowei Xiao
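Abstract summary: This paper introduces SafeGen-Bench, a benchmark for evaluating the safety of conditional T2V models. The benchmark defines 10 malicious categories, focusing on risks related to both temporal sequences and depicted behaviors. An evaluation of a variety of conditional T2V models on SafeGen-Bench suggests that current models struggle to consistently avoid generating malicious content.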
Reference score (attention score computed independently by the site): 46.016571758908015
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rapid advancements in text-to-image diffusion models, generative video models (T2V models) like Sora can now produce short synthetic videos from a text prompt or an initial image. However, synthetic video generation -- especially when guided by an initial image -- often poses risks, including the potential creation of illegal, politically sensitive, or unethical content. Existing benchmarks have started to consider the safety of generated videos, but they primarily focus on testing models with malicious text prompts, ignoring the scenario where text prompt and image combination may still lead to harmful video content. In practice, this is a common and challenging issue: videos generated from safe text and image inputs can nonetheless convey harmful information. To bridge this gap, we introduce SafeGen-Bench, a benchmark specifically designed to evaluate the safety of conditional T2V models. Our benchmark defines 10 malicious categories, concentrating on risks related to both temporal sequences and depicted behaviors. SafeGen-Bench consists of carefully selected start frames from diverse image and video sources, paired with corresponding text prompts to simulate realistic inputs. We evaluate a variety of conditional T2V models on SafeGen-Bench, and the results indicate that current models struggle to consistently avoid generating malicious content with unsafety scores reaching up to 44.5, especially under conditions requiring high quality. Furthermore, we assess the effectiveness of both text-based and image-based guardrails on our benchmark, finding that unimodal guardrails alone were insufficient to provide a robust defense, with an 80% failure rate across seven malicious categories. We hope that SafeGen-Bench will foster the development of safer and more controllable conditional T2V models.
