Fugu-MT Paper Translation (Summary): Yuvion VL: A Multimodal Foundation Model for Adversarial Content and AI Safety

Paper summary: Yuvion VL: A Multimodal Foundation Model for Adversarial Content and AI Safety

arxiv url: http://arxiv.org/abs/2606.25034v1
Date: Tue, 23 Jun 2026 18:00:08 GMT
Status: Translation complete
Last updated in system: 2026-06-25 17:05:30.107823
Title: Yuvion VL: A Multimodal Foundation Model for Adversarial Content and AI Safety
Authors: Shikai Qiu, Xiaowen Xu, Benlei Cui, Ting Ma, Xiufeng Huang, Wenjing Jiang, Shaoxuan He, Haolei Xu, Chunyang Chai, Yujian Li, Yiliang Zhang, Guanghui Wang, Ziheng Wang, Ziwen Xu, Zhaoyu Fan, Jinhao Chen, Ruijie Jian, Hongxing Li, Chuxi Xiao, Xinyue Chen, Wenxuan Liu, Libin Dong, Yupeng Cao, Xiaoqian Xia, Jing Wang, Zhe Jiang, Zhenan Ye, Guang Yang, Bin Liu, Wei Peng, Ziqiang Zhu, Meihui Lian, Kaiwen Lv Kacuila, Haidong Ding, Dongjie Zhang, Yangfan Zhou, Bingyu Zhu, Yan Wang, Hai Zhao, Xuan Jin, Wei Zhao, Pengfei Sun, Huiming Zhang, Wei Wang, Xipeng Cao, Bin Li, Chengwen Yao, Meng Huang, Xianfeng Li, Bin Tang, Chao Liu, Hui Xue, Longtao Huang, Haiwen Hong
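Abstract summary: General-purpose models often struggle to reliably identify and understand real-world multimodal risks. We present Yuvion VL, a family of multimodal large language models purpose-built for content and AI safety.
Reference score (independently computed attention score): 73.67475847784357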
License: http://creativecommons.org/licenses/by/4.0/
Abstract: General-purpose models often struggle to reliably identify and understand real-world multimodal risks, largely due to the inherent multimodal adversarial nature of content and AI safety. We present Yuvion VL, a family of multimodal large language models purpose-built for content and AI safety, with both instruction-tuned and reasoning-oriented variants. Yuvion VL addresses this gap by treating safety as an inherently adversarial and multimodal problem and designing the entire pipeline around adversarial robustness. For data construction, we develop an automated pipeline integrating adversarial-aware data synthesis with multi-stage quality control, producing large-scale, high-quality multimodal samples augmented with domain knowledge and reasoning annotations. For training, we adopt a three-stage pipeline that includes continued pretraining for risk-concept cross-modal alignment, instruct post-training for production-grade safety tasks, and reasoning post-training for enhanced interpretability and performance in complex tasks. We further introduce Confuse-then-Contrast Fine-Tuning, a contrastive framework that mines model-specific confusions and constructs multi-image contrastive groups to enforce explicit discrimination of fine-grained visual-semantic elements, enabling the model to distinguish between visually similar cases with different safety implications in adversarial safety tasks. To support rigorous evaluation, we further introduce Yuvion VL RiskEval (YVRE), a collection of benchmarks covering diverse open and internal evaluations, with a focus on content and AI safety, adversarial robustness, and real-world capability requirements. Experiments show that Yuvion VL-32B achieves industry-leading safety performance, surpassing comparably sized open-source models and best closed-source commercial models, while maintaining comparable general capabilities.

Related papers