Fugu-MT Paper Translation (Abstract): Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning

Paper overview: Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning

arxiv url: http://arxiv.org/abs/2604.03114v1
Date: Fri, 03 Apr 2026 15:36:00 GMT
Status: Translation complete
Last updated in system: 2026-04-06 17:20:24.512937
Title: Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning
Authors: Zhangyun Tan, Zeliang Zhang, Susan Liang, Yolo Yunlong Tang, Lisha Chen, Chenliang Xu
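Abstract summary: Training-based unlearning methods share a structural flaw: fine-tuning on a narrow forget set degrades general capabilities before unlearning begins. VLM-UnBench is the first benchmark for training-free visual concept unlearning in VLMs.
Reference score (independently computed attention score): 40.223816530250055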
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: VLMs trained on web-scale data retain sensitive and copyrighted visual concepts that deployment may require removing. Training-based unlearning methods share a structural flaw: fine-tuning on a narrow forget set degrades general capabilities before unlearning begins, making it impossible to attribute subsequent performance drops to the unlearning procedure itself. Training-free approaches sidestep this by suppressing concepts through prompts or system instructions, but no rigorous benchmark exists for evaluating them on visual tasks. We introduce VLM-UnBench, the first benchmark for training-free visual concept unlearning in VLMs. It covers four forgetting levels, 7 source datasets, and 11 concept axes, and pairs a three-level probe taxonomy with five evaluation conditions to separate genuine forgetting from instruction compliance. Across 8 evaluation settings and 13 VLM configurations, realistic unlearning prompts leave forget accuracy near the no-instruction baseline; meaningful reductions appear only under oracle conditions that disclose the target concept to the model. Object and scene concepts are the most resistant to suppression, and stronger instruction-tuned models remain capable despite explicit forget instructions. These results expose a clear gap between prompt-level suppression and true visual concept erasure.
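The comparison the abstract describes, forget accuracy under each evaluation condition measured against the no-instruction baseline, can be illustrated with a minimal sketch. This is not the benchmark's code: the record layout, the function names, and the condition labels (`no_instruction`, `realistic_prompt`, `oracle`) are assumptions chosen for illustration, and the toy records are made up rather than taken from the paper.

```python
from collections import defaultdict


def forget_accuracy(records):
    """Return {condition: accuracy} over forget-set probes.

    Each record is a (condition, is_correct) pair. Both the record layout
    and the condition names are assumptions made for this sketch.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for condition, is_correct in records:
        totals[condition] += 1
        hits[condition] += int(is_correct)
    return {c: hits[c] / totals[c] for c in totals}


def drop_vs_baseline(acc, baseline="no_instruction"):
    """Accuracy reduction of each condition relative to the baseline condition."""
    base = acc[baseline]
    return {c: base - a for c, a in acc.items() if c != baseline}


# Toy, made-up records purely to exercise the functions (not results from the paper).
toy = [
    ("no_instruction", True), ("no_instruction", True),
    ("no_instruction", True), ("no_instruction", False),
    ("realistic_prompt", True), ("realistic_prompt", True),
    ("realistic_prompt", True), ("realistic_prompt", False),
    ("oracle", True), ("oracle", False),
    ("oracle", False), ("oracle", False),
]

acc = forget_accuracy(toy)
drops = drop_vs_baseline(acc)
print(acc)
print(drops)
```

A drop close to zero means the model still answers forget-set probes about as well as it does with no instruction at all, which is the pattern the abstract reports for realistic unlearning prompts. A large drop that appears only in the oracle condition reflects compliance with a disclosed target rather than erasure of the concept.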
