Fugu-MT Paper Translation (Summary): Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating

Paper summary: Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating

arxiv url: http://arxiv.org/abs/2603.12598v1
Date: Fri, 13 Mar 2026 03:03:20 GMT
Status: Translation complete
Last updated in system: 2026-03-16 17:38:11.865309
Title: Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
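Title (reference translation): Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
Authors: Xiangkui Cao, Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen
Abstract summary: Neural Gate is a novel method for mitigating privacy risks through neuron-level model editing. The method improves a model's privacy safeguards by increasing its rate of refusal for privacy-related questions.
Reference score (independently computed attention score): 71.55435880263238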
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Vision-Language Models (LVLMs) have shown remarkable potential across a wide array of vision-language tasks, leading to their adoption in critical domains such as finance and healthcare. However, their growing deployment also introduces significant security and privacy risks. Malicious actors could potentially exploit these models to extract sensitive information, highlighting a critical vulnerability. Recent studies show that LVLMs often fail to consistently refuse instructions designed to compromise user privacy. While existing work on privacy protection has made meaningful progress in preventing the leakage of sensitive data, they are constrained by limitations in both generalization and non-destructiveness. They often struggle to robustly handle unseen privacy-related queries and may inadvertently degrade a model's performance on standard tasks. To address these challenges, we introduce Neural Gate, a novel method for mitigating privacy risks through neuron-level model editing. Our method improves a model's privacy safeguards by increasing its rate of refusal for privacy-related questions, crucially extending this protective behavior to novel sensitive queries not encountered during the editing process. Neural Gate operates by learning a feature vector to identify neurons associated with privacy-related concepts within the model's representation of a subject. This localization then precisely guides the update of model parameters. Through comprehensive experiments on MiniGPT and LLaVA, we demonstrate that our method significantly boosts the model's privacy protection while preserving its original utility.
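Abstract (reference translation): Large Vision-Language Models (LVLMs) have shown great potential across a variety of vision-language tasks and have been adopted in critical domains such as finance and healthcare. However, their growing deployment also brings serious security and privacy risks. Malicious actors could exploit these models to extract sensitive information, which exposes a critical vulnerability. Recent studies show that LVLMs often fail to consistently refuse instructions designed to compromise user privacy. Prior work on privacy protection has made meaningful progress in preventing the leakage of sensitive data, but it is constrained by limitations in both generalization and non-destructiveness. Such methods often struggle to robustly handle unseen privacy-related queries and may inadvertently degrade the model's performance on standard tasks. To address these challenges, the authors introduce Neural Gate, a novel method for mitigating privacy risks through neuron-level model editing. The method improves a model's privacy safeguards by increasing its rate of refusal for privacy-related questions and, crucially, extends this protective behavior to novel sensitive queries not encountered during the editing process. Neural Gate learns a feature vector to identify neurons associated with privacy-related concepts within the model's representation of a subject. This localization then precisely guides the update of model parameters. Through comprehensive experiments on MiniGPT and LLaVA, the authors demonstrate that the method significantly boosts the model's privacy protection while preserving its original utility.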
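
The abstract describes the mechanism only at a high level: a learned feature vector localizes privacy-related neurons, and that localization guides the parameter update. The following is a minimal, generic sketch of gradient gating under stated assumptions, not the authors' implementation. All function names, the dot-product scoring rule, the top-k selection, and the toy numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration of neuron-level gradient gating.
# Nothing here is taken from the paper: the scoring rule, the top-k
# selection, and every name below are assumptions.

def neuron_scores(weights, feature):
    """Score each neuron (one row of `weights`) by |dot(row, feature)|."""
    return [abs(sum(w * f for w, f in zip(row, feature))) for row in weights]

def top_k_gate(scores, k):
    """Return a 0/1 gate that keeps the k highest-scoring neurons."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(scores))]

def gated_update(weights, grads, gate, lr=0.1):
    """Apply a gradient step only to rows whose gate value is 1."""
    return [
        [w - lr * g * m for w, g in zip(w_row, g_row)]
        for w_row, g_row, m in zip(weights, grads, gate)
    ]

# Toy data: three neurons with two input weights each.
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
feature = [1.0, 0.0]                      # assumed "privacy" direction
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # dummy gradients

gate = top_k_gate(neuron_scores(weights, feature), k=1)
updated = gated_update(weights, grads, gate)
```

In this toy setup only the neuron most aligned with the assumed feature direction receives an update, while the other rows are left unchanged. That is one simple way to read the abstract's claim that localized edits can preserve the model's original utility.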


Related papers