Related papers: Knowledge Vector Weakening: Efficient Training-free Unlearning for Large Vision-Language Models

Knowledge Vector Weakening: Efficient Training-free Unlearning for Large Vision-Language Models

URL: http://arxiv.org/abs/2601.21794v1
Date: Thu, 29 Jan 2026 14:41:01 GMT
Title: Knowledge Vector Weakening: Efficient Training-free Unlearning for Large Vision-Language Models
Authors: Yejin Kim, Dongjun Hwang, Sungmin Cha, Junsuk Choe,
Abstract summary: Knowledge Vector Weakening (KVW) is a training-free unlearning method that directly intervenes in the full model without gradient computation.<n> Experiments on the MLLMU and CLEAR benchmarks demonstrate that KVW achieves a stable forget-retain trade-off while significantly improving computational efficiency.
Score: 16.045233710264842
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Vision-Language Models (LVLMs) are widely adopted for their strong multimodal capabilities, yet they raise serious concerns such as privacy leakage and harmful content generation. Machine unlearning has emerged as a promising solution for removing the influence of specific data from trained models. However, existing approaches largely rely on gradient-based optimization, incurring substantial computational costs for large-scale LVLMs. To address this limitation, we propose Knowledge Vector Weakening (KVW), a training-free unlearning method that directly intervenes in the full model without gradient computation. KVW identifies knowledge vectors that are activated during the model's output generation on the forget set and progressively weakens their contributions, thereby preventing the model from exploiting undesirable knowledge. Experiments on the MLLMU and CLEAR benchmarks demonstrate that KVW achieves a stable forget-retain trade-off while significantly improving computational efficiency over gradient-based and LoRA-based unlearning methods.

Related papers

CATNIP: LLM Unlearning via Calibrated and Tokenized Negative Preference Alignment [14.853204323785334]
Existing approaches, rooted in Gradient Ascent (GA), often degrade general domain knowledge while relying on retention data or curated contrastive pairs.<n>We develop a principled method that rescales unlearning effects in proportion to the model's token-level confidence.<n>Our work enables effective unlearning without requiring retention data or contrastive unlearning response pairs.
arXiv Detail & Related papers (2026-02-02T21:23:54Z)
Sparsity-Aware Unlearning for Large Language Models [20.699929903336113]
Large Language Models (LLMs) inevitably memorize sensitive information during training, posing significant privacy risks.<n>Machine unlearning has emerged as a promising solution to selectively remove such information without full retraining.<n>We find that unlearning effectiveness degrades substantially on sparse models.<n>We propose Sparsity-Aware Unlearning (SAU), which decouples unlearning from sparsification objectives through gradient masking.
arXiv Detail & Related papers (2026-01-31T07:45:30Z)
Forgetting-MarI: LLM Unlearning via Marginal Information Regularization [6.979586479353831]
Existing unlearning methods often degrade model performance by removing more information than necessary when attempting to ''forget'' specific data.<n>We introduce Forgetting-MarI, an LLM unlearning framework that provably removes only the additional (marginal) information contributed by the data to be unlearned.<n>By penalizing marginal information, our method yields an explicit upper bound on the unlearn dataset's residual influence in the trained models, providing provable undetectability.
arXiv Detail & Related papers (2025-11-14T22:48:39Z)
AUVIC: Adversarial Unlearning of Visual Concepts for Multi-modal Large Language Models [63.05306474002547]
Regulatory frameworks mandating the 'right to be forgotten' drive the need for machine unlearning.<n>We introduce AUVIC, a novel visual concept unlearning framework for MLLMs.<n>We show that AUVIC achieves state-of-the-art target forgetting rates while incurs minimal performance degradation on non-target concepts.
arXiv Detail & Related papers (2025-11-14T13:35:32Z)
MLLMEraser: Achieving Test-Time Unlearning in Multimodal Large Language Models through Activation Steering [36.80441487363007]
MLLMEraser is an input-aware, training-free framework for test-time unlearning.<n>We construct a multimodal erasure direction by contrasting adversarially perturbed, knowledge-recall image-text pairs.<n>Experiments on LLaVA-1.5 and Qwen-2.5-VL demonstrate that MLLMEraser consistently outperforms state-of-the-art MLLM unlearning baselines.
arXiv Detail & Related papers (2025-10-05T14:20:17Z)
Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining.<n>This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning)<n>We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture [0.0]
ICV recasts in-context learning by using latent embeddings of language models.<n>ICV directly integrates information into the model, enabling it to process this information more effectively.
arXiv Detail & Related papers (2025-02-07T04:24:07Z)
Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data [54.934578742209716]
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets.<n>LLKD is an adaptive sample selection method that incorporates signals from both the teacher and student.<n>Our comprehensive experiments show that LLKD achieves superior performance across various datasets with higher data efficiency.
arXiv Detail & Related papers (2024-11-12T18:57:59Z)
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models [52.40798352740857]
We introduce the Iterative Contrastive Unlearning (ICU) framework, which consists of three core components.<n>A Knowledge Unlearning Induction module targets specific knowledge for removal using an unlearning loss.<n>A Contrastive Learning Enhancement module preserves the model's expressive capabilities against the pure unlearning goal.<n>An Iterative Unlearning Refinement module dynamically adjusts the unlearning process through ongoing evaluation and updates.
arXiv Detail & Related papers (2024-07-25T07:09:35Z)
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models.<n>It addresses two key challenges -- the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
Offset Unlearning for Large Language Models [49.851093293780615]
delta-Unlearning is an offset unlearning framework for black-box LLMs.<n>We show that delta-Unlearning can effectively unlearn target data while maintaining similar or even stronger performance on general out-of-forget-scope tasks.
arXiv Detail & Related papers (2024-04-17T03:39:51Z)
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model. Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses. BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.